I am trying to store the following into a pandas dataframe:

{    
"page": 1,
      "results": [
        {
          "poster_path": null,
          "adult": false,
          "overview": "Go behind the scenes during One Directions sell out \"Take Me Home\" tour and experience life on the road.",
          "release_date": "2013-08-30",
          "genre_ids": [
            99,
            10402
          ],
          "id": 164558,
          "original_title": "One Direction: This Is Us",
          "original_language": "en",
          "title": "One Direction: This Is Us",
          "backdrop_path": null,
          "popularity": 1.166982,
          "vote_count": 55,
          "video": false,
          "vote_average": 8.45
        },
        {
          "poster_path": null,
          "adult": false,
          "overview": "",
          "release_date": "1954-06-22",
          "genre_ids": [
            80,
            18
          ],
          "id": 654,
          "original_title": "On the Waterfront",
          "original_language": "en",
          "title": "On the Waterfront",
          "backdrop_path": null,
          "popularity": 1.07031,
          "vote_count": 51,
          "video": false,
          "vote_average": 8.19
         }
           etc....
           etc.....
      ],
      "total_results": 61,
      "total_pages": 4
    }

What is the simplest way to store all attributes of each result in a pandas dataframe?

I am storing the JSON object in a dict variable. Do I really need to iterate through the results block stored in my dict and define each field for each pandas column?

This is what I am trying to avoid:

columns = ['filmid', 'title'....... ]

# create dataframe
df = pandas.DataFrame(columns=columns)

for film in films:
    df.loc[len(df)] = [film['id'], film['title']........................]
  • ctrl+f films, 1 match. NameError. Commented Jan 23, 2020 at 19:59
  • Does this answer your question? "JSON to pandas DataFrame" Commented Jan 23, 2020 at 22:32

1 Answer

You can use pandas' json_normalize function, which builds one row per result and one column per key, so you do not have to list the fields yourself. Here is an example:

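import json
import pandas as pd

# Load the JSON file into a dict
with open('yourfile.json') as file:
    data = json.load(file)

# pandas >= 1.0 exposes json_normalize at the top level;
# the older pandas.io.json import path is deprecated
flat_json_df = pd.json_normalize(data['results'])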

For me, when used on the data above, it results in the following dataframe:

   adult  backdrop_path  genre_ids    id      original_language  original_title             overview                                           popularity  poster_path  release_date  title                      video  vote_average  vote_count
0  False  None           [99, 10402]  164558  en                 One Direction: This Is Us  Go behind the scenes during One Directions sel...  1.166982    None         2013-08-30    One Direction: This Is Us  False  8.45          55
1  False  None           [80, 18]     654     en                 On the Waterfront                                                             1.070310    None         1954-06-22    On the Waterfront          False  8.19          51
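
Since the question says the response is already held in a dict, the file-loading step can be skipped. The following is a minimal sketch under that assumption: the `data` dict is a trimmed copy of the JSON from the question (only a few keys kept for brevity), and it assumes pandas 1.0 or newer, where json_normalize is available as pd.json_normalize.

```python
import pandas as pd

# Trimmed copy of the response from the question, held in a plain dict
# (assumption: only a handful of the original keys are kept here)
data = {
    "page": 1,
    "results": [
        {
            "id": 164558,
            "title": "One Direction: This Is Us",
            "release_date": "2013-08-30",
            "genre_ids": [99, 10402],
            "vote_average": 8.45,
        },
        {
            "id": 654,
            "title": "On the Waterfront",
            "release_date": "1954-06-22",
            "genre_ids": [80, 18],
            "vote_average": 8.19,
        },
    ],
    "total_results": 61,
    "total_pages": 4,
}

# One row per entry in "results"; column names are taken from the keys,
# so no column list has to be written out by hand
df = pd.json_normalize(data["results"])

# For flat records like these, the plain constructor also accepts the list
df_plain = pd.DataFrame(data["results"])

print(df)
```

The list-valued genre_ids field stays as a single column holding Python lists in both cases. json_normalize mainly pays off when the records contain nested dicts, which it flattens into dotted column names; for flat records such as these, pd.DataFrame on the list is likely enough.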