Here is the data I get back from Google BigQuery and need to parse:

{
    u'kind': u'bigquery#queryResponse',
    u'rows': [
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'995'
                },
                {
                    u'v': u'1600'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'942'
                },
                {
                    u'v': u'1607'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'937'
                },
                {
                    u'v': u'1599'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'894'
                },
                {
                    u'v': u'1598'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'848'
                },
                {
                    u'v': u'1592'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'841'
                },
                {
                    u'v': u'1590'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'786'
                },
                {
                    u'v': u'1603'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'779'
                },
                {
                    u'v': u'1609'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'762'
                },
                {
                    u'v': u'1597'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'753'
                },
                {
                    u'v': u'1594'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'740'
                },
                {
                    u'v': u'1596'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'738'
                },
                {
                    u'v': u'1612'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'718'
                },
                {
                    u'v': u'1590'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'717'
                },
                {
                    u'v': u'1610'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'715'
                },
                {
                    u'v': u'1602'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'680'
                },
                {
                    u'v': u'1606'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'674'
                },
                {
                    u'v': u'1603'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'639'
                },
                {
                    u'v': u'1603'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'637'
                },
                {
                    u'v': u'1603'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'634'
                },
                {
                    u'v': u'1590'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'633'
                },
                {
                    u'v': u'1599'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'616'
                },
                {
                    u'v': u'1596'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'614'
                },
                {
                    u'v': u'1596'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'612'
                },
                {
                    u'v': u'1595'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'607'
                },
                {
                    u'v': u'1603'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'579'
                },
                {
                    u'v': u'1593'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'570'
                },
                {
                    u'v': u'1600'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'541'
                },
                {
                    u'v': u'1599'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'525'
                },
                {
                    u'v': u'1608'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'520'
                },
                {
                    u'v': u'1599'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'518'
                },
                {
                    u'v': u'1602'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'486'
                },
                {
                    u'v': u'1595'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'470'
                },
                {
                    u'v': u'1593'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'433'
                },
                {
                    u'v': u'1609'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'429'
                },
                {
                    u'v': u'1607'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'421'
                },
                {
                    u'v': u'1611'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'399'
                },
                {
                    u'v': u'1592'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'363'
                },
                {
                    u'v': u'0'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'353'
                },
                {
                    u'v': u'1594'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'287'
                },
                {
                    u'v': u'1609'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'106'
                },
                {
                    u'v': u'0'
                }
            ]
        },
        {
            u'f': [
                {
                    u'v': u'the'
                },
                {
                    u'v': u'57'
                },
                {
                    u'v': u'1609'
                }
            ]
        }
    ],
    u'jobReference': {
        u'projectId': u'670640819051',
        u'jobId': u'job_5bf745fcee8b470e997d8ea90f380e68'
    },
    u'jobComplete': True,
    u'totalRows': u'42',
    u'schema': {
        u'fields': [
            {
                u'type': u'STRING',
                u'name': u'word',
                u'mode': u'NULLABLE'
            },
            {
                u'type': u'INTEGER',
                u'name': u'word_count',
                u'mode': u'NULLABLE'
            },
            {
                u'type': u'INTEGER',
                u'name': u'corpus_date',
                u'mode': u'NULLABLE'
            }
        ]
    }
}

Being a Python newbie, I really have no idea how to go about parsing this data to create a JSON object like the one below:

[
     {'count': 200, 'year': 2008},
     {'count': 240, 'year': 2010},
     {'count': 290, 'year': 2009}
]

Can anyone give me a hint about how to get started?

Example

[{u'v': u'the'}, {u'v': u'995'}, {u'v': u'1600'}]

In this row, for the word 'the', the count is 995 and the year is 1600, and so on for the other rows.

  • Hi, where do "year" and "count" come from? Commented Nov 23, 2012 at 14:26
  • In your example, how are you distinguishing that {u'v': u'995'} represents count and {u'v': u'1600'} represents year? Commented Nov 23, 2012 at 14:33
  • Probably by the index in the sequence. Commented Nov 23, 2012 at 14:34
  • @BrendanWood The first one is the count and the second one is the year. That is how it has to be. Commented Nov 23, 2012 at 14:38
  • Yeah. You will have to use index 1 to get the count and index 2 to get the year. Commented Nov 23, 2012 at 14:38

3 Answers

If z is your big dictionary, the following code leaves the structure you need in response.

import json

response = []
for row in z['rows']:
    # Each row keeps its cells under the 'f' key, in schema order:
    # word, word_count, corpus_date.
    cells = row['f']
    count = cells[1]
    year = cells[2]
    response.append({'count': count['v'], 'year': year['v']})

print(json.dumps(response))

response will then contain the following:

[{'count': u'995', 'year': u'1600'},
 {'count': u'942', 'year': u'1607'},
 {'count': u'937', 'year': u'1599'},
 {'count': u'894', 'year': u'1598'},
 {'count': u'848', 'year': u'1592'},
 {'count': u'841', 'year': u'1590'},
 {'count': u'786', 'year': u'1603'},
 {'count': u'779', 'year': u'1609'},
 {'count': u'762', 'year': u'1597'},
 {'count': u'753', 'year': u'1594'},
 {'count': u'740', 'year': u'1596'},
 {'count': u'738', 'year': u'1612'},
 {'count': u'718', 'year': u'1590'},
 {'count': u'717', 'year': u'1610'},
 {'count': u'715', 'year': u'1602'},
 {'count': u'680', 'year': u'1606'},
 {'count': u'674', 'year': u'1603'},
 {'count': u'639', 'year': u'1603'},
 {'count': u'637', 'year': u'1603'},
 {'count': u'634', 'year': u'1590'},
 {'count': u'633', 'year': u'1599'},
 {'count': u'616', 'year': u'1596'},
 {'count': u'614', 'year': u'1596'},
 {'count': u'612', 'year': u'1595'},
 {'count': u'607', 'year': u'1603'},
 {'count': u'579', 'year': u'1593'},
 {'count': u'570', 'year': u'1600'},
 {'count': u'541', 'year': u'1599'},
 {'count': u'525', 'year': u'1608'},
 {'count': u'520', 'year': u'1599'},
 {'count': u'518', 'year': u'1602'},
 {'count': u'486', 'year': u'1595'},
 {'count': u'470', 'year': u'1593'},
 {'count': u'433', 'year': u'1609'},
 {'count': u'429', 'year': u'1607'},
 {'count': u'421', 'year': u'1611'},
 {'count': u'399', 'year': u'1592'},
 {'count': u'363', 'year': u'0'},
 {'count': u'353', 'year': u'1594'},
 {'count': u'287', 'year': u'1609'},
 {'count': u'106', 'year': u'0'},
 {'count': u'57', 'year': u'1609'}]

I believe that is what you need. After that, just call json.dumps on response and that's it.
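Note that the values in response are still strings, whereas the target structure in the question uses integers. A minimal sketch of the same loop with an int conversion follows. It assumes a trimmed-down sample dictionary z with only two rows, and the helper name rows_to_counts is hypothetical, chosen for illustration:

```python
import json

# Hypothetical, trimmed-down stand-in for the full BigQuery response `z`.
z = {
    'rows': [
        {'f': [{'v': 'the'}, {'v': '995'}, {'v': '1600'}]},
        {'f': [{'v': 'the'}, {'v': '942'}, {'v': '1607'}]},
    ]
}


def rows_to_counts(response_dict):
    """Return one {'count': int, 'year': int} dict per row."""
    result = []
    for row in response_dict['rows']:
        cells = row['f']
        # Index 1 is word_count and index 2 is corpus_date, per the schema.
        result.append({
            'count': int(cells[1]['v']),
            'year': int(cells[2]['v']),
        })
    return result


parsed = rows_to_counts(z)
print(json.dumps(parsed, sort_keys=True))
# prints [{"count": 995, "year": 1600}, {"count": 942, "year": 1607}]
```

The int call would raise on a missing (None) value, so a NULLABLE column may need an explicit check before converting.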

1 Comment

There is a typo: respone = [] should be response = [].

You can easily convert Python objects into JSON strings and vice versa using the json module. Fundamentally there are two main classes, JSONEncoder and JSONDecoder: the first turns Python collections into JSON strings, the second turns a JSON string into a Python object.

Examples:

from json import JSONEncoder

jsonString = JSONEncoder().encode({
  "count": 222, 
  "year": 2012
})

The code above generates a JSON string from a Python dictionary.

from json import JSONDecoder

pyDictionary = JSONDecoder().decode('{"count": 222, "year": 2012}')

The code above generates a Python dictionary from a JSON string.
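The json module also provides the convenience functions json.dumps and json.loads, which wrap the encoder and decoder classes. A short sketch of the same round trip with them, where the variable names are only illustrative:

```python
import json

record = {"count": 222, "year": 2012}

# Python dictionary -> JSON string (sort_keys makes the key order predictable).
json_string = json.dumps(record, sort_keys=True)
print(json_string)  # prints {"count": 222, "year": 2012}

# JSON string -> Python dictionary.
round_tripped = json.loads(json_string)
print(round_tripped == record)  # prints True
```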

Version 0.28.0 and later of the google-cloud-bigquery library use a Row class to parse rows from a table or query.

For example, to print the results of a query with the schema

[
    {
        u'type': u'STRING',
        u'name': u'word',
        u'mode': u'NULLABLE'
    },
    {
        u'type': u'INTEGER',
        u'name': u'word_count',
        u'mode': u'NULLABLE'
    },
    {
        u'type': u'INTEGER',
        u'name': u'corpus_date',
        u'mode': u'NULLABLE'
    },
]

as in your example, one could do

from google.cloud import bigquery

client = bigquery.Client()
query = client.query('...')
rows = query.result()
for row in rows:
    # Access by column index.
    print('word: {}'.format(row[0]))
    # Access by column name.
    # The library parses the result into an integer object,
    # based on the schema.
    print('word_count: {}'.format(row['word_count']))
    # Access by column name, like an attribute.
    print('corpus_date: {}'.format(row.corpus_date))

In version 0.29.0 (not yet released as of 2017-12-04), Row objects will have keys(), values(), items(), and get() methods, just like a built-in dictionary (added in PR #4393). So, to convert rows to a JSON-like dictionary in 0.29.0:

query = client.query('...')
rows = query.result()
for row in rows:
    row_json = dict(row.items())
