2

I have a big query table with 4 columns : name (string) , age(int) , score(float) , dob (datetime).

#!/usr/bin/env python

import json

from google.cloud import bigquery


def stream_data(json_data):
    bigquery_client = bigquery.Client("myproject")
    dataset = bigquery_client.dataset("ComputedData")
    table = dataset.table("test")
    data = json.loads(json_data)
    table.reload()
    rows = [data]
    errors = table.insert_data(rows)

    if not errors:
        print('Loaded 1 row ')
    else:
        print('Errors: {}'.format(errors))


if __name__ == '__main__':
    mynam = 'mike'
    mage = 212
    mydob='1983-09-01 00:00:00'
    mydob=None
    ds=str(mydob) if mydob else None
    myscore = 0;
    stream_data('["' + str(mynam) + '",' + str(mage) + ',"' + ds + '",'+ str(myscore) +']')

The above is a sample to test whether I can insert null values . For eg. I actually compute the score and date of birth (assume) and insert it. But if the computation does not work I want to insert null into the big query table as datetime supports null.

By default , null is None in python. However I cannot insert this into the as I cannot concat Nonetype as string . If I try to stringify the null statement as :

 ds=str(mydob) if mydob else 'null'

I get 'Invalid datetime string "null"'

I am not sure if I am generating my json the wrong way. enter image description here

I do get nulls but for that I have to leave out the field in the json.

   stream_data('["' + str(mynam) + '",' + str(mage) + ']')

I cannot do this as if I want to enter score but leave dob as null , if I insert

stream_data('["' + str(mynam) + '",' + str(mage) + ',' + str(myscore)+']')

This inserts or tries to insert score in the datetime column. So I have to insert a value for every column in the json. but want to be able to specify nulls while inserting.

1
  • Try to not insert anything rather than Null or non or 'null'; Like this: name,12,,123 (not the two commas). Ideally BQ should default to the default date. If it does not, some alternatives: - push the default date yourself (1970) - put the field as string, and write null instead (depends what you do with the date afterwards). Commented May 10, 2017 at 21:24

1 Answer 1

3

Maybe the problem here is that the insert_data method works with a list of tuples and not json.

I just created a table like yours in BQ and used the same code as yours but instead of converting to json I used the function as it's described in the docs, like so:

error = table.insert_data([('name1', 200, None, '1990-09-09 00:00:00'),
                           ('name2', 201, 67.3,)])

Among the tests I did I also tried:

error = table.insert_data([('name3', 301, 35.4,None)])

And also:

error = table.insert_data([(None, None, None,None)])

It all worked:

enter image description here

Maybe what you can do is (if your code receives for some reason the json) to first transform this json into a list of tuples and then send this list to the insert_data method with NULL values as None.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.