I have an issue with the Cosmos DB SQL API Python SDK and I have no idea how to fix it.

I have some data in a data explorer, and I am using the Python SDK to query it and save the output to a JSON file. So far everything works fine. As a next step, rather than saving the query result to a JSON file, I would like to pass it directly to Cosmos DB to be stored.

And here is the main problem.

I followed the azure-cosmos guide, and I am able to connect to my Cosmos DB account from Python.

Then I used this block of code:

######################################################
##                   COSMOS-DB                      ##
######################################################

url = "<my-url>"
key = "my-key"
client = CosmosClient(url, key)
database_name = "My-Database"
container_name = "Table"
database = client.get_database_client(database_name)
container = database.get_container_client(container_name)
data = json.dumps(str(df))
data_dict = json.loads(data)
print(data_dict)
container.create_item(body=str(data_dict))

df is a data frame, which was giving me problems, so I tried to parse it into a dictionary.

But when I call container.create_item(body=data_dict)

I get this error:

Traceback (most recent call last):
  File "query.py", line 72, in <module>
    container.create_item(body=data_dict)
  File "/Users/user/opt/anaconda3/lib/python3.8/site-packages/azure/core/tracing/decorator.py", line 83, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/Users/user/opt/anaconda3/lib/python3.8/site-packages/azure/cosmos/container.py", line 511, in create_item
    result = self.client_connection.CreateItem(
  File "/Users/user/opt/anaconda3/lib/python3.8/site-packages/azure/cosmos/_cosmos_client_connection.py", line 1084, in CreateItem
    options = self._AddPartitionKey(database_or_container_link, document, options)
  File "/Users/user/opt/anaconda3/lib/python3.8/site-packages/azure/cosmos/_cosmos_client_connection.py", line 2512, in _AddPartitionKey
    partitionKeyValue = self._ExtractPartitionKey(partitionKeyDefinition, document)
  File "/Users/user/opt/anaconda3/lib/python3.8/site-packages/azure/cosmos/_cosmos_client_connection.py", line 2526, in _ExtractPartitionKey
    return self._retrieve_partition_key(partition_key_parts, document, is_system_key)
  File "/Users/user/opt/anaconda3/lib/python3.8/site-packages/azure/cosmos/_cosmos_client_connection.py", line 2539, in _retrieve_partition_key
    partitionKey = partitionKey.get(part)
AttributeError: 'str' object has no attribute 'get'

I am totally lost at this point and I don't understand how to solve this issue.

UPDATE: this is the data I am trying to pass to cosmos:

[
  {
    "_timestamp": 1622036400000,
    "name": "User Log Off",
    "message": "message",
    "userID": "userID",
    "Events": "SignOff event",
    "event_count": 1
  },
  {
    "_timestamp": 1622035800000,
    "name": "User Log Off",
    "message": "message",
    "userID": "userID",
    "Events": "SignOff event",
    "event_count": 1
  }
]

Those are just 2 samples from the whole array, which has around 300 items.

I fixed the previous error.

Now I have proper JSON being dumped, which looks like the sample posted above. I ran container.create_item(item) but got this error:

azure.cosmos.exceptions.CosmosHttpResponseError: (BadRequest) Message: {"Errors":["The input content is invalid because the required properties - 'id; ' - are missing"]}

I was confident that Cosmos DB would add the id automatically.

  • Can you edit your question and show what your input data looks like? Commented Jun 3, 2021 at 14:58
  • Is this what is getting passed to the create_item method? In other words, is your data_dict an array? Commented Jun 3, 2021 at 15:37
  • Also, what's the partition key for the container that you created? Commented Jun 3, 2021 at 15:38
  • I am so sorry mate. I just realised that my IDE wasn't running properly. I restarted it and dumped the JSON into a file, and what I found is that I am passing ONE string with all the objects inside to create_item. I am even more confused now. My partition key is /_timestamp. I am so sorry to bother you with this issue, but I am completely new to this. Commented Jun 3, 2021 at 15:46
  • What you have to do is loop through each item in the array (data_dict) and save each item separately. Commented Jun 3, 2021 at 15:49

2 Answers

2

Since your data_dict is an array of items, you need to loop through the array and save each item separately. Each item also needs its own id.

Please try this code:

import json
import uuid

from azure.cosmos import CosmosClient

url = "<my-url>"
key = "my-key"
client = CosmosClient(url, key)
database_name = "My-Database"
container_name = "Table"
database = client.get_database_client(database_name)
container = database.get_container_client(container_name)
# Assumes df is a pandas DataFrame: convert it to a list of dicts, one per row.
# (json.dumps(str(df)) followed by json.loads would only give back one string.)
data_dict = json.loads(df.to_json(orient="records"))
print(data_dict)
# Loop through each item in the "data_dict" array.
for item in data_dict:
    # Every item needs a string "id" before it can be created.
    item['id'] = str(uuid.uuid4())
    print(item)
    container.create_item(body=item)
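
For context on the original AttributeError: json.dumps(str(...)) followed by json.loads(...) hands back one Python string, not a list of dictionaries, so the SDK ends up calling .get on a str when it looks for the partition key. Here is a minimal sketch of that round trip, using a plain list as a stand-in for the data frame (the names records, round_tripped and proper are illustrative only):

```python
import json

# Stand-in for the query result; the real data comes from the data frame.
records = [
    {"_timestamp": 1622036400000, "name": "User Log Off", "event_count": 1},
    {"_timestamp": 1622035800000, "name": "User Log Off", "event_count": 1},
]

# What the question's code does: stringify first, then JSON-encode and decode.
round_tripped = json.loads(json.dumps(str(records)))
print(type(round_tripped).__name__)  # a single string, which has no .get()

# Encoding the structure itself keeps the list of dictionaries intact.
proper = json.loads(json.dumps(records))
print(type(proper).__name__, type(proper[0]).__name__)
```

Only the second form gives create_item a dictionary per item to work with.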

2 Comments

You are an absolute legend mate. Thank you so much. It does work. you are amazing. Thank you thank you thank you
I am glad to hear that it worked out well for you :).
0

When creating a new item using container.create_item(body=data), the data dictionary must include the id key. To avoid conflicts and automatically generate a random ID, set the enable_automatic_id_generation parameter to True:

container.create_item(body=data, enable_automatic_id_generation=True)

With this flag set, the SDK generates an id for any item that does not already have one.
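
If you would rather set the id yourself and keep any id an item already carries, a small helper can do both. This is only a sketch: create_all is a hypothetical name, and container is assumed to be any object exposing create_item(body=...), such as the container client from the question.

```python
import uuid
from typing import Any, Dict, Iterable, List


def create_all(container: Any, items: Iterable[Dict[str, Any]]) -> List[Any]:
    """Insert each item separately, adding a random id where one is missing.

    Hypothetical helper: `container` is assumed to expose
    create_item(body=...), like the azure-cosmos container client.
    """
    results = []
    for item in items:
        doc = dict(item)  # copy so the caller's dictionaries are not modified
        doc.setdefault("id", str(uuid.uuid4()))  # ids must be strings
        results.append(container.create_item(body=doc))
    return results
```

Calling create_all(container, data_dict) would then take the place of the explicit loop.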
