
I've been searching the web for a way to append data to an existing JSON file in Azure Storage. I also checked this post, but it didn't help. I have millions of JSON records arriving in real time, available as a Python list, and I want to append those records to an existing JSON file in an Azure blob. My main data source is a KafkaConsumer: I'm consuming data from a Kafka topic and want to store it in Azure Storage in JSON format. Since I'm using Python and don't want to read from or write to my local hard disk, I'd like to append a list of JSON records directly to a JSON file that already exists in an Azure container. Can anyone help me out or point me to some references? Thanks.

  • Are you using append blobs? Commented Dec 10, 2021 at 11:26
  • At first I only uploaded the JSON file using the upload_blob function, and then I tried the append_block function on that file, but it gives an authentication error like this: "ErrorCode:AuthenticationFailed Error:None AuthenticationErrorDetail:The MAC signature found in the HTTP request 'hW87ugUtVXulSjA4ZpI6jc6vLU+tjj4KKM7/uWE2J6w=' is not the same as any computed signature. Server used following string to sign: 'PUT 1043 application/octet-stream" Commented Dec 11, 2021 at 17:47
  • How are you authenticating to Azure Storage? Commented Dec 12, 2021 at 3:43
  • I authenticate when I create the connection, and that step didn't raise any issue. I only create a BlobServiceClient connection using the account_url and account key; after that I don't do any further authentication, but appending to the blob gives an authentication error. Commented Dec 13, 2021 at 5:32
  • Can you please edit the question to include the code you tried? Commented Dec 13, 2021 at 12:55

1 Answer


I tried this on my system and was able to append data to an existing file. I used dummy JSON data for testing; you can pass in your own JSON data.

# AppendBlobService belongs to the legacy SDK (azure-storage-blob 2.x and earlier)
from azure.storage.blob import AppendBlobService
from azure.common import AzureMissingResourceHttpError
import json

def append_data_to_blob(data):
  service = AppendBlobService(account_name="appendblobex", 
            account_key="key")
  # Serialize the record; the trailing newline keeps each appended record on its own line
  text = json.dumps(data) + "\n"
  try:
    service.append_blob_from_text(container_name="test", blob_name="test1", text=text)
  except AzureMissingResourceHttpError:
    # The blob does not exist yet: create it as an append blob, then append
    service.create_blob(container_name="test", blob_name="test1")
    service.append_blob_from_text(container_name="test", blob_name="test1", text=text)
  print('Data appended to blob successfully.')


# Dummy record for testing; pass your own JSON-serializable data instead
append_data_to_blob({"hi": "hello"})

OUTPUT


The data is appended to the file in Azure Storage. After downloading the file, open it to view the appended data.
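AppendBlobService comes from the legacy SDK; the current azure-storage-blob v12 package does not expose it. For readers on v12, the same idea can be sketched roughly as below. This is a minimal sketch under stated assumptions, not a definitive implementation: the helper names `to_ndjson_chunks` and `append_records` are hypothetical, records are written one JSON object per line, and 4 MiB is assumed as a conservative per-call size for `append_block` (check the limits that apply to your account). One thing worth checking: `upload_blob` creates a block blob by default, and `append_block` only works on a blob created as an append blob, for example with `create_append_blob`.

```python
import json

# Assumption: 4 MiB is used as a conservative per-call size for append_block.
MAX_BLOCK_BYTES = 4 * 1024 * 1024


def to_ndjson_chunks(records, max_bytes=MAX_BLOCK_BYTES):
    """Yield bytes chunks of newline-delimited JSON, each at most max_bytes."""
    buf = bytearray()
    for record in records:
        line = (json.dumps(record) + "\n").encode("utf-8")
        if len(line) > max_bytes:
            raise ValueError("a single record exceeds the block size limit")
        # Flush the buffer before it would grow past the limit
        if buf and len(buf) + len(line) > max_bytes:
            yield bytes(buf)
            buf = bytearray()
        buf.extend(line)
    if buf:
        yield bytes(buf)


def append_records(blob_client, records):
    """Append records to an append blob, creating the blob if it is missing.

    blob_client is expected to behave like azure.storage.blob.BlobClient (v12).
    """
    if not blob_client.exists():
        blob_client.create_append_blob()
    for chunk in to_ndjson_chunks(records):
        blob_client.append_block(chunk)


# Usage (requires `pip install azure-storage-blob` and real credentials):
# from azure.storage.blob import BlobClient
# client = BlobClient.from_connection_string(
#     "<connection-string>", container_name="test", blob_name="test1")
# append_records(client, [{"hi": "hello"}, {"hi": "again"}])
```

Nothing is written to the local disk here: each batch from the Python list is serialized in memory and sent directly. If the blob already exists as a block blob, `exists()` returns true and the append call will still fail, so the blob has to be created as an append blob from the start.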



5 Comments

I'm not able to import AppendBlobService; it shows "Module not found".
Try pip install azure-storage --upgrade and then try the import statement again.
If the answer was helpful, you can accept it as the answer (click the check mark beside the answer to toggle it from greyed out to filled in). This can be beneficial to other community members. Thank you.
Okay, thanks @ShrutiJoshi-MT for your cooperation; it was definitely very helpful for me.
@ShrutiJoshi-MT Can we access this append blob from Databricks? As per the MS documentation, we cannot read append blobs in Databricks.
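This does not settle the Databricks question, which depends on the connector in use. As a point of comparison, an append blob can be read back through the Python SDK like any other blob. The sketch below assumes the azure-storage-blob v12 package and a blob that holds one JSON object per line; `parse_ndjson` and `read_records` are hypothetical helper names.

```python
import json


def parse_ndjson(data):
    """Parse newline-delimited JSON bytes into a list of Python objects."""
    text = data.decode("utf-8")
    # Skip blank lines, such as the empty string after the final newline
    return [json.loads(line) for line in text.split("\n") if line.strip()]


def read_records(blob_client):
    """Download the whole blob into memory and parse it as NDJSON.

    blob_client is expected to behave like azure.storage.blob.BlobClient (v12).
    """
    return parse_ndjson(blob_client.download_blob().readall())


# Usage (requires `pip install azure-storage-blob` and real credentials):
# from azure.storage.blob import BlobClient
# client = BlobClient.from_connection_string(
#     "<connection-string>", container_name="test", blob_name="test1")
# records = read_records(client)
```

Because `readall()` loads the full blob into memory, this only suits blobs of moderate size; very large files would need to be read in ranges instead.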
