I have a private blob in Azure Blob Storage. It was written by Terraform through an account that has read and write access to it. I am trying to fetch it from Python (without the Azure SDK), but so far I have been unable to.

My request is as follows:

import datetime
import requests


key = ...
secret = ...
now = datetime.datetime.utcnow().strftime('%a, %d %b %Y %H:%M:%S GMT')
# the required settings, as per https://learn.microsoft.com/en-us/rest/api/storageservices/get-blob
headers = {'Authorization': 'SharedKey {}:{}'.format(key, secret),
           'Date': now,
           'x-ms-version': '2018-03-28'
           }

storage_account = ...
container = ...
url = 'https://{}.blob.core.windows.net/{}/terraform.tfstate'.format(storage_account, container)

response = requests.get(url, headers=headers)

print(response.status_code)
print(response.text)

This yields

400
<?xml version="1.0" encoding="utf-8"?><Error>
<Code>OutOfRangeInput</Code><Message>One of the request inputs is out of range. 
RequestId:...
Time:...</Message></Error>

I have validated that this file exists (Storage explorer) and that, when I access it via the console, I get the same URL as the one above, but with extra GET parameters.


For those wondering why I am not using the Azure SDK for Python: I only need to fetch a single blob, and pip install azure[blob] would add 88 dependencies to the project (in my opinion, an unacceptably high number for such a simple task).

1 Answer

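So, the reason is that the Authorization header must carry a signature rather than the raw account key. That signature is an HMAC-SHA256, keyed with the account key, over a canonical string built from the request itself; the construction is described in detail in the "Authorize with Shared Key" page of the Azure Storage REST API documentation.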

The Python 3-equivalent of the whole thing is:

import base64
import hmac
import hashlib
import datetime

import requests


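def _sign_string(key, string_to_sign):
    # The account key is stored base64-encoded; decode it to get the raw HMAC key.
    key = base64.b64decode(key.encode('utf-8'))
    string_to_sign = string_to_sign.encode('utf-8')
    signed_hmac_sha256 = hmac.new(key, string_to_sign, hashlib.sha256)
    digest = signed_hmac_sha256.digest()
    # The Authorization header expects the digest base64-encoded again.
    encoded_digest = base64.b64encode(digest).decode('utf-8')
    return encoded_digest


def get_blob(storage_account, token, file_path):
    # token is the storage account access key (base64 string).
    # Note: %a and %b are locale-dependent; this assumes an English/C locale.
    now = datetime.datetime.now(datetime.timezone.utc).strftime('%a, %d %b %Y %H:%M:%S GMT')
    url = 'https://{account}.blob.core.windows.net/{path}'.format(account=storage_account, path=file_path)
    version = '2018-03-28'
    headers = {'x-ms-version': version,
               'x-ms-date': now}

    # String to sign: the verb, then the standard header slots (all empty for
    # this GET, hence the 12 newlines), then the x-ms-* headers in sorted order,
    # then the canonicalized resource /{account}/{path}.
    content = 'GET{spaces}x-ms-date:{now}\nx-ms-version:{version}\n/{account}/{path}'.format(
        spaces='\n'*12,
        now=now,
        version=version,
        account=storage_account,
        path=file_path
    )

    headers['Authorization'] = 'SharedKey ' + storage_account + ':' + _sign_string(token, content)

    response = requests.get(url, headers=headers)

    # Raise on any 4xx/5xx response instead of using assert, which is
    # stripped when Python runs with -O.
    response.raise_for_status()
    return response.text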

where file_path is of the form {container}/{path-in-container}.

Using this snippet is still preferable to adding 88 dependencies to the project.
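For anyone puzzled by the run of twelve newlines, the sketch below spells out the slots they stand for. It follows my reading of the "Authorize with Shared Key" layout for service versions 2015-02-21 and later, and it assumes a request with no query string. The names build_string_to_sign, sign and STANDARD_HEADERS are made up for this illustration and are not part of any Azure library. It also uses email.utils.formatdate from the standard library as a locale-independent way to produce the x-ms-date value.

```python
import base64
import hashlib
import hmac
from email.utils import formatdate

# Standard header slots, in the order they appear in the string-to-sign.
# An absent header contributes an empty line.
STANDARD_HEADERS = (
    'Content-Encoding', 'Content-Language', 'Content-Length', 'Content-MD5',
    'Content-Type', 'Date', 'If-Modified-Since', 'If-Match', 'If-None-Match',
    'If-Unmodified-Since', 'Range',
)


def build_string_to_sign(verb, account, path, ms_headers, standard=None):
    """Hypothetical helper: assemble the string-to-sign for a simple request.

    Assumes the URL has no query parameters, so the canonicalized resource
    is just /{account}/{path}.
    """
    standard = standard or {}
    lines = [verb]
    lines += [standard.get(name, '') for name in STANDARD_HEADERS]
    # x-ms-* headers: lowercase the names, sort them, emit one name:value per line.
    canonical = sorted((k.lower(), v.strip()) for k, v in ms_headers.items())
    lines += ['{}:{}'.format(k, v) for k, v in canonical]
    lines.append('/{}/{}'.format(account, path))
    return '\n'.join(lines)


def sign(account_key_b64, string_to_sign):
    """HMAC-SHA256 over the string-to-sign, keyed with the decoded account key."""
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode('utf-8'), hashlib.sha256).digest()
    return base64.b64encode(digest).decode('ascii')


if __name__ == '__main__':
    ms_headers = {
        'x-ms-version': '2018-03-28',
        'x-ms-date': formatdate(usegmt=True),  # RFC 1123 date in GMT
    }
    # Placeholder account, container and blob names, for illustration only.
    print(repr(build_string_to_sign('GET', 'myaccount', 'mycontainer/terraform.tfstate', ms_headers)))
```

With only x-ms-date and x-ms-version set, this produces the same string as the hard-coded format above: the verb plus eleven empty slots joined by newlines gives the twelve newlines before the first x-ms-* line.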

3 Comments

Where did you find the documentation for the format of the GET{spaces}x-ms-date:{now}\nx-ms-version:{version}\n/{account}/{path} string?
@Patrick, in the (Python) source code of the Azure CLI. I just reverse-engineered it.
I was lucky to find this post. I tried the same in PowerShell and had some trouble getting the format right. There is nothing written about it (or I was not able to find it) in the REST API documentation...
