I have a working bit of PHP code that uploads a binary to a remote server I don't have shell access to. The PHP code is:

function upload($uri, $filename) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $uri);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, array('file' => '@' . $filename));
curl_exec($ch);
curl_close($ch);
}

This results in a header like:

HTTP/1.1
Host: XXXXXXXXX
Accept: */*
Content-Length: 208045596
Expect: 100-continue
Content-Type: multipart/form-data; boundary=----------------------------360aaccde050

I'm trying to port this to Python using requests, but I cannot get the server to accept my POST. I have tried many variations of requests.post, but the headers never match the ones above.

The following successfully transfers the binary to the server (I can tell by watching Wireshark), but the upload gets rejected, apparently because the headers are not what the server expects. The response code is 200, though.

files = {'bulk_test2.mov': ('bulk_test2.mov', open('bulk_test2.mov', 'rb'))}
response = requests.post(url, files=files)

The requests code results in a header of:

HTTP/1.1
Host: XXXX
Content-Length: 160
Content-Type: multipart/form-data; boundary=250852d250b24399977f365f35c4e060
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/2.2.1 CPython/2.7.5 Darwin/13.1.0

--250852d250b24399977f365f35c4e060
Content-Disposition: form-data; name="bulk_test2.mov"; filename="bulk_test2.mov"


--250852d250b24399977f365f35c4e060--

Any thoughts on how to make requests match the header that the PHP code generates?

  • Rejected and the response code is 200? Presumably an error message page is returned? Commented Apr 11, 2014 at 17:05
  • I notice that your Content-Length is only 160 bytes. That's exactly the size of the multipart boundaries and metadata plus newlines. Your file appears to be empty. Commented Apr 11, 2014 at 17:06
  • So if I use `res = requests.post(url, data=open_file, headers={'Content-Type':'multipart/form-data; boundary=----------------------------360aaccde050'})` I get Content-Length: 208045390, which is accurate. But the header is again different from what the server is expecting. Commented Apr 11, 2014 at 17:16

1 Answer

There are two large differences:

  1. The PHP code posts a field named file, your Python code posts a field named bulk_test2.mov.

  2. Your Python code posts an empty file. The Content-Length header is 160 bytes, exactly the amount of space the multipart boundaries and Content-Disposition part header take up. Either the bulk_test2.mov file is indeed empty, or you tried to post the file multiple times without rewinding or reopening the file object.
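The 160-byte figure can be checked by rebuilding the multipart framing by hand. This is a minimal sketch using only the standard library; it assumes the body layout shown in the question's capture (CRLF line endings, no part-level Content-Type header), and `multipart_body` is a hypothetical helper written for this illustration, not part of requests.

```python
def multipart_body(boundary, field, filename, payload):
    """Frame a single file part the way the capture in the question shows it."""
    head = (
        '--{b}\r\n'
        'Content-Disposition: form-data; name="{n}"; filename="{f}"\r\n'
        '\r\n'
    ).format(b=boundary, n=field, f=filename).encode('ascii')
    tail = '\r\n--{b}--\r\n'.format(b=boundary).encode('ascii')
    return head + payload + tail


# Boundary, field name and filename copied from the requests capture.
boundary = '250852d250b24399977f365f35c4e060'
empty = multipart_body(boundary, 'bulk_test2.mov', 'bulk_test2.mov', b'')

# With a zero-byte payload, the framing alone accounts for the whole body.
print(len(empty))
```

With an empty payload the body is 160 bytes long, matching the Content-Length in the capture, so no file data was sent.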

To fix the first problem, use 'file' as the key in your files dictionary:

files = {'file': open('bulk_test2.mov', 'rb')}
response = requests.post(url, files=files)

I used just the open file object as the value; requests will get the filename directly from the file object in that case.

The second issue is something only you can fix. Make sure you don't reuse files when repeatedly posting. Reopen, or use files['file'].seek(0) to rewind the read position back to the start.
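The effect of reusing a file object can be seen without any network traffic. The sketch below uses an in-memory `io.BytesIO` as a stand-in for the opened bulk_test2.mov (an assumption made so the example is self-contained); a real file object opened with 'rb' behaves the same way.

```python
import io

# Stand-in for open('bulk_test2.mov', 'rb').
fileobj = io.BytesIO(b'example payload')

first = fileobj.read()    # consumes everything, as an upload would
second = fileobj.read()   # read position is at the end, so nothing is left

fileobj.seek(0)           # rewind to the start
third = fileobj.read()    # the full content is available again

print(len(first), len(second), len(third))
```

The second read returns an empty bytes object, which is what a second upload attempt would send if the file were not rewound or reopened.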

The Expect: 100-continue header is an optional client feature that asks the server to confirm that the body upload can go ahead. It is not a required header, and any failure to post your file object is not going to be due to requests using this feature or not. An HTTP server that misbehaves when you don't use this feature is in violation of the HTTP RFCs, and you'll have bigger problems on your hands. It certainly won't be something requests can fix for you.

If you do manage to post actual file data, any small variations in Content-Length are due to the (random) boundary being a different length between Python and PHP. This is normal, and not the cause of upload problems, unless your target server is extremely broken. Again, don't try to fix such brokenness with Python.
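As a rough check of that claim, the difference can be accounted for from the figures on this page: the PHP request reports Content-Length 208045596, and a requests upload of the same file under the field name file reports 208045540 (see the comments below). The sketch assumes the PHP boundary is 28 dashes followed by 12 hex digits, and it assumes that curl adds a part-level `Content-Type: application/octet-stream` header to file parts; that header is not visible in the captures here, so treat it as a plausible explanation rather than a confirmed one.

```python
# Boundaries as they appear in the two captures.
php_boundary = '-' * 28 + '360aaccde050'
py_boundary = 'bd19b64db83e4ebbaadc4835f9727856'

# In a single-part body the boundary appears twice:
# once in the opening delimiter and once in the closing one.
boundary_delta = 2 * (len(php_boundary) - len(py_boundary))

# Assumption: curl sends this extra part header, requests does not.
part_header_delta = len('Content-Type: application/octet-stream\r\n')

observed_delta = 208045596 - 208045540

print(boundary_delta, part_header_delta, observed_delta)
```

Under those assumptions the two contributions (16 and 40 bytes) add up to the observed 56-byte difference, so the file payload itself would be identical in both uploads.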

However, I'd assume you overlooked something much simpler. Perhaps the server blacklists certain User-Agent headers, for example. You could clear some of the default headers requests sets by using a Session object:

files = {'file': open('bulk_test2.mov', 'rb')}
session = requests.Session()
del session.headers['User-Agent']
del session.headers['Accept-Encoding']
response = session.post(url, files=files)

and see if that makes a difference.
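To see which headers and body requests would build without contacting the server, the request can be prepared but not sent. This is a sketch under stated assumptions: the URL is a placeholder, and an in-memory `io.BytesIO` stands in for the real file so the example runs anywhere requests is installed.

```python
import io

import requests

session = requests.Session()
del session.headers['User-Agent']
del session.headers['Accept-Encoding']

# Placeholder URL and in-memory payload; substitute the real ones.
files = {'file': ('bulk_test2.mov', io.BytesIO(b'example payload'))}
request = requests.Request('POST', 'http://example.invalid/upload', files=files)

# prepare_request() merges the session headers and encodes the multipart body,
# but nothing is transmitted.
prepared = session.prepare_request(request)

for name, value in prepared.headers.items():
    print('{}: {}'.format(name, value))
print(prepared.body[:120])
```

Lower layers may still add headers of their own when the request is actually sent, so a Wireshark capture remains the final check.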

If the server fails to handle your request because it fails to handle HTTP persistent connections, you could try to use the session as a context manager to ensure that all session connections are closed:

files = {'file': open('bulk_test2.mov', 'rb')}
with requests.Session() as session:
    response = session.post(url, files=files, stream=True)

and you could add:

response.raw.close()

for good measure.

8 Comments

So the above creates a header like:
Content-Length: 208045540
Content-Type: multipart/form-data; boundary=bd19b64db83e4ebbaadc4835f9727856
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/2.2.1 CPython/2.7.5 Darwin/13.1.0

--bd19b64db83e4ebbaadc4835f9727856
Content-Disposition: form-data; name="file"; filename="bulk_test2.mov"

Which is different from what I need (shown below):
Content-Length: 208045596
Expect: 100-continue
Content-Type: multipart/fo...
Don't fixate too much on precise headers! Expect: 100-continue is not a required header, and the only thing it does is pause uploading the body until the server says it is fine to upload the body. If your HTTP upload fails because that header is missing, your server has bigger problems than just this post.
And the 16 bytes content length difference is entirely explained by the fact that the (correctly generated) multipart boundaries differ by 8 characters between PHP and Python. That is never going to be the cause of the problem.
The only reason I'm so focused on the headers is the destination seems to rely on them for logic in their code after the POST completed. If my header is not identical to the PHP generated one, the server will not process the file.
@user3524641: set keep_alive to False on the session then.