I have been writing a simple web server in Python, and right now I'm trying to handle multipart/form-data POST requests. I can already handle application/x-www-form-urlencoded POST requests, but the same code doesn't work for multipart requests. If it looks like I am misunderstanding anything, please call me out, even if it's something minor. Also, if you have any advice on making my code better, please let me know as well :) Thanks!

When the request comes in, I first parse it, and split it into a dictionary of headers and a string for the body of the request. I use those to then construct a FieldStorage form, which I can then treat like a dictionary to pull the data out:

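import cgi
import urlparse
from StringIO import StringIO

# Read one byte at a time until the blank line that ends the headers
requestInfo = ''
while requestInfo[-4:] != '\r\n\r\n':
    byte = conn.recv(1)
    if not byte:  # Connection closed before the headers were complete
        break
    requestInfo += byte

# The request line looks like: POST /path HTTP/1.1
requestSplit = requestInfo.split('\r\n')[0].split(' ')
requestType = requestSplit[0]

url = urlparse.urlparse(requestSplit[1])
path = url.path # Grab Path

if requestType == "POST":
    headers, body = parse_post(conn, requestInfo)

    print "!!!Request!!! " + requestInfo
    print "!!!Body!!! " + body
    form = cgi.FieldStorage(headers=headers, fp=StringIO(body), environ={'REQUEST_METHOD': 'POST'}, keep_blank_values=1)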

Here's my parse_post method:

def parse_post(conn, headers_string):
    headers = {}
    headers_list = headers_string.split('\r\n')

    for i in range(1,len(headers_list)-2):
        header = headers_list[i].split(': ', 1)
        headers[header[0]] = header[1]

    content_length = int(headers['Content-Length'])

    content = conn.recv(content_length)

    # Parse Content differently if it's a multipart request??

    return headers, content
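
A minimal Python 3 sketch of two details that parse_post glosses over. First, `recv(n)` may return fewer than `n` bytes, so the body has to be read in a loop until Content-Length bytes have arrived. Second, HTTP header names are case-insensitive, so storing them lower-cased makes lookups predictable. As far as I can tell, `cgi.FieldStorage` looks up lower-case keys such as `content-type` in the `headers` mapping, which may be why a dict keyed by `Content-Type` falls back to urlencoded parsing. The helper names `parse_headers` and `recv_exact` are made up for this example.

```python
import socket


def parse_headers(header_block):
    """Parse a request line plus header lines into a dict with lower-case keys."""
    lines = header_block.split("\r\n")
    headers = {}
    for line in lines[1:]:          # lines[0] is the request line
        if not line:                # a blank line marks the end of the headers
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def recv_exact(conn, length):
    """Keep calling recv() until exactly `length` bytes have arrived."""
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = conn.recv(remaining)
        if not chunk:               # peer closed the connection early
            raise ConnectionError("socket closed with %d bytes missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

With these, the body would be read as `recv_exact(conn, int(headers["content-length"]))`.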

So for an x-www-form-urlencoded POST request, I can treat FieldStorage form like a dictionary, and if I call, for example:

firstname = form['firstname'].value
print firstname

It will work. However, if I instead send a multipart POST request, it ends up printing nothing.

This is the body of the x-www-form-urlencoded request: firstname=TEST&lastname=rwar
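
For comparison, a minimal Python 3 sketch of what the dictionary-style lookup amounts to for this urlencoded body, using `urllib.parse.parse_qs` from the standard library instead of `cgi.FieldStorage` (the `cgi` module was removed in Python 3.13). The helper name `parse_urlencoded` is made up for illustration.

```python
from urllib.parse import parse_qs


def parse_urlencoded(body):
    """Return {field name: first value} for a urlencoded request body."""
    # keep_blank_values mirrors the keep_blank_values=1 passed to FieldStorage.
    parsed = parse_qs(body, keep_blank_values=True)
    return {name: values[0] for name, values in parsed.items()}


form = parse_urlencoded("firstname=TEST&lastname=rwar")
print(form["firstname"])
```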

This is the body of the multipart request:

--070f6a3146974d399d97c85dcf93ed44
Content-Disposition: form-data; name="lastname"; filename="lastname"

rwar
--070f6a3146974d399d97c85dcf93ed44
Content-Disposition: form-data; name="firstname"; filename="firstname"

TEST
--070f6a3146974d399d97c85dcf93ed44--
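
One way to avoid parsing this body by hand is the standard library's `email` package, since a multipart/form-data body is MIME-formatted. The sketch below is Python 3 and assumes the request's Content-Type header was `multipart/form-data; boundary=070f6a3146974d399d97c85dcf93ed44` (the header itself is not shown here; the boundary is inferred from the delimiter lines). `parse_multipart` is a hypothetical helper name.

```python
import email


def parse_multipart(content_type, body):
    """Return {field name: value as bytes} for a multipart/form-data body."""
    # The email parser needs the Content-Type header (it carries the boundary),
    # so prepend it to the raw body as a one-line MIME header block.
    prefix = ("Content-Type: %s\r\n\r\n" % content_type).encode("ascii")
    message = email.message_from_bytes(prefix + body)

    fields = {}
    if message.is_multipart():
        for part in message.get_payload():
            # The field name lives in the part's Content-Disposition header.
            name = part.get_param("name", header="content-disposition")
            fields[name] = part.get_payload(decode=True)
    return fields


# Assumed header value; only the body appears in the question.
content_type = "multipart/form-data; boundary=070f6a3146974d399d97c85dcf93ed44"
body = (
    b"--070f6a3146974d399d97c85dcf93ed44\r\n"
    b'Content-Disposition: form-data; name="lastname"; filename="lastname"\r\n'
    b"\r\n"
    b"rwar\r\n"
    b"--070f6a3146974d399d97c85dcf93ed44\r\n"
    b'Content-Disposition: form-data; name="firstname"; filename="firstname"\r\n'
    b"\r\n"
    b"TEST\r\n"
    b"--070f6a3146974d399d97c85dcf93ed44--\r\n"
)
fields = parse_multipart(content_type, body)
print(fields["firstname"])
```

Values come back as bytes, and the `filename` parameter is simply ignored in this sketch.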

So here's the question: should I manually parse the body for the data in parse_post if it's a multipart request?

Or is there a method that I need/can use to parse the multipart body?

Or am I doing this wrong completely?

Thanks again. I know it's a long read, but I wanted to make sure my question was comprehensive.

  • Are you implementing a web server for any practical reason? There are many (many) good web server implementations already. Try SimpleHTTPServer in the standard library for simple needs, or nginx+uwsgi for serious needs. Commented Feb 24, 2014 at 22:47
  • No, I'm just doing it to learn what goes on at the lower levels of these libraries. I'd never heard of SimpleHTTPServer, though, so I'll keep that in mind. Thanks! Commented Feb 24, 2014 at 22:56

2 Answers

So I solved my problem, but in a totally hacky way.

I ended up manually parsing the body of the request. Here's the code I wrote:

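if "multipart/form-data" in headers["Content-Type"]:
    # Splitting on blank lines interleaves one part's value with the next
    # part's headers: [headers0, value0 + headers1, ..., last value + closing boundary]
    content_list = content.split("\r\n\r\n")
    data_list = [""] * (len(content_list) - 1)

    # The first chunk holds only the first part's headers, so it supplies the first key
    data_list[0] += content_list[0].split("name=")[1].split(";")[0].replace('"','') + "="

    for i,c in enumerate(content_list[1:-1]):
        # Each middle chunk starts with the value of part i and ends with the headers of part i+1
        key = c.split("name=")[1].split(";")[0].replace('"','')
        data_list[i+1] += key + "="
        value = c.split("\r\n")
        data_list[i] += value[0]

    # The last chunk starts with the final value, followed by the closing boundary
    data_list[-1] += content_list[-1].split("\r\n")[0]

    # Rebuild the body in key=value&key=value form.
    # Values are not escaped, so a value containing "&", "=", or a blank line breaks this.
    content = "&".join(data_list)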

If anybody can still solve my problem without having to manually parse the body, please let me know!
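
A somewhat less fragile way to do the same manual parse is to split on the boundary delimiter rather than on blank lines, and to URL-escape the values when rebuilding the urlencoded string. This is only a sketch in Python 3; the names `parse_multipart_manually` and `to_urlencoded` are invented, and the boundary is assumed to come from the `boundary=` parameter of the Content-Type header.

```python
import re
from urllib.parse import urlencode

# Matches name="..." but not filename="..." in a Content-Disposition header.
_NAME_RE = re.compile(r'(?:^|;)\s*name="([^"]*)"')


def parse_multipart_manually(body, boundary):
    """Return {field name: value as bytes} by splitting on the boundary."""
    delimiter = b"--" + boundary
    fields = {}
    # The chunk before the first delimiter is the (usually empty) preamble.
    for chunk in body.split(delimiter)[1:]:
        if chunk.startswith(b"--"):       # "--boundary--" closes the body
            break
        head, _, value = chunk.partition(b"\r\n\r\n")
        match = _NAME_RE.search(head.decode("utf-8"))
        if match is None:                 # part without a field name: skip it
            continue
        if value.endswith(b"\r\n"):       # CRLF before the next delimiter
            value = value[:-2]
        fields[match.group(1)] = value
    return fields


def to_urlencoded(fields):
    """Rebuild a urlencoded body, escaping "&", "=" and friends in values."""
    return urlencode(fields)
```

Unlike the blank-line split, a value that itself contains an empty line survives here, because only the first blank line in each part separates headers from content.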

0

There's the streaming-form-data project that provides a Python parser to parse data that's multipart/form-data encoded. It's intended to allow parsing data in chunks, but since there's no chunk size enforced, you could just pass your entire input at once and it should do the job. It should be installable via pip install streaming_form_data.

Here's the source code - https://github.com/siddhantgoel/streaming-form-data

Documentation - https://streaming-form-data.readthedocs.io/en/latest/

Disclaimer: I'm the author. Of course, please create an issue in case you run into a bug. :)
