Situation

I am using HttpClient (System.Net.Http, Version=4.2.1.0) to POST a multipart form data request to a web API. The form data includes a string parameter (benchmark) and a file (addressFile) whose contents are held in the variable stream. The API call returns a CSV file, which I want to save to disk.

The response contains the header Transfer-Encoding: chunked, and the data in responseBytes includes the chunk headers (the hexadecimal chunk-size lines). I would expect the HttpClient library to strip these out, since they are transport metadata rather than part of the actual content. Instead, it simply leaves the chunk-size lines in the Content.
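One way to confirm what is described above is to look at the Transfer-Encoding flag and the first few bytes of the body side by side. The following is only a diagnostic sketch: the class name ChunkDiagnostics, the method DescribeAsync, and the previewLength parameter are hypothetical names introduced here, not part of HttpClient.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static class ChunkDiagnostics
{
    // Hypothetical helper: summarizes the chunked flag and the first bytes of the body.
    public static async Task<string> DescribeAsync(HttpResponseMessage response, int previewLength = 16)
    {
        // TransferEncodingChunked is a nullable bool; null means the header was absent.
        bool chunkedHeader = response.Headers.TransferEncodingChunked == true;

        byte[] body = await response.Content.ReadAsByteArrayAsync();
        int count = Math.Min(previewLength, body.Length);

        // Escape CR/LF so a leading chunk-size line such as "0fe8\r\n" is visible.
        string preview = Encoding.ASCII.GetString(body, 0, count)
            .Replace("\r", "\\r")
            .Replace("\n", "\\n");

        return $"chunked header: {chunkedHeader}; body starts with: {preview}";
    }
}
```

If the preview starts with a short hexadecimal line, the chunk framing is still present in the body. For reference, the sizes in the output shown in this question are 0x0fe8 = 4072 bytes and 0x0060 = 96 bytes.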

Question

What is the correct way to handle these chunk headers?

I could of course write a method to handle the headers myself, but I find it hard to believe that the HttpClient library doesn't already have this functionality baked in somewhere.

Code

    using (var client = new HttpClient())
    {
        var content = new MultipartFormDataContent();
        content.Add(new StringContent("Public_AR_Current"), "benchmark");
        content.Add(new ByteArrayContent(stream.ToArray()), "addressFile", "addressFile.csv");

        var response = await client.PostAsync("https://geocoding.geo.census.gov/geocoder/locations/addressbatch", content);

        var responseBytes = await response.Content.ReadAsByteArrayAsync();
        saveResponse(responseBytes);

        var geocodedItems = ParseGeocodeResponse(responseBytes);
        var parsedItems = geocodedItems.Select(gi => gi.ToEpaHandlerUsCensusGeocode());
        return parsedItems;
    }

Result

Note the chunk headers on the first line and at later points in the output (0fe8, 0060, 0fe8).

0fe8
"AK0000036228","500 HOLLYWOOD DR, ANCHORAGE, AK, 99501","Match","Exact","500 HOLLYWOOD DR, ANCHORAGE, AK, 99501","-149.87424,61.23034","190797469","R"
"AK0000363994","3155 E 18TH CIR, ANCHORAGE, AK, 99508","Match","Non_Exact","3155 E 18TH CIR, ANCHORAGE, AK, 99508","-149.82193,61.20462","190799569","L"
...
0060
28712","N 65 DEG 35 15 W 167 DEG 55 18, WALES, AK, 99734","No_Match"
"AK0000112227","KODIAK ARPR
...
0fe8
T AREA, KODIAK, AK, 99615","No_Match"
"AK0000033902","2130 E DIMOND BLVD, ANCHORAGE, AK, 99515","Match","Non_Exact","2130 W DIMOND BLVD, ANCHORAGE, AK, 99515","-149.91881,61.1375","190795925","L"
"AK0000562769","3100 TONGASS AVE, KETCHIKAN, AK, 99901-5746","No_Match"

Expected Result

I would expect the chunk headers to be stripped out by the HttpClient library.

"AK0000036228","500 HOLLYWOOD DR, ANCHORAGE, AK, 99501","Match","Exact","500 HOLLYWOOD DR, ANCHORAGE, AK, 99501","-149.87424,61.23034","190797469","R"
"AK0000363994","3155 E 18TH CIR, ANCHORAGE, AK, 99508","Match","Non_Exact","3155 E 18TH CIR, ANCHORAGE, AK, 99508","-149.82193,61.20462","190799569","L"
"AK0000228718","1050 ASPEN ST, FAIRBANKS, AK, 99709-5501","Match","Exact","1050 ASPEN ST, FAIRBANKS, AK, 99709","-147.7731,64.8535","605310042","L"
"AK0000536714","SMITH COVE IN SMITH LAGOON T74S R86E CRM S17 & 20, KASAAN, AK, 99901","No_Match"
"AK0001413822","USS-12403, N BANK WOOD RIVER, ALEKNAGIK, AK, 99555","No_Match"
"AK0000489567","BREAKWATER BTWN WESTERN AVE & TAIT ST, METLAKATLA, AK, 99926","No_Match"

1 Answer

I ended up writing this extension method which performs sufficiently well for my use case.

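    public static async Task<Stream> ReadAsStreamAsync(this HttpContent content, bool isChunked)
    {
        var stream = await content.ReadAsStreamAsync();
        if (!isChunked)
        {
            return stream;
        }

        var outputStream = new MemoryStream();
        var buffer = new char[1024 * 1024];

        // No using() so that we don't dispose stream or outputStream.
        var tr = new StreamReader(stream);
        var tw = new StreamWriter(outputStream);

        while (!tr.EndOfStream)
        {
            // Each chunk starts with a line holding its size in hexadecimal.
            var chunkSizeStr = tr.ReadLine().Trim();
            var chunkSize = int.Parse(chunkSizeStr, System.Globalization.NumberStyles.HexNumber);
            if (chunkSize == 0)
            {
                break; // A zero-length chunk marks the end of the body.
            }

            // Note: ReadBlock counts characters while chunk sizes count bytes,
            // so this assumes single-byte (e.g. ASCII) content.
            if (chunkSize > buffer.Length)
            {
                buffer = new char[chunkSize];
            }
            var read = tr.ReadBlock(buffer, 0, chunkSize);
            tw.Write(buffer, 0, read);
            tr.ReadLine(); // Skip the line break that follows the chunk data.
        }

        // Flush buffered output and rewind so callers read from the start.
        tw.Flush();
        outputStream.Position = 0;
        return outputStream;
    }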
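The method above reads the body through a StreamReader, so chunk sizes (which count bytes) are applied to decoded characters. That is only equivalent for single-byte content. As an alternative, here is a minimal byte-oriented sketch. The class ChunkedBodyDecoder and its methods are hypothetical names introduced for illustration, and the sketch assumes the body really does still contain chunk framing.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

public static class ChunkedBodyDecoder
{
    // Hypothetical helper: removes chunk framing from a body that still contains it.
    public static MemoryStream Decode(Stream input)
    {
        var output = new MemoryStream();
        string sizeLine;
        while ((sizeLine = ReadLine(input)) != null)
        {
            // Ignore any chunk extensions after ';'.
            string hex = sizeLine.Split(';')[0].Trim();
            if (hex.Length == 0) continue;   // tolerate stray blank lines
            int size = int.Parse(hex, NumberStyles.HexNumber, CultureInfo.InvariantCulture);
            if (size == 0) break;            // a zero-length chunk ends the body

            var buffer = new byte[size];
            int read = 0;
            while (read < size)
            {
                int n = input.Read(buffer, read, size - read);
                if (n == 0) throw new EndOfStreamException("Body ended inside a chunk.");
                read += n;
            }
            output.Write(buffer, 0, size);
            ReadLine(input);                 // consume the line break after the chunk data
        }
        output.Position = 0;                 // rewind so callers read from the start
        return output;
    }

    // Reads bytes up to LF, drops a trailing CR, and returns null at end of stream.
    private static string ReadLine(Stream input)
    {
        var sb = new StringBuilder();
        int b;
        while ((b = input.ReadByte()) != -1)
        {
            if (b == '\n') return sb.ToString().TrimEnd('\r');
            sb.Append((char)b);
        }
        return sb.Length > 0 ? sb.ToString() : null;
    }
}
```

A possible call site would be ChunkedBodyDecoder.Decode(new MemoryStream(responseBytes)), after which the returned stream can be saved or parsed as plain CSV.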

1 Comment

Hey @MattEvansDev, it's weird: for me it's happening the other way around. With HttpClient or HttpWebRequest the content is already parsed. I tested with curl (curl -ivs --raw -v http://www.httpwatch.com/httpgallery/chunked/chunkedimage.aspx) and the raw response is chunked. I have tested .NET 2.0, 4.7 (System.Net.Http, Version=4.0.0.0) and .NET 8.0, and in none of these versions do I need to parse the chunked content. It's cool that I don't have to, but I don't understand why. I also tested a socket version sending and receiving raw content, and there the content is chunked, so I concluded that HttpClient parses it.
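Since this comment reports that HttpClient normally returns an already-decoded body, it may be safer to run any manual de-chunking only when the body actually looks chunk-framed. The following is a rough heuristic sketch, not a guarantee: ChunkFramingCheck and LooksChunkFramed are hypothetical names, and a payload whose first line happens to be a short hexadecimal string would be misdetected.

```csharp
using System;

public static class ChunkFramingCheck
{
    // Hypothetical heuristic: true if the body begins with 1-8 hex digits
    // followed immediately by a line break, as a chunk-size line would.
    public static bool LooksChunkFramed(byte[] body)
    {
        int i = 0;
        while (i < body.Length && i < 8 && Uri.IsHexDigit((char)body[i]))
        {
            i++;
        }

        // No hex digits at all, or nothing after them: not a chunk-size line.
        if (i == 0 || i >= body.Length) return false;

        return body[i] == '\n'
            || (body[i] == '\r' && i + 1 < body.Length && body[i + 1] == '\n');
    }
}
```

For the output shown in the question, which starts with the line 0fe8, this check returns true. For the expected output, which starts with a quoted CSV field, it returns false.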
