
Thanks in advance.

Short:

ExpressJS 4.0 alters the response data depending on the Accept header in the request. Is there a way for me to override this behaviour and just write the same data regardless of the request headers?

When Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 is present, the output is changed.

Is there a way I can ignore, remove or override these headers?

Long (probably tl;dr):

I am trying to serve binary data from a Node/ExpressJS app. I am storing a compressed log file (text/plain) that has been gzipped, base64 encoded and sent to my server app, where it is stored in a mongo database using mongoose. I know this is probably not optimal, but it is currently a necessary evil. This part is working fine.

$(gzip --stdout /var/log/cloud-init-script.log | base64 --wrap=0)

This is used to compress and base64 encode the data before it is sent, along with other data, as part of a JSON POST.

The problem occurs when I retrieve the base64 encoded string, decode it and send it to the browser as a binary gzip file.
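For checking a stored string independently of the download path, the same round trip can be sketched in Node. This is only an illustration: encodeLog and decodeLog are hypothetical helper names, and zlib.gzipSync, zlib.gunzipSync and Buffer.from assume a newer Node than v0.10.

```javascript
var zlib = require('zlib');

// Hypothetical helper: gzip a text log and base64 encode it,
// mirroring `gzip --stdout ... | base64 --wrap=0`.
function encodeLog(text) {
  return zlib.gzipSync(Buffer.from(text, 'utf8')).toString('base64');
}

// Hypothetical helper: reverse the steps to recover the original text.
function decodeLog(b64) {
  return zlib.gunzipSync(Buffer.from(b64, 'base64')).toString('utf8');
}

var stored = encodeLog("I'm a log file\n");
// A valid gzip stream always starts with the magic bytes 1f 8b.
console.log(Buffer.from(stored, 'base64').slice(0, 2).toString('hex'));
console.log(decodeLog(stored));
```

If decodeLog succeeds on the value read back from the database, the stored data is intact and any corruption is happening on the way out.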

// node, referring to the machine the log came from
var log = new Buffer(node.log, 'base64');

res.setHeader('Content-Disposition', 'attachment; filename=' + node.name + "-log.gz");
res.setHeader('Content-Type', 'application/x-gzip');
res.setHeader('Content-Length', log.length);

console.log(log.toString('hex'));
// res.end(log, 'binary'); I tried this hoping I could bypass some content negotiation
res.send(log);

I had this working with res.send on ExpressJS 3.0. But when I updated to ExpressJS 4.0, the downloaded file ceased to extract properly. The data being pulled down was seemingly corrupted somehow.

I started to try and fix this by comparing the downloaded file and the source file in hexadecimal output using xxd or od, and found that the downloaded file was different to the source. I also dumped the hex of the NodeJS Buffer to the console just before it is sent to the client, and this matches the source.

I have been banging my head against this issue for nearly a day now, and have suspected that NodeJS might be doing something funky with character encoding (UTF-8 v. Buffer v. UTF-16 strings) or OS endianness.

Eventually, finding none of this to be the problem, I assumed NodeJS had always been outputting the wrong data to the browser. It was outputting the wrong data, but not always.

I had a breakthrough when I made a curl request to the endpoint and the data came through as expected (matching the source). I then added the request headers that were sent with my browser requests, and got back the mangled data.

Actual log file:

I'm a log file

Good Request:

> User-Agent: curl/7.37.1
> Host: 127.0.0.1:9000
> Accept: */*
> 
< HTTP/1.1 200 OK
< X-Powered-By: Express
< Last-Modified: Tue, 26 May 2015 11:47:46 GMT
< Content-Description: File Transfer
< Content-Disposition: attachment; filename=test-log.gz
< Content-Type: application/x-gzip
< Content-Transfer-Encoding: binary
< Content-Length: 57
< Date: Tue, 26 May 2015 11:47:46 GMT
< Connection: keep-alive

0000000: 1f8b 0808 0256 6455 0003 636c 6f75 642d  .....VdU..cloud-
0000010: 696e 6974 2d73 6372 6970 742e 6c6f 6700  init-script.log.
0000020: f354 cf55 4854 c8c9 4f57 48cb cc49 e502  .T.UHT..OWH..I..
0000030: 003b 5ff5 5f0f 0000 00                   .;_._....

Bad Request:

> Host: localhost:9000
> Connection: keep-alive
> Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
> User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.65 Safari/537.36
> Referer: http://localhost:9000/nodes?query=environment%3D5549b6cbdc023b5e26fe6bd4%20type%3Dnat
> Accept-Language: en-US,en;q=0.8
> 
< HTTP/1.1 200 OK
< X-Powered-By: Express
< Last-Modified: Tue, 26 May 2015 11:47:00 GMT
< Content-Description: File Transfer
< Content-Disposition: attachment; filename=test-log.gz
< Content-Type: application/x-gzip
< Content-Transfer-Encoding: binary
< content-length: 57
< Date: Tue, 26 May 2015 11:47:00 GMT
< Connection: keep-alive

0000000: 1ffd 0808 0256 6455 0003 636c 6f75 642d  .....VdU..cloud-
0000010: 696e 6974 2d73 6372 6970 742e 6c6f 6700  init-script.log.
0000020: fd54 fd55 4854 fdfd 4f57 48fd fd49 fd02  .T.UHT..OWH..I..
0000030: 003b 5ffd 5f0f 0000 00                   .;_._....

1 Answer

res.end(node.log, 'base64');

instead of

res.send(log);

Where node.log is the raw base64 encoded string and log was a Buffer holding the decoded contents of that string.

Bear in mind that I am using Node v0.10.38.
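Put together with the headers from the question, the workaround might look roughly like this. It is a sketch, not the exact handler I run: sendLog is a hypothetical helper name, and node is assumed to be an object with name and log (base64 string) properties as in the question.

```javascript
// Hypothetical helper: sendLog is not part of ExpressJS or NodeJS.
function sendLog(res, node) {
  res.setHeader('Content-Disposition', 'attachment; filename=' + node.name + '-log.gz');
  res.setHeader('Content-Type', 'application/x-gzip');
  // Length of the decoded bytes, not of the base64 text.
  res.setHeader('Content-Length', Buffer.byteLength(node.log, 'base64'));
  // Hand the base64 string straight to end() and let it do the decoding,
  // instead of passing a pre-decoded Buffer to res.send().
  res.end(node.log, 'base64');
}
```

Inside a route handler it would be called as sendLog(res, node) once node has been loaded from the database.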

I ended up following the function call chain.

// I call
res.send(log);
// ExpressJS calls on http.ServerResponse
this.end(chunk, encoding); // chunk = Buffer, encoding = undefined
// NodeJS http.ServerResponse calls
res.inject(string);

At this point NodeJS appears to be treating the data as a string, which is where the buffer contents were being mangled.
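The 0xfd bytes in the bad dump are consistent with this kind of string round trip. A minimal sketch, assuming (I have not confirmed this by tracing the code) that the Buffer is decoded as UTF-8 and then written back out one byte per character. mangle is a hypothetical helper, and Buffer.from and the 'latin1' encoding name need a newer Node than v0.10.38:

```javascript
// Hypothetical helper reproducing the suspected corruption.
function mangle(buf) {
  // Bytes that are not valid UTF-8 decode to the replacement character U+FFFD.
  var asString = buf.toString('utf8');
  // 'latin1' keeps only the low byte of each character, so U+FFFD becomes 0xfd.
  return Buffer.from(asString, 'latin1');
}

// First four bytes of the gzip file from the question.
var head = Buffer.from('1f8b0808', 'hex');
console.log(mangle(head).toString('hex')); // 1ffd0808
```

This matches the dumps in the question, where 1f8b became 1ffd and every other byte above 0x7f was also replaced by fd, while the plain ASCII bytes such as the file name came through untouched.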

This behaviour was different when the 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' header was not present. In that case a different end(chunk, encoding) function was being called, one that did not use res.inject and did not mangle the Buffer data.

I am not entirely sure where the content negotiation is happening or what is swapping in the different res.end functions, or whether this is NodeJS or ExpressJS, but it would be nice to be able to control this content negotiation in some simple way.
