2

My company uses a proprietary application server in which the server-side programs are written in JavaScript (not Node.js). The platform is at a very early stage and support isn't that good.

Here is my problem:

I have to process an uploaded CSV file on the server side. I am using the excellent answer to "How can I upload files asynchronously?" (passing the FormData object with jQuery), and I am able to access the sent file on the server side. But how do I parse it?

It looks like this:

------WebKitFormBoundaryU5rJUDxGnj15hIGW
Content-Disposition: form-data; name="fileToUpload"; filename="test.csv"
Content-Type: application/vnd.ms-excel

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20

------WebKitFormBoundaryU5rJUDxGnj15hIGW--

I'm really confused about how to handle this file with plain JavaScript on the server side.

Please help.

3 Answers

13

The best option would be to use node-formidable, bundled with browserify and polyfilled. Otherwise, here is a standalone parser that works with both string and raw (ArrayBuffer) responses. The raw path relies on typed arrays (Uint8Array), so make sure you use a modern browser for it.

/* 
 * MultiPart_parse decodes a multipart/form-data encoded response into a named-part-map.
 * The response can be a string or raw bytes.
 *
 * Usage for string response:
 *      var map = MultiPart_parse(xhr.responseText, xhr.getResponseHeader('Content-Type'));
 *
 * Usage for raw bytes:
 *      xhr.open(..);     
 *      xhr.responseType = "arraybuffer";
 *      ...
 *      var map = MultiPart_parse(xhr.response, xhr.getResponseHeader('Content-Type'));
 *
 * TODO: Can we use https://github.com/felixge/node-formidable
 * See http://stackoverflow.com/questions/6965107/converting-between-strings-and-arraybuffers
 * See http://www.w3.org/Protocols/rfc1341/7_2_Multipart.html
 *
 * Copyright@ 2013-2014 Wolfgang Kuehn, released under the MIT license.
*/
function MultiPart_parse(body, contentType) {
    // Examples for content types:
    //      multipart/form-data; boundary="----7dd322351017c"; ...
    //      multipart/form-data; boundary=----7dd322351017c; ...
    var m = contentType.match(/boundary=(?:"([^"]+)"|([^;]+))/i);

    if ( !m ) {
        throw new Error('Bad content-type header, no multipart boundary');
    }

    var boundary = m[1] || m[2];

    function Header_parse(header) {
        var headerFields = {};
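        // Match the name parameter only; \b keeps filename="..." from matching,
        // and the parameter does not have to be the last one on the line.
        var matchResult = header.match(/\bname="([^"]*)"/i);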
        if ( matchResult ) headerFields.name = matchResult[1];
        return headerFields;
    }

    function rawStringToBuffer( str ) {
        var idx, len = str.length, arr = new Array( len );
        for ( idx = 0 ; idx < len ; ++idx ) {
            arr[ idx ] = str.charCodeAt(idx) & 0xFF;
        }
        return new Uint8Array( arr ).buffer;
    }

    // \r\n and the leading '--' are part of the delimiter.
    boundary = '\r\n--' + boundary;

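    var isRaw = typeof(body) !== 'string';
    var s;

    if ( isRaw ) {
        // Decode the bytes into a binary string in chunks; a single apply()
        // call can exceed the engine's argument limit for large bodies.
        var view = new Uint8Array( body );
        s = '';
        for ( var k = 0; k < view.length; k += 0x8000 ) {
            s += String.fromCharCode.apply(null, view.subarray(k, k + 0x8000));
        }
    } else {
        s = body;
    }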

    // Prepend what has been stripped by the body parsing mechanism.
    s = '\r\n' + s;

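    // Split on the literal delimiter; a boundary may contain characters
    // that are special in a regular expression.
    var parts = s.split(boundary),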
        partsByName = {};

    // First part is a preamble, last part is closing '--'
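    for (var i=1; i<parts.length-1; i++) {
      // Split at the first blank line only; the content may itself contain '\r\n\r\n'.
      var headerEnd = parts[i].indexOf('\r\n\r\n');
      if ( headerEnd === -1 ) continue;
      var headers = parts[i].substring(0, headerEnd).split('\r\n');
      var content = parts[i].substring(headerEnd + 4);
      var fieldName = null;
      for (var j=1; j<headers.length; j++) {
        var headerFields = Header_parse(headers[j]);
        if ( headerFields.name ) {
            fieldName = headerFields.name;
        }
      }

      // Skip parts that carry no name parameter.
      if ( fieldName !== null ) {
        partsByName[fieldName] = isRaw?rawStringToBuffer(content):content;
      }
    }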

    return partsByName;
}

1 Comment

Very useful, but it has a bug: if the content of a part contains "\r\n\r\n", the part is split there as well, so the returned content is not the original value but only the portion up to the first "\r\n\r\n". A fix is to split at the first occurrence only:

var contentStartPos = parts[i].indexOf("\r\n\r\n") + 4;
var partHeaders = parts[i].substr(0, contentStartPos - 4).split("\r\n");
var partContent = parts[i].substr(contentStartPos);
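
The fix described in this comment can be sketched as a small standalone helper. This is only an illustration: `splitPart` is a hypothetical name that does not appear in the answer, and it assumes the part is already a string in which the headers and the content are separated by the first blank line.

```javascript
// Hypothetical helper: split one multipart part into header lines and content.
// Only the first "\r\n\r\n" is treated as the header/content separator,
// so blank lines inside the content are preserved.
function splitPart(part) {
    var sep = part.indexOf('\r\n\r\n');
    if (sep === -1) {
        // No blank line found: treat the whole part as headers, with empty content.
        return { headers: part.split('\r\n'), content: '' };
    }
    return {
        headers: part.substring(0, sep).split('\r\n'),
        content: part.substring(sep + 4)
    };
}

// Example part whose content itself contains a blank line.
var part = 'Content-Disposition: form-data; name="notes"\r\n\r\nline one\r\n\r\nline two';
var result = splitPart(part);
console.log(result.headers);
console.log(JSON.stringify(result.content));
```

Splitting with `split('\r\n\r\n')` and taking element 1 would have dropped "line two" here, which is the behaviour the comment reports.
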
1

In modern browsers you can use the built-in Response.formData() method:

const multipartPayload =
`------WebKitFormBoundaryU5rJUDxGnj15hIGW\r
Content-Disposition: form-data; name="fileToUpload"; filename="test.csv"\r
Content-Type: application/vnd.ms-excel\r
\r
1
2
3
4
5
...
\r
------WebKitFormBoundaryU5rJUDxGnj15hIGW--`

const boundary = multipartPayload.slice(2, multipartPayload.indexOf('\r\n'))
new Response(multipartPayload, {
  headers: {
    'Content-Type': `multipart/form-data; boundary=${boundary}`
  }
})
  .formData()
  .then(formData => {
    // The parsed data
    console.log([...formData.entries()])
    
    // In this particular example fileToUpload is a file,
    // so you need to call one more method to read its content
    formData.get('fileToUpload')
      .text()
      .then(text => console.log(text))
  })

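The explicit \r escapes in the example payload are necessary because line breaks in the multipart structure (the boundary lines, the part headers, and the blank line after them) must be \r\n; only the content of each part may use other line endings. If you already have a properly formed multipart/form-data payload, you don't need to add \r.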
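
To make the line-ending rule concrete, here is a minimal sketch that assembles a single-file payload by joining the structural lines with \r\n and leaving the file content untouched. `buildMultipartBody` and the boundary value `----ExampleBoundary` are hypothetical names chosen for illustration, not part of any API.

```javascript
// Hypothetical helper: build a single-file multipart/form-data body.
// Structural lines are joined with \r\n; the file content is inserted as-is.
function buildMultipartBody(boundary, fieldName, fileName, contentType, content) {
    return [
        '--' + boundary,
        'Content-Disposition: form-data; name="' + fieldName + '"; filename="' + fileName + '"',
        'Content-Type: ' + contentType,
        '',          // blank line that separates the headers from the content
        content,
        '--' + boundary + '--'
    ].join('\r\n');
}

// The content keeps its own \n line endings.
var body = buildMultipartBody('----ExampleBoundary', 'fileToUpload', 'test.csv', 'text/csv', '1\n2\n3\n');
console.log(JSON.stringify(body));
```

A body built this way should be usable with the `new Response(...).formData()` call shown above, passing the same boundary in the Content-Type header.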

If you want to parse an HTTP response, you can use the fetch response directly:

fetch('/formdata-response')
  .then(response => response.formData())
  .then(formData => console.log([...formData.entries()]))

0

OK, so I managed to do this myself. I tested it in a couple of browsers and noticed that the header was 3 lines and the footer was 1 line.

I wrote a simple parser that splits the file on the newline character and puts it into an array, line by line.

This has been enough for my processing so far.

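// fileContent is the uploaded payload as a string
function doProcess(fileContent)
{
    var i = 0;
    var str = '';
    var arrayList = [];

    for (i = 0; i < fileContent.length; i++)
    {
        if (fileContent[i] !== '\n')
        {
            str += fileContent[i];
        }
        else
        {
            // Drop the \r left over from a \r\n line ending
            str = str.replace(/\r/g, "");
            if (str.trim() !== "")
            {
                arrayList.push(str);
            }
            str = '';
        }
    }

    // Keep the last line if the payload does not end with a newline,
    // so that the footer removal below always removes the boundary line
    str = str.replace(/\r/g, "");
    if (str.trim() !== "")
    {
        arrayList.push(str);
    }

    // Remove header (boundary, Content-Disposition, Content-Type)
    arrayList.splice(0,3);

    // Remove footer (closing boundary)
    arrayList.splice(arrayList.length-1,1);

    // arrayList is an array of all the lines in the file
    console.log(arrayList);
    return arrayList;
}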
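
Counting header and footer lines depends on the browser always sending exactly three header lines. As a hedged alternative, the sketch below locates the content by structure instead: it starts after the first blank line and stops where the boundary line reappears. `extractLines` is a hypothetical name, and the code assumes a string payload with a single part whose first line is the boundary delimiter.

```javascript
// Hypothetical helper: return the non-empty content lines of a single-part
// multipart/form-data payload, without assuming a fixed number of header lines.
function extractLines(payload) {
    var firstLineEnd = payload.indexOf('\n');
    if (firstLineEnd === -1) return [];
    // The first line is the delimiter ("--" + boundary), possibly followed by \r.
    var delimiter = payload.substring(0, firstLineEnd).replace(/\r$/, '');

    // The content starts after the first blank line (\r\n\r\n or \n\n).
    var blank = /\r?\n\r?\n/.exec(payload);
    if (!blank) return [];
    var start = blank.index + blank[0].length;

    // The content ends where the delimiter appears again.
    var end = payload.indexOf(delimiter, start);
    if (end === -1) end = payload.length;

    var lines = payload.substring(start, end).split(/\r?\n/);
    var result = [];
    for (var i = 0; i < lines.length; i++) {
        // Skip empty or whitespace-only lines.
        if (lines[i].replace(/\s/g, '') !== '') result.push(lines[i]);
    }
    return result;
}

var sample =
    '------B\r\n' +
    'Content-Disposition: form-data; name="fileToUpload"; filename="test.csv"\r\n' +
    'Content-Type: application/vnd.ms-excel\r\n' +
    '\r\n' +
    '1\r\n2\r\n3\r\n' +
    '\r\n------B--\r\n';
var csvLines = extractLines(sample);
console.log(csvLines);
```

This still assumes the file content never contains the delimiter string itself, which the boundary is chosen to guarantee in a well-formed payload.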

1 Comment

just want to say i like your approach
