0

I am scraping data from a URL containing a csv. The format of the data I'm scraping is like this:

I am doing this in Node.js and using the nodejs-requestify package: https://www.npmjs.com/package/requestify

When I console the response.getBody() it's in the exact same format as the screenshot provided.

I am trying to convert this to a JSON array that I can iterate over in a loop to insert the values into a database, however, am struggling to get the data into JSON format.

I've tried splitting the array in multiple ways (comma, single quote, double quote). I've tried JSON.parse() and JSON.stringify() (and both in combination).

Here is the code I'm using. Ultimately when I console.log rows in the loop, this is where it should be in JSON format, however, it's just coming in as comma separated values still.

requestify.get('URL').then(function(response) {
  // Get the response body
  var dataBody = response.getBody();
  var lineArray = dataBody.split('\r\n');
  var data = JSON.parse(JSON.stringify(lineArray));

  for(var s = 0; s < data.length; s++) {
    var rows = data[s];
    console.log(rows)
  }
});
5
  • In general it's not safe to assume that splitting on line breaks will give you the rows of a CSV. With quoted fields a line break can appear within a field. I strongly recommend using an existing CSV module like csv-parse or csv-parser which will handle the edge cases for you. Commented May 24, 2019 at 16:16
  • 1
    Possible duplicate of How to convert CSV to JSON in Node.js Commented May 24, 2019 at 16:18
  • OK I will look into csv-parse Commented May 24, 2019 at 16:21
  • So the lineArray gives you the value you have shown in the image ? One more thing is you are doing JSON.stringify the JSON.parse which is not correct. Commented May 24, 2019 at 17:24
  • I solved this. Posted as an answer. Commented May 24, 2019 at 20:39

2 Answers 2

0

There is a basic misundertanding I think

var lineArray = dataBody.split('\r\n');

lineArray now contains something like

"a", "b", "c"

but for doing something like

var data = JSON.parse(lineArray);

you need lineArray to be

{ "a":"1", "b":"2", "c":"3" }

I think you need something like

const lineData = lineArray.split(',');
const keys = ["name", "age", "gender"];
const jsonLineData = {};
keys.forEach((key, index) => {
    jsonLineData[key] = lineData(index);
});
Sign up to request clarification or add additional context in comments.

2 Comments

To confirm, in the keys array I put the headers of my csv?
This did not work for me. Maybe you can elaborate on this code?
0

I solved this by using csvtojson and aws-dsk since my csv is hosted on S3.

async function startAWS(db){
  //Retrieve AWS IAM credentials for the 'master' user
  var awsCredentials;
  try{ 
    awsCredentials = await retrievePromise(config.get('aws')); 
  }
  catch (e) { 
    console.log({error:e},'startAWS error'); 
  }
  //Setup the AWS config to access our S3 bucket
  AWS.config = new AWS.Config({
    accessKeyId :  awsCredentials.principal,
    secretAccessKey :awsCredentials.credential,
    region:'us-east-1'
  });
  //Call S3 and specify bucket and file name
  const S3 = new AWS.S3();
  const params = {
    Bucket: '***',
    Key: '***' //filename
  };
  //Convert csv file to JSON
  async function csvToJSON() {
  // get csv file and create stream
  const stream = S3.getObject(params).createReadStream();
  // convert csv file (stream) to JSON format data
  const json = await csv().fromStream(stream);

  //connect to DB and continue script
  db.getConnection()
  .then(async (conn) => {
    if(json.length) {
      for(var s = 0; s < json.length; s++) {
        var rows = json[s];

        const insert = await conn.query(
          'SQL HERE'
        );
      }
    }
  })
  };
  csvToJSON();
}


Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.