
I have a very basic script that I've been adding functionality to bit by bit. It is hardly optimized.

It's supposed to go through X brands, run an API call for each one (each has a unique endpoint), and then save the results to a CSV.

The CSV portion worked perfectly, although it was a tad slow (~20 seconds for a 200KB file).

However, when I wrapped all of it in a loop over the brands, it stopped finishing. By default it runs for about 2 minutes and then gives me the error FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory

I learned that I can append --max_old_space_size=4000000 to give it more memory (it also died at 2GB). At 4GB I didn't get the error, but I was getting Windows memory leak warnings, so it's VERY obvious my script is having issues. I feel like it's looping within loops, but I'm unsure.

var http = require("https");
var qs = require("querystring");
var csvWriter = require('csv-write-stream');
var fs = require('fs');
var moment = require('moment');
let key = '';
var brands = ['REDACTED1','REDACTED2','REDACTED3','REDACTED4','REDACTED5','REDACTED6','REDACTED7','REDACTED8']
var dataRequest = function(token){
  var date = new Date()-1; //Make it yesterday
  var formattedDate = moment(date).format('YYYYMMDD');
  var yesterdayDashes = moment(date).format('YYYY-MM-DD');
  for (var k=0;k<=brands.length;k++){

    var headers = [];
    var csvBody = [];
    var csvContent = "data:text/csv;charset=utf-8,";
    var options = {
      "method": "GET",
      "hostname": "REDACTED",
      "port": "443",
      "path": "REDACTED/"+brands[k]+"/"+yesterdayDashes+"?REDACTED&access_token="+token,
      "headers": {
        "authentication": "Bearer "+token,
        "content-type": "application/json",
        "cache-control": "no-cache"
      }
    }

    var req = http.request(options, function (res) {
      var chunks = [];

      res.on("data", function (chunk) {
        chunks.push(chunk);
      });

      res.on("end", function () {
        var body = Buffer.concat(chunks);
      var object = JSON.parse(body);
      var writer = csvWriter({
        headers: ["REDACTED1", "REDACTED2", "REDACTED3", "REDACTED4", "REDACTED5", "REDACTED6", "REDACTED7", "REDACTED8", "REDACTED9"]
      })
      writer.pipe(fs.createWriteStream(brands[k] +'_' +formattedDate +'_DEMOGRAPHIC.csv'))
      for (var i=0;i<object.length; i++){
        writer.write([
          object[i].field1.REDACTED1, 
          moment(object[i].optin.REDACTED2).format('YYYYMMDD HHMMSS'), 
          object[i].field2.REDACTED1, 
          object[i].field2.REDACTED2, 
          object[i].field2.REDACTED3, 
          object[i].field2.REDACTED4, 
          object[i].field2.REDACTED5, 
          object[i].field3.REDACTED6, 
          object[i].field3.REDACTED7
        ])
      }
      writer.end()
      });
    });
    req.end();
  }
}

var authToken = function(){

  var form = qs.stringify({
    grant_type: 'client_credentials',
    client_credentials: 'client_id:client_secret',
    client_id: 'cdg-trusted-client',
    client_secret: 'REDACTED'
  })
  var options = {
    "method": "POST",
    "hostname": "REDACTED3",
    "port": null,
    "path": "REDACTED3",
    "headers": {
      "Content-Type": "application/x-www-form-urlencoded",
      "cache-control": "no-cache"
    }
  };

  var req = http.request(options, function (res) {
    var chunks = [];

    res.on("data", function (chunk) {
      chunks.push(chunk);
    });

    res.on("end", function () {
      var body = Buffer.concat(chunks);
      var json = JSON.parse(body);
      key = json['access_token'];
    });
  });

  req.write(form);
  req.end();
  dataRequest(key);
}
authToken();

I removed sensitive information, but all of the logic remains. This is a script I quickly threw together, but honestly, going through it, I don't see any reason it should require so much memory. I thought perhaps it was looping infinitely, but when I tested each loop in node directly I had no issues.

The flow starts getting the bearer token, once, then passing it to the function to pull data.

I debated posting this on Code Review, but this code isn't technically working at all.

Update

After modifying the first for loop, the script now immediately outputs a single "unidentified" CSV and doesn't complete anything else. However, console.log(brands[k]) is printing the appropriate values.

Update 2

My version of JS debugging is putting console.log() everywhere, and I noticed that once I get below the http.request init, brands[k] suddenly becomes undefined. I think this might have to do with it not being passed into the function?

Update 3

My undefined issue was caused by a missing semicolon, ending the for loop early. I have rectified it, but now I have a Max Stack Trace issue again.

My question now seems to be "How do I make this for loop not run async?"

  • I believe Node shouldn't be the choice for such big files. I would do this operation in C and execute it from Node. Commented Jul 26, 2017 at 21:19

2 Answers

4

Here is your problem: for (var k=0;k=brands.length;k++){ You have used =, the assignment operator, which puts the length of brands into the variable k, instead of the < operator, which checks whether k is still smaller than brands.length.

What happens is that the expression tested by the loop evaluates to the number 8 (the length of the brands array), and it always remains 8, so you get an infinite loop.

Note that when a number is used as a boolean expression, any nonzero value (other than NaN) is treated as true, while 0 is treated as false.
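To see why the assignment never ends the loop, here is a minimal sketch. The brands array holds placeholder names, and the cap variable exists only so that the demo terminates; neither comes from the original script.

```javascript
var brands = ['a', 'b', 'c']; // placeholder brand names (assumption)
var cap = 1000;               // safety limit so the broken loop stops in this demo

// Broken: (k = brands.length) assigns 3 to k and evaluates to 3, which is truthy,
// so the condition never becomes false by itself. Only the cap stops it.
var brokenIterations = 0;
for (var k = 0; (k = brands.length) && brokenIterations < cap; k++) {
  brokenIterations++;
}

// Fixed: a comparison that becomes false once j reaches brands.length.
var fixedIterations = 0;
for (var j = 0; j < brands.length; j++) {
  fixedIterations++;
}

console.log(brokenIterations); // 1000 (stopped only by the cap)
console.log(fixedIterations);  // 3
```

Without the cap, the first loop would keep starting new requests until the process runs out of memory, which matches the heap error in the question.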

That covers the infinite loop. There is also a second problem in your code: the way the token is passed to the dataRequest function.

You are passing it immediately after req.end();, which sends the request.

At this point the response has not arrived yet and the res.on("end", ...) callback has not been executed, so you are passing a token that has not been set yet to dataRequest, which causes your second error. Simply move the call to dataRequest inside the end callback, like so:

res.on("end", function () {
   var body = Buffer.concat(chunks);
   var json = JSON.parse(body);
   key = json['access_token'];
   dataRequest(key);
});
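A minimal sketch of the ordering problem, assuming a hypothetical fakeAuthRequest that stands in for the token request by using a timer (the token value 'abc123' is made up):

```javascript
var key = '';
var seen = []; // records every token dataRequest receives, in order

// Hypothetical stand-in for the token request: the callback fires later,
// just like res.on("end", ...) fires only after the response has arrived.
function fakeAuthRequest(onEnd) {
  setTimeout(function () {
    onEnd('abc123'); // made-up token
  }, 10);
}

function dataRequest(token) {
  seen.push(token);
  console.log('dataRequest got: ' + JSON.stringify(token));
}

fakeAuthRequest(function (token) {
  key = token;
  dataRequest(key); // called inside the callback: the token is set
});

dataRequest(key); // called right after sending: key is still ''
```

The call on the last line runs first and receives the empty key; only the call inside the callback sees the real token.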

7 Comments

What about it? It seems fine to me. I even tried k<=brands.length just in case.
Thanks for the update. I didn't think about that. I have updated my issue - now it immediately outputs one single, empty, 'unidentified' file, but console.log() shows that brands[k] has the appropriate values.
@DNorthrup you want k < brands.length, not <=. brands[brands.length] will be undefined. The problem Ori mentions would cause your loop to make infinite requests.
No, the undefined problem relates to the way he passes the token to dataRequest. I updated my answer to cover that too.
@OriShalom Thanks for the additional details. I made the modification you recommended, but I don't quite understand how the authToken function is causing it. It's only ever called on init and not again. In addition, the dataRequest function worked successfully until I added this loop. Also, if I move dataRequest(key) into the res callback, I start getting a rejection due to 'no valid key' (basically a 401).
0
var http = require("https");
var qs = require("querystring");
var csvWriter = require('csv-write-stream');
var fs = require('fs');
var moment = require('moment');
let key = '';
var brands = ['REDACTED1','REDACTED2','REDACTED3','REDACTED4','REDACTED5','REDACTED6','REDACTED7','REDACTED8']
var dataRequest = function(token, client){
  var date = moment().subtract(1, 'days'); // Yesterday (new Date()-1 only subtracts one millisecond)
  var formattedDate = moment(date).format('YYYYMMDD');
  var yesterdayDashes = moment(date).format('YYYY-MM-DD');
    var headers = [];
    var csvBody = [];
    var csvContent = "data:text/csv;charset=utf-8,";
    var options = {
      "method": "GET",
      "hostname": "REDACTED",
      "port": "443",
      "path": "REDACTED/"+client+"/"+yesterdayDashes+"?REDACTED&access_token="+token,
      "headers": {
        "authentication": "Bearer "+token,
        "content-type": "application/json",
        "cache-control": "no-cache"
      }
    }; // <--- This was the culprit

    var req = http.request(options, function (res) {
      var chunks = [];

      res.on("data", function (chunk) {
        chunks.push(chunk);
      });

      res.on("end", function () {
        var body = Buffer.concat(chunks);
      var object = JSON.parse(body);
      var writer = csvWriter({
        headers: ["REDACTED1", "REDACTED2", "REDACTED3", "REDACTED4", "REDACTED5", "REDACTED6", "REDACTED7", "REDACTED8", "REDACTED9"]
      })
      writer.pipe(fs.createWriteStream(client +'_' +formattedDate +'_DEMOGRAPHIC.csv'))
      for (var i=0;i<object.length; i++){
        writer.write([
          object[i].field1.REDACTED1, 
          moment(object[i].optin.REDACTED2).format('YYYYMMDD HHmmss'), // mm/ss are minutes/seconds; MM/SS are month/fractional seconds
          object[i].field2.REDACTED1, 
          object[i].field2.REDACTED2, 
          object[i].field2.REDACTED3, 
          object[i].field2.REDACTED4, 
          object[i].field2.REDACTED5, 
          object[i].field3.REDACTED6, 
          object[i].field3.REDACTED7
        ])
      }
      writer.end()
      });
    });
    req.end();
}

var authToken = function(){

  var form = qs.stringify({
    grant_type: 'client_credentials',
    client_credentials: 'client_id:client_secret',
    client_id: 'cdg-trusted-client',
    client_secret: 'REDACTED'
  })
  var options = {
    "method": "POST",
    "hostname": "REDACTED3",
    "port": null,
    "path": "REDACTED3",
    "headers": {
      "Content-Type": "application/x-www-form-urlencoded",
      "cache-control": "no-cache"
    }
  };

  var req = http.request(options, function (res) {
    var chunks = [];

    res.on("data", function (chunk) {
      chunks.push(chunk);
    });

    res.on("end", function () {
      var body = Buffer.concat(chunks);
      var json = JSON.parse(body);
      key = json['access_token'];
      // Use <, not <=: brands[brands.length] is undefined
      for (var k=0;k<brands.length;k++){
         var client = brands[k];
         dataRequest(key, client);
      }
    });
  });

  req.write(form);
  req.end();

}
authToken();

In the end, I did have an issue with using k = x, as stated above. By the time I changed it to k < x, I had already made multiple other changes to the code, which caused the problems described in the updates above. In addition to the k = x issue, I had a semicolon out of place (marked in the code above), which was breaking my for loop.

I also chose to call dataRequest once per brand instead of looping over the brands inside it. The script is now working perfectly.
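A likely reason that passing the brand as an argument helps: with var, every callback created in the loop shares the same k, and the callbacks only run after the loop has finished. The sketch below is an illustration under assumptions, not the original script. It uses setTimeout as a stand-in for the HTTP response callback, placeholder brand names, and a hypothetical request function.

```javascript
var brands = ['a', 'b', 'c']; // placeholder brand names (assumption)
var fromLoopVariable = [];
var fromArgument = [];

// Variant 1: the callback reads brands[k] after the loop has finished,
// when the shared k is already equal to brands.length.
for (var k = 0; k < brands.length; k++) {
  setTimeout(function () {
    fromLoopVariable.push(brands[k]); // brands[3] is undefined
  }, 0);
}

// Variant 2: each call gets its own `client` parameter, so the callback
// still sees the right brand when it eventually runs.
function request(client) { // hypothetical stand-in for dataRequest
  setTimeout(function () {
    fromArgument.push(client);
  }, 0);
}
for (var i = 0; i < brands.length; i++) {
  request(brands[i]);
}

setTimeout(function () {
  console.log(fromLoopVariable); // three undefined entries
  console.log(fromArgument);     // the three brand names, in order
}, 10);
```

Declaring the loop counter with let instead of var would also give each iteration its own binding, but passing the value as a function argument works in older versions of Node as well.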

I added this as a separate answer to avoid inflating the question and to show the entire working code.
