1

Is there a way to split a CSV string with JavaScript when the separator can also occur as an escaped value? Other regex implementations solve this problem with a lookbehind, but since JavaScript does not support lookbehind, I wonder how I could accomplish this neatly using a regular expression.

A CSV line might look like this:

"This is\, a value",Hello,4,'This is also\, possible',true

This must be split into (strings containing)

[0] => "This is\, a value"
[1] => Hello
[2] => 4
[3] => 'This is also\, possible'
[4] => true
2
  • 5
    Possible duplicate Commented Oct 6, 2013 at 10:56
  • Yes and no. I am explicitly looking for a clean regex that can solve my problem. Commented Oct 10, 2013 at 8:29

4 Answers

1

Instead of trying to split, you can use a global match for everything that is not a separating comma, with this pattern:

/"[^"]+"|'[^']+'|[^,]+/g
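As a rough illustration of how this pattern could be applied, here is a minimal sketch that runs it against the sample line from the question. The variable names (`sampleLine`, `fieldPattern`, `fields`) are only illustrative and not part of the answer.

```javascript
// Sample line from the question. String.raw keeps the backslashes literally.
const sampleLine = String.raw`"This is\, a value",Hello,4,'This is also\, possible',true`;

// Quoted chunks are tried first, then any run of non-comma characters.
const fieldPattern = /"[^"]+"|'[^']+'|[^,]+/g;

// With the g flag, match() returns every match, or null when there is none.
const fields = sampleLine.match(fieldPattern) ?? [];

console.log(fields.length); // 5
console.log(fields);
```

Note that this only protects commas inside quotes: an escaped comma in an unquoted value (such as `a\,b,c`) would still be treated as a separator, and empty fields are skipped because `[^,]+` needs at least one character.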

1 Comment

Thank you Casimir. This will not work when you want to split the CSV line, but you are absolutely right, it does give the expected output. I could group this expression to get the result, but the end result will also include the comma. Personally I find this solution a little better than (.*?[^\\])(,|$) since the result excludes the comma.
0

For example, you can use this regex:

(.*?[^\\])(,|$)

The regex lazily takes everything (.*?) up to the first comma that does not have a \ in front of it, or up to the end of the line.
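One possible way to use this regex, sketched here under the assumption that the values are read from the first capture group so the trailing comma is left out. The names `csvLine`, `valueThenSeparator` and `values` are made up for the example.

```javascript
// Sample line from the question. String.raw keeps the backslashes literally.
const csvLine = String.raw`"This is\, a value",Hello,4,'This is also\, possible',true`;

// Group 1 is the value, group 2 is the separator (a comma or the end of the line).
const valueThenSeparator = /(.*?[^\\])(,|$)/g;

// matchAll() needs the g flag; keep only group 1 of each match.
const values = Array.from(csvLine.matchAll(valueThenSeparator), (m) => m[1]);

console.log(values.length); // 5
console.log(values);
```

Because `[^\\]` must consume one character, this sketch does not handle empty fields (for example `a,,b`) correctly.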

3 Comments

Just a suggestion: if you add an explanation of what this regex does, it will be easier for the OP to understand and use.
Will do next time too. Thanks @Harry
Yes, this works, but the result does include the comma. If I split the CSV line with this regex I'm pretty close to the result.
0

Here's some code that converts CSV to JSON (assuming the first row holds the property names). You can take the first part (array2d) and do other things with it very easily.

// split rows by \r\n.  Not sure if all csv has this, but mine did
const rows = rawCsvFile.split("\r\n");

// find all commas, or chunks of text in quotes.  If not in quotes, consider it a split point
const splitPointsRegex = /"(""|[^"])+?"|,/g;
const array2d = rows.map((row) => {
    let lastPoint = 0;
    const cols: string[] = [];
    let match: RegExpExecArray | null;
    while ((match = splitPointsRegex.exec(row)) !== null) {
        if (match[0] === ",") {
            cols.push(row.substring(lastPoint, match.index));
            lastPoint = match.index + 1;
        }
    }
    cols.push(row.slice(lastPoint));

    // remove leading commas, wrapping quotes, and unneeded \r
    return cols.map((datum) => 
        datum.replace(/^,?"?|"$/g, "")
        .replace(/""/g, `\"`)
        .replace(/\r/g, "")
    );
})

// assuming the first row holds the property names, create an array of objects keyed by those names
const out: any[] = [];
const propsRow = array2d[0];
array2d.forEach((row, i) => {
    if (i === 0) { return; }
    const addMe: any = {};
    row.forEach((datum, j) => {
        let parsedData: any;
        // Number("") is 0, so skip empty strings before the numeric check
        if (datum.trim() !== "" && !isNaN(Number(datum))) {
            parsedData = Number(datum);
        } else if (datum === "TRUE") {
            parsedData = true;
        } else if (datum === "FALSE") {
            parsedData = false;
        } else {
            parsedData = datum;
        }
        addMe[propsRow[j]] = parsedData;
    });
    out.push(addMe);
});

console.log(out);
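The row-splitting part of the code above can also be sketched on its own in plain JavaScript. This is a simplified variant, not the answer's exact code: `splitRow` is a hypothetical helper name, and the quoted-chunk pattern uses `*` with a non-capturing group so that an empty quoted value (`""`) is also treated as a quoted chunk.

```javascript
// Split one CSV row at commas that are not inside double quotes.
function splitRow(row) {
  // Matches either a whole double-quoted chunk (with "" as an escaped quote) or a comma.
  const splitPoints = /"(?:""|[^"])*"|,/g;
  const cols = [];
  let lastPoint = 0;
  let match;
  while ((match = splitPoints.exec(row)) !== null) {
    // Only a bare comma is a split point; quoted chunks are skipped over.
    if (match[0] === ",") {
      cols.push(row.substring(lastPoint, match.index));
      lastPoint = match.index + 1;
    }
  }
  cols.push(row.slice(lastPoint));
  // Strip the wrapping quotes and turn doubled quotes back into single ones.
  return cols.map((col) => col.replace(/^"|"$/g, "").replace(/""/g, '"'));
}

console.log(splitRow('a,"b,c",d'));
console.log(splitRow('1,"say ""hi""",'));
```

Like the original, this assumes double-quoted fields with doubled quotes as escapes, which is different from the backslash-escaped format in the question.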

0

A negative lookbehind can do this, but at the time of writing it unfortunately doesn't work in Firefox, only in Chrome and Edge:

"abc\\,cde,efg".split(/(?<!\\),/) will result in ["abc\,cde", "efg"].

You will need to remove all (unescaped) escapes in a second step.
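The two steps could be combined roughly as follows, assuming a runtime that supports lookbehind (it was added to the language in ES2018). `splitEscaped` is a hypothetical helper name, and the unescape step simply drops the backslash in front of any character.

```javascript
// Split at commas that are not preceded by a backslash, then remove the escapes.
function splitEscaped(line) {
  return line
    .split(/(?<!\\),/) // step 1: split only at unescaped commas
    .map((field) => field.replace(/\\(.)/g, "$1")); // step 2: "\," becomes ","
}

console.log(splitEscaped("abc\\,cde,efg"));
```

One caveat of this sketch: a comma that follows an escaped backslash (`\\,` in the data) is still seen as escaped by the lookbehind, so such input would need a more careful pattern.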
