How to split a string multiple times in JavaScript

Question

I have a text file in which each line is a string.

The most extreme line could look like this:

A01B01C01D100E500F100.00G100.00H100.00

A little information about the possibilities:

Each string will include at least one of those letters and a number.
The numbers following the letters can have any number of digits and decimal places.
The letters are not always in order.

Another example of the data:

A01B400C62.578D77.297
C62.409D77.222
C62.259D77.113
C62.135D76.975
C62.042D76.815
C61.985D76.638
C61.973D76.529
A03B10000
A0C62.760 D77.336
A0E3.000
A01F400E0
A01B400E-0.100

What I would like to do is split the string at each letter, taking all of the numbers up to the next letter, with results like so:

A01, B01, C01, D100, E500, F100.00, G100.00, H100.00

I have tried a bunch of things, and the closest I have gotten is this:

dicedLine = myLine.split(/[ABCDEFGH]/)

This gets me close to what I want, but I have found that if a string does not include every one of those letters, the results are not what I am after.

For example a line like this:

A30

Will give me results like this:

["", "30"]

Where I would really want results like this:

["A30", "", "", "", "", "", "", ""]

Any ideas are appreciated!

Should "D62.409C77.222" result in ["", "", "C77.222", "D62.409", "", "", "", ""]? Can you be more specific?
– le_m, Commented May 25, 2017 at 1:41
Once the string is split, the order doesn't matter so much. I'm planning to do results.indexOf('X') to manipulate the data as needed.
– Edward, Commented May 25, 2017 at 2:18
But you still want results like ["A30", "", "", "", "", "", "", ""] instead of ["A30"] or even Set ["A30"]?
– le_m, Commented May 25, 2017 at 2:33
Exactly. I want the full 'set' for each string given, even if the set is only one actual value and 7 blanks. This is basically just cleaning up the data so I can pass it to another function and parse out what I need.
– Edward, Commented May 25, 2017 at 2:35
I don't get it. You want to perform indexOf on the results, so I assume the index is important to you. But then you say the order doesn't matter, so I assume you would be equally happy with a Set. But then you say you also want the blanks, which doesn't make sense if you are not interested in the indices...
– le_m, Commented May 25, 2017 at 2:40

Danziger · Accepted Answer · 2017-05-25 00:52:33Z

3

You can use positive lookahead to assert that a particular character exists, without actually consuming it:

const codes = [
  "A01B400C62.578D77.297",
  "C62.409D77.222",
  "C62.259D77.113",
  "C62.135D76.975",
  "C62.042D76.815",
  "C61.985D76.638",
  "C61.973D76.529",
  "A03B10000",
  "A0C62.760 D77.336",
  "A0E3.000",
  "A01F400E0",
  "A01B400E-0.100",
];

console.log(codes.map(code => code.split(/ *(?=[A-Z])/)));

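Note that I also added " *" (a space followed by *) before the lookahead, so any spaces in front of a letter are consumed by the split and removed from the results.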
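As a minimal sketch of how the lookahead split behaves on single lines, including the "A30" case from the question, the pattern can be wrapped in a small function. The name splitCodes is hypothetical and only used for illustration; the regular expression is the one shown above.

```javascript
// Hypothetical helper name; the regex is the one from the answer above.
function splitCodes(line) {
  // " *" swallows optional spaces; (?=[A-Z]) asserts that a capital letter
  // follows, without consuming it, so the letter stays with its number.
  return line.split(/ *(?=[A-Z])/);
}

console.log(splitCodes("A30"));               // ["A30"]
console.log(splitCodes("A0C62.760 D77.336")); // ["A0", "C62.760", "D77.336"]
console.log(splitCodes("A01B400E-0.100"));    // ["A01", "B400", "E-0.100"]
```

The result only contains the letters that are present, so "A30" yields a one-element array rather than eight slots. A line with leading spaces would produce an empty first element, so trimming each line first may be worthwhile.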

edited May 25, 2017 at 0:52

answered May 25, 2017 at 0:46

Edward Over a year ago

That was the regex I was looking for. Worked perfectly.

RobG · Accepted Answer · 2017-05-25 01:28:20Z

0

You can do this fairly easily with match rather than split. It's a simpler regular expression, so it is likely to be more widely compatible. Just beware that if no matches are found, match returns null rather than an empty array.

var data = [
 'A01B400C62.578D77.297',
 'C62.409D77.222',
 'C62.259D77.113',
 'C62.135D76.975',
 'C62.042D76.815',
 'C61.985D76.638',
 'C61.973D76.529',
 'A03B10000',
 'A0C62.760 D77.336',
 'A0E3.000',
 'A01F400E0',
 'A01B400E-0.100'
];

var result = data.map(s => s.match(/[a-z][^a-z]+/gi));

console.log(result);
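One way to guard against the null case mentioned above is to fall back to an empty array. The sketch below assumes that is the desired behaviour; the name matchCodes is hypothetical. It also trims each token, because [^a-z]+ matches spaces too, so "C62.760 " would otherwise keep its trailing space.

```javascript
// Hypothetical helper: same pattern as above, with a fallback for no matches.
function matchCodes(line) {
  // match() with the g flag returns null when nothing matches,
  // so substitute an empty array before mapping.
  const tokens = line.match(/[a-z][^a-z]+/gi) || [];
  // [^a-z]+ also matches spaces, so strip them from each token.
  return tokens.map(token => token.trim());
}

console.log(matchCodes("A0C62.760 D77.336")); // ["A0", "C62.760", "D77.336"]
console.log(matchCodes(""));                  // []
```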

answered May 25, 2017 at 1:28

le_m · Accepted Answer · 2017-05-25 01:40:24Z

Where I would really want results like this:

["A30", "", "", "", "", "", "", ""]

If the letters are ordered, this is easily accomplished by capturing each letter + digits in its own group as follows:

let re = /(A[^A-Z]+)?(B[^A-Z]+)?(C[^A-Z]+)?(D[^A-Z]+)?(E[^A-Z]+)?(F[^A-Z]+)?(G[^A-Z]+)?(H[^A-Z]+)?/;

console.log(re.exec("A01B01C01D100E500F100.00G100.00H100.00").slice(1));
console.log(re.exec("C30").slice(1));

Non-matching groups produce an undefined array entry. Those can easily be mapped to empty strings if desired.
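Mapping the undefined entries to empty strings could look like the following sketch. The helper name toSlots is hypothetical, and the regular expression is the ordered one from above.

```javascript
// Same pattern as above: one optional capture group per letter, in A..H order.
const orderedRe = /(A[^A-Z]+)?(B[^A-Z]+)?(C[^A-Z]+)?(D[^A-Z]+)?(E[^A-Z]+)?(F[^A-Z]+)?(G[^A-Z]+)?(H[^A-Z]+)?/;

// Hypothetical helper: always returns 8 entries, "" where a letter is absent.
function toSlots(line) {
  // exec() cannot return null here because every group is optional;
  // slice(1) drops the full match and keeps the 8 groups.
  return orderedRe.exec(line).slice(1).map(group => group || "");
}

console.log(toSlots("A30")); // ["A30", "", "", "", "", "", "", ""]
console.log(toSlots("C30")); // ["", "", "C30", "", "", "", "", ""]
```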

The letters are not always in order

You would need to replace the individual letters in the given regexp with their generic group [A-Z].

However, your question is ambiguous about unordered letters. Should "C30B30" result in ["", "B30", "C30", "", "", "", "", ""]? Or ["B30", "C30", "", "", "", "", "", ""]? We can't answer that part of the question without a more precise specification.
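Purely as an illustration, and assuming the first interpretation (each letter A to H owns a fixed slot regardless of where it appears in the line), a sketch could look like this. The names LETTERS and toLetterSlots are hypothetical, and the approach swaps the single ordered regexp for match plus an index lookup.

```javascript
// Assumption: slot 0 belongs to A, slot 1 to B, ... slot 7 to H.
const LETTERS = "ABCDEFGH";

// Hypothetical helper: places each token in the slot of its leading letter.
function toLetterSlots(line) {
  const slots = Array(LETTERS.length).fill("");
  // Each token is one letter A-H followed by everything up to the next capital.
  const tokens = line.match(/[A-H][^A-Z]+/g) || [];
  for (const token of tokens) {
    // If a letter occurs twice, the later occurrence overwrites the earlier one.
    slots[LETTERS.indexOf(token[0])] = token.trim();
  }
  return slots;
}

console.log(toLetterSlots("C30B30"));
// ["", "B30", "C30", "", "", "", "", ""]
console.log(toLetterSlots("D62.409C77.222"));
// ["", "", "C77.222", "D62.409", "", "", "", ""]
```

With fixed slots, the value for a given letter can be read by position (for example slots[2] for C) rather than searched for.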
