I have the following string:

Name=(Last, First), Age=(31 year, 6 months, 3 day), Height= 6.1 ft, Employment=None, Email Address =/NA/, Mobile=XXXX

and I would like to split it into the following key/value pairs to build a dictionary:

Name: "(Last, First)"
Age: "(31 year, 6 months, 3 day)"
Height: " 6.1 ft"
...

I came across this post and tried to tweak it, but I can't get it to work for values both with and without "()". Here is the code (also available from this link). I would appreciate your help; feel free to suggest an easier or alternative way.

txt="Name=(Last, First), Age=(31 year, 6 months, 3 day), Height= 6.1 ft, Employment=None, Email Address =/NA/, Mobile=XXXX"

//var r = /.+?=.+?(?=(\([^)]+\),)|(.=)|$)/g;
//var r = /.+?=.+?(?=(=(.*),)|$)/g;

var r = /.+?=.+?(?=(\),)|(\d\b,)|$)/g;

var obj = txt.match(r).reduce(function(obj, match) {
    var keyValue = match.split("=");
    obj[keyValue[0].replace(/,\s/g, '')] = keyValue[1];
    return obj;
}, {});
console.log(obj);

2 Answers

Here is one way, using a negative lookahead to split only on the commas that are not inside parentheses:

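var txt = "Name=(Last, First), Age=(31 year, 6 months, 3 day), Height= 6.1 ft, Employment=None, Email Address =/NA/, Mobile=XXXX";
var map = {};
// Split only on commas that are not inside a (...) group.
var parts = txt.split(/,\s*(?![^(]*\))/);
parts.forEach(function (item) {
    // Each part has the form "key=value"; split it once and reuse the result.
    var keyValue = item.split("=");
    map[keyValue[0]] = keyValue[1];
});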
console.log(map);

The regex used for splitting requires an explanation:

,\s*         match comma with optional whitespace
(?![^(]*\))  assert that we cannot lookahead and see a closing ) without
             first seeing an opening (

The lookahead condition prevents matching commas which happen to be inside (...) terms. Then, once we have the array of terms, we iterate and build the dictionary you want, splitting each term on = to find the key and value.
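
As a minimal sketch of the same idea wrapped in a reusable function: the helper name parsePairs is hypothetical, and trimming the keys and values is an assumption (the question keeps the leading space in " 6.1 ft", so drop the trim calls if that matters). It also splits each part on the first = only, so a value that itself contains = is kept whole.

```javascript
// Hypothetical helper: split on top-level commas, then on the first "=" only.
function parsePairs(text) {
  const result = {};
  // A comma that can see ")" ahead without passing "(" is inside parentheses; skip it.
  for (const part of text.split(/,\s*(?![^(]*\))/)) {
    const eq = part.indexOf("=");
    if (eq === -1) continue; // ignore fragments that have no "="
    const key = part.slice(0, eq).trim();    // assumption: surrounding spaces are unwanted
    const value = part.slice(eq + 1).trim(); // assumption: same for values
    result[key] = value;
  }
  return result;
}

const parsed = parsePairs(
  "Name=(Last, First), Age=(31 year, 6 months, 3 day), Height= 6.1 ft, Email Address =/NA/"
);
console.log(parsed);
```

Like the regex it is built on, this sketch assumes parentheses are balanced and never nested.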

2 Comments

This is great and easier to follow. Thanks for the help.
I accepted yours. I got two great answers with great explanations. Yours is easier to understand and gives more flexibility once split at the correct commas, e.g. further split inside parentheses.

To match the properties and values, you can use:

(\w+)\s*=\s*(\([^)]+\)|[^,]+)
  • (\w+) - Match and capture the property (one or more word characters)
  • \s*=\s* - Followed by optional spaces, a =, and more optional spaces
  • (\([^)]+\)|[^,]+) - Match and capture either:
    • \([^)]+\) - A (, followed by non-) characters, followed by ), OR
    • [^,]+ - Non-comma characters
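
To see what each capturing group picks up, it may help to run the pattern once on a single pair. This is only an illustrative sketch: the sample string and the variable names are made up for the example, and the g flag is left off so that exec always starts at the beginning of the string.

```javascript
// Same pattern as above, without the g flag, applied to one illustrative input.
const pairPattern = /(\w+)\s*=\s*(\([^)]+\)|[^,]+)/;
const sample = "Age=(31 year, 6 months, 3 day), Height= 6.1 ft";

const result = pairPattern.exec(sample);
const fullMatch = result[0]; // the entire matched pair
const key = result[1];       // first capturing group: the property name
const value = result[2];     // second capturing group: the value

console.log(fullMatch); // Age=(31 year, 6 months, 3 day)
console.log(key);       // Age
console.log(value);     // (31 year, 6 months, 3 day)
```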

const str = 'Name=(Last, First), Age=(31 year, 6 months, 3 day), Height= 6.1 ft, Employment=None, Email Address =/NA/, Mobile=XXXX';

const obj = {};
for (const [, prop, val] of str.matchAll(/(\w+)\s*=\s*(\([^)]+\)|[^,]+)/g)) {
  obj[prop] = val;
}
console.log(obj);

If the input keys may contain spaces as well, match anything but a =:

const str = 'Employee Name=(Last, First), Person Age=(31 year, 6 months, 3 day), Height= 6.1 ft, Employment=None, Email Address =/NA/, Mobile=XXXX';

const obj = {};
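// The lazy [^=]*? keeps trailing spaces out of the key (e.g. "Email Address ") and allows one-character keys.
for (const [, prop, val] of str.matchAll(/(\w[^=]*?)\s*=\s*(\([^)]+\)|[^,]+)/g)) {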
  obj[prop] = val;
}
console.log(obj);

If you can't use matchAll, then iterate over the matches manually with exec:

const str = 'Name=(Last, First), Age=(31 year, 6 months, 3 day), Height= 6.1 ft, Employment=None, Email Address =/NA/, Mobile=XXXX';

const obj = {};
const pattern = /(\w+)\s*=\s*(\([^)]+\)|[^,]+)/g;
let match;
while ((match = pattern.exec(str)) !== null) {
  const [, prop, val] = match;
  obj[prop] = val;
}
console.log(obj);

4 Comments

Very nice. Thanks for taking the time to explain the regex syntax. One small note: if the key has multiple words, like "Employee Name", it will only pick up "Name". I changed it to /\w+\s*\w+\s*=\s*(\([^)]+\)|[^,]+)/ and it worked (same with (\w+\s*\w+)). It was easier after you explained it. However, I'm not sure what the parentheses in (\w+) are for?
If the "key" can contain spaces, I guess you could match anything but an = instead of \w+. The parentheses are capturing groups: they let you extract the sub-matches in JavaScript. The matched key portion inside the first pair of parentheses goes into the prop variable because it's the first capture group.
Sorry, I'm not sure how to match anything but an = instead of \w+. It seems the change I made is unnecessary, but I'm not sure how to capture everything before the =.
To match anything but a =, use a negated character class, [^=]; see the snippet in the answer.
