I need to create a JavaScript object representation of a string that includes style information. The style identifiers are unimportant, but for the sake of this question let's use the identifiers that Stack Overflow uses:
*text* = italic
**text** = bold
***text*** = bold italic
The data representation I would like to create is an array of objects, in the order they appear in the string, with each object structured as follows:
{
stringpart : (string),
style : (normal | bold | italic | bold italic)
}
Therefore, the following string:
This is some example text, with some **bold** and *italic* ***styles***.
should be converted into the following array of objects:
[
{
stringpart : "This is some example text, with some ",
style : "normal"
},
{
stringpart : "bold",
style : "bold"
},
{
stringpart : " and ",
style : "normal"
},
{
stringpart : "italic",
style : "italic"
},
{
stringpart : " ",
style : "normal"
},
{
stringpart : "styles",
style : "bold italic"
},
{
stringpart : ".",
style : "normal"
}
]
So far I have begun looking at HTML parsers and have come across the following code:
var
content = 'This is some <b>really important <i>text</i></b> with <i>some <b>very very <br>very important</b> things</i> in it.',
tagPattern = /<\/?(i|b)\b[^>]*>/ig,
stack = [],
tags = [],
offset = 0,
match,
tag;
while (match = tagPattern.exec(content)) {
if (match[0].charAt(1) !== '/') {
stack.push(match.index - offset);
} else {
tags.push({
tag: match[1],
from: stack.pop(),
to: match.index - offset
});
}
offset += match[0].length;
}
content = content.replace(tagPattern, '');
// now use tags array and perform needed actions.
// see stuff
console.log(tags);
console.log(content);
//example of correct result
console.log(content.substring(tags[3].from, tags[3].to));
While the regex in this code could be adapted to detect the style identifiers mentioned above, it would not output the data in the required format since it simply returns from/to indexes.
How could I efficiently convert a string that uses the above identifiers into the required array of objects?
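As a starting point, a single regex with a backreference might be enough, assuming the markers are always balanced and never nested. The sketch below has not been checked against edge cases such as unmatched or escaped asterisks, and `parseStyles` is a hypothetical name, not part of any library:

```javascript
// Hypothetical helper: splits a string into { stringpart, style } objects.
// Assumes markers are balanced and never nested inside one another.
function parseStyles(input) {
  // \*{1,3} is greedy, so *** is tried before ** and *;
  // \1 requires the closing marker to match the opening one.
  var pattern = /(\*{1,3})(.+?)\1/g;
  var styles = { 1: 'italic', 2: 'bold', 3: 'bold italic' };
  var parts = [];
  var last = 0; // index just past the previous styled run
  var match;

  while ((match = pattern.exec(input)) !== null) {
    // Unstyled text between the previous match and this one.
    if (match.index > last) {
      parts.push({ stringpart: input.slice(last, match.index), style: 'normal' });
    }
    // The styled run itself, with the markers stripped.
    parts.push({ stringpart: match[2], style: styles[match[1].length] });
    last = pattern.lastIndex;
  }

  // Trailing unstyled text, if any.
  if (last < input.length) {
    parts.push({ stringpart: input.slice(last), style: 'normal' });
  }
  return parts;
}

console.log(parseStyles('This is some example text, with some **bold** and *italic* ***styles***.'));
```

For the example string above, this should produce the seven objects listed earlier, but I am unsure how well it holds up on less tidy input or whether it is the most efficient approach.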


