I have a string from a DOM element, which contains something similar to the following:

<span class='greenhornet'>Can you catch the green?</span>

I need to know the position of the word green.

In this case, if I set up a pattern /green/, JS exec() will of course return the first occurrence of green (position 13), which is inside the class attribute.

Is there a way to tell a JS regexp to ignore the word green if it's between < and >, or is there an easier way to do this?

Oh, and I can't just strip the HTML either!

thanks.

  • Please don't parse HTML with Regex Commented Dec 19, 2012 at 19:45
  • Can you use document.getElementsByClassName for example? Commented Dec 19, 2012 at 19:47
  • Use DOM to retrieve all text nodes, concatenate the text node contents, then do your search. This would cover cases like matching "green hornet" even when split by HTML, e.g. <b>green</b> hornet. Commented Dec 19, 2012 at 20:59
  • Why can't you strip the HTML? Commented Dec 19, 2012 at 21:40

2 Answers

As the commenters (and user1883592) have suggested, stripping the HTML or parsing the text out of the HTML is the correct answer here. Using regular expressions with HTML is a loser's game; you've been warned.

But, that being said, if you really want to play that game, I'd start by ensuring there are no opening brackets in between your term and the last closing bracket; in other words:

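var html = "<span class='greenhornet'>Can you catch the green?</span>";
var greenRegex = />[^<]*?(green)/; // "green" after a ">" with no "<" in between
var match = greenRegex.exec(html);
// match.index is 25 (the ">" that starts the match), so offset to the captured word
var position = match.index + match[0].length - match[1].length;
// position = 44, not 13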
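If the goal is the offset of the word in the original string, another option is to blank out the tags without changing the string's length and then do a plain search. The following is only a sketch: `findOutsideTags` is a hypothetical helper name, and it assumes reasonably simple markup (no `>` inside attribute values, no comments or script blocks).

```javascript
// Hypothetical helper: find every occurrence of `word` that lies outside
// <...> tags, returning indexes into the ORIGINAL string.
function findOutsideTags(html, word) {
  if (!word) return []; // an empty search term would never advance
  // Overwrite each tag with spaces of the same length so offsets are preserved.
  var masked = html.replace(/<[^>]*>/g, function (tag) {
    return " ".repeat(tag.length);
  });
  var positions = [];
  var from = masked.indexOf(word);
  while (from !== -1) {
    positions.push(from);
    from = masked.indexOf(word, from + word.length);
  }
  return positions;
}

var html = "<span class='greenhornet'>Can you catch the green?</span>";
console.log(findOutsideTags(html, "green")); // [ 44 ]
```

Because the tags are replaced rather than removed, the HTML itself is never stripped, and the indexes returned still point into the untouched string.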

You can get innerHTML of the span element. No Regex needed.
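A rough sketch of that idea, assuming the element can be selected as `span.greenhornet` (the selector and the helper name `indexInText` are illustrative, not from the question). It reads `textContent` rather than `innerHTML` so that any nested markup is flattened to text. Note that the index is relative to the element's text, not to the original HTML string.

```javascript
// Hypothetical helper: index of `word` within an element's text, or -1.
function indexInText(el, word) {
  return el.textContent.indexOf(word);
}

// Browser-only usage; skipped where no DOM is available.
if (typeof document !== "undefined") {
  var span = document.querySelector("span.greenhornet");
  if (span) {
    console.log(indexInText(span, "green"));
  }
}
```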
