
Hi everyone,

I need a hand with the following regex. The string looks something like this:

mstr = 'value=\"20\" />\r\n\t\r\n<\/div>","whatiwant":"<div id=\"whatiwant\">\r\n\t\r\n\t\t<\/div>","idontwanthat":"<div id=\"idontwanthat\">\r\n\t\r\n\t blablalblalblalbla \t\r\n\t\t\t<\/div>"'

I would like to extract the entire div stored under "whatiwant". I tried the following:

matches=re.findall(r'\"whatiwant\":\"(.+?)\":\"',mstr)

PS: the div I want can contain other divs.

Any help would be appreciated.

2
  • An HTML parser would be more suitable for this. Is this really your string or a part of a web page? Commented Sep 19, 2014 at 9:43
  • Hi Jerry, I know, but the string as a whole is not suitable for an HTML parser. I will use one on the div that I want. Commented Sep 19, 2014 at 9:45

2 Answers

1
"whatiwant":"(.*?[^\\])??"

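This matches the literal "whatiwant": followed by a double-quoted value (possibly empty). The value ends at the first double quote that is not preceded by a backslash, so escaped quotes such as \" inside the div do not end the match.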

If you want to extract the div's html code, you can retrieve the first group's value:

import re

# With a single capture group, re.findall returns a list of strings, not match objects.
matches = re.findall(r'"whatiwant":"(.*?[^\\])??"', mstr)
for html in matches:
    print(html)
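
As a quick check, the pattern above can be run against the sample from the question. This is only a sketch: it assumes the input is raw text in which the backslash escapes (`\"`, `\r`, `\/`) are literal characters, which is why the sample is written as a raw string here. The names `pattern` and `html` are illustrative.

```python
import re

# Sample from the question, as a raw string so every backslash stays literal
# (assumption: the real input is unparsed text, not an already-decoded string).
mstr = r'value=\"20\" />\r\n\t\r\n<\/div>","whatiwant":"<div id=\"whatiwant\">\r\n\t\r\n\t\t<\/div>","idontwanthat":"<div id=\"idontwanthat\">\r\n\t\r\n\t blablalblalblalbla \t\r\n\t\t\t<\/div>"'

# Group 1 stops at the first double quote that is not preceded by a backslash.
pattern = re.compile(r'"whatiwant":"(.*?[^\\])??"')

m = pattern.search(mstr)
html = m.group(1) if m else None
print(html)
```

If the string contains real newline characters rather than the two-character sequences `\r` and `\n`, passing `re.DOTALL` to `re.compile` would probably be needed, since `.` does not match a newline by default.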

1

Try using a positive lookahead:

\"whatiwant\":.*(?=,\".*?\"\:)

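
A rough sketch of how the lookahead pattern behaves on the question's sample, again assuming the input is raw text with literal backslashes. Note that the full match includes the key and the surrounding quotes, and that the greedy `.*` runs up to the last `,"key":` that follows, so with more keys after "whatiwant" it would capture more than one value. The second pattern is a hypothetical variant, not part of the answer: it uses a lazy capture group to return only the div.

```python
import re

# Sample from the question, as a raw string so every backslash stays literal.
mstr = r'value=\"20\" />\r\n\t\r\n<\/div>","whatiwant":"<div id=\"whatiwant\">\r\n\t\r\n\t\t<\/div>","idontwanthat":"<div id=\"idontwanthat\">\r\n\t\r\n\t blablalblalblalbla \t\r\n\t\t\t<\/div>"'

# Pattern from the answer: greedy .* backtracks to the last ,"key": in the string.
greedy = re.search(r'\"whatiwant\":.*(?=,\".*?\"\:)', mstr)
full_match = greedy.group(0) if greedy else None

# Hypothetical variant: lazy capture group, lookahead anchored on the next key.
lazy = re.search(r'"whatiwant":"(.*?)"(?=,".*?":)', mstr)
div_html = lazy.group(1) if lazy else None

print(full_match)
print(div_html)
```

Both patterns rely on another key following "whatiwant"; if it were the last entry in the string, the lookahead would fail and neither would match.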
