After reading all the related threads, I cannot find a regex that is capable of extracting a full JSON object from within HTML content, so I'm hoping someone can help me find the right one.

For example, the JSON I'm looking to extract looks like this:

"taxonomy": {"page":"/products/1/","price":"350.00","country_code":"gb","brand":"apple"},

I'm trying to extract the entire "taxonomy" object, which sits inside a JavaScript function within the HTML.

I have tried preg_match('/\taxonomy\s*=(.+)(?:;|/', $file, $m); but had no luck, and regex is something I'm still learning.

I'm aiming to have the regex parse the HTML and pull out the taxonomy object so that I'm left with the following, which I can then pass to json_decode: {"page":"/products/1/","price":"350.00","country_code":"gb","brand":"apple"}

I would greatly appreciate it if someone could help me get to the correct regex. Thanks.

  • Why the downvote? We all need help at some point, and we all have to start somewhere. I guess asking for a little help from people with more experience is a bad thing? Commented Aug 25, 2017 at 10:17
  • Don't worry about the downvote; someone's hands were just faster than their brain. Check out my answer. Commented Aug 25, 2017 at 10:22

1 Answer

This regex pattern should work, but it depends on what your full HTML looks like.

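<?php
// Sample input; in practice $file holds the full HTML source.
$file = '"taxonomy": {"page":"/products/1/","price":"350.00","country_code":"gb","brand":"apple"},
';
// Lazily capture everything after "taxonomy": up to the first "},".
// The s flag lets . match newlines; the closing brace is left out of group 1.
preg_match('@"taxonomy":(.*?)\},@s', $file, $m);

if(!empty($m[1])){
    // Restore the closing brace and wrap in [] so the result is a one-element JSON array.
    $jsonString = "[".$m[1] . "}]";
    $array = json_decode($jsonString, true);
    print_r($array);
}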

https://regex101.com/r/fytDO8/1/
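
If you would rather have the capture group hold the complete object, braces included and without the leading whitespace, a variation along these lines may be worth trying. This is only a sketch: the $html sample string is made up for illustration, and the pattern assumes the taxonomy object is flat, meaning no nested objects and no brace characters inside the string values. If your real data nests objects, a regex like this will cut the match short.

```php
<?php
// Hypothetical sample input: the object embedded in a script block.
$html = '<script>var data = {"taxonomy": {"page":"/products/1/","price":"350.00","country_code":"gb","brand":"apple"}, "other": 1};</script>';

// Match the key, optional whitespace around the colon, then capture
// from the opening brace to the first closing brace.
// Assumption: the object is flat, so [^{}]* never has to cross a nested brace.
$taxonomy = null;
if (preg_match('/"taxonomy"\s*:\s*(\{[^{}]*\})/', $html, $m)) {
    // $m[1] is already a complete JSON object, so it can be decoded directly.
    $taxonomy = json_decode($m[1], true);
}

if (is_array($taxonomy)) {
    print_r($taxonomy);
}
```

Because the group already contains valid JSON, there is no need to append the closing brace or wrap the string in square brackets before calling json_decode.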

3 Comments

Your regex ignores the closing bracket and captures all the whitespace before the opening bracket.
I was able to get this working from the fiddle. The answer above threw an "Unknown modifier 'g'" error, but using the fiddle you provided I got it working with the following: /"taxonomy":(.*?)\},/
So I am very grateful for your time and effort; it is greatly appreciated. Thanks a lot :)
