Regex for script tag in PHP

Question

I am using a regex to grab the content of every script tag on an HTML page. The regex and code I use look like this:

$content = file_get_contents($url, false, stream_context_create(
                    array("http" => array("user_agent" => "any"))
            ));

$pattern = "/<script[^>]*?>([\s\S]*?)<\/script>/";
preg_match_all($pattern, $content, $inside_script_array);

echo "<pre>";
print_r($inside_script_array);
echo "</pre>";

When I use URL 1:

$url = 'http://www.bestylish.com/' ;

it returns all the script tags. But when I use URL 2:

$url = 'http://www.bestylish.com/sale' ;

it does not return many of the tags that are identical to, and also present in, URL 1. What could be the reason?

possible duplicate of How to parse HTML with PHP?

– Piskvor left the building, Commented Jun 25, 2012 at 10:10

Álvaro González · Accepted Answer · 2012-06-25 10:01:11Z


The reason is that regular expressions are not a good tool to manipulate HTML. If you still have the option to switch to a DOM parser, fetching <script> tags can be as simple as:

$domd = new DOMDocument();

// Collect warnings about malformed markup silently, then restore the previous setting.
$previous = libxml_use_internal_errors(true);
$domd->loadHTML(file_get_contents('http://www.google.com'));
libxml_clear_errors();
libxml_use_internal_errors($previous);

$items = $domd->getElementsByTagName('script');
$data = array();

foreach ($items as $item) {
  $data[] = array(
    'src' => $item->getAttribute('src'),
    'outerHTML' => $domd->saveHTML($item),
    // textContent also works for empty tags; saveHTML(null) would dump the whole document.
    'innerHTML' => $item->textContent,
  );
}

print_r($data);
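
To try the same approach without a network request, here is a minimal sketch that parses an inline HTML string. The helper name extractScripts and the sample markup are made up for illustration and are not part of the original answer. The sample includes an upper-case <SCRIPT> tag, which the case-sensitive pattern in the question would not match, and an external script with no inline code.

```php
<?php
// Hypothetical helper (illustration only): returns the src attribute and the
// inline code of every <script> element found in an HTML string.
function extractScripts(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so collect parser warnings silently
    // and restore the previous error-handling setting afterwards.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $scripts = array();
    foreach ($doc->getElementsByTagName('script') as $script) {
        $scripts[] = array(
            'src'  => $script->getAttribute('src'), // empty string if absent
            'code' => $script->textContent,         // empty string for external scripts
        );
    }
    return $scripts;
}

// Sample markup (an assumption, not taken from the site in the question).
$html = '<html><head><SCRIPT type="text/javascript">var a = 1;</SCRIPT>'
      . '<script src="/js/app.js"></script></head>'
      . '<body><p>Hello</p><script>var b = 2;</script></body></html>';

print_r(extractScripts($html));
```

Because the parser normalizes tag names, the lookup does not depend on how the tag was written in the source, which is one class of mismatch a hand-written pattern has to handle explicitly.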
