I'm parsing a page with the 'simple_html_dom' library, but I have not managed to parse HTML whose content is loaded through AJAX. Is there any way around this?

PHP Code:

<?php
require_once '../library/Simple_HTML_DOM/simple_html_dom.php';

// Create DOM from URL or file
$html = file_get_html('http://www.playnow3dgames.com/genre.php?id=sports');

// Find all images 
foreach ($html->find('img') as $element) {
    echo $element->src . '<br>';
}
?>

It only prints the images at the edges and top of the page (the native HTML); the images in the center (loaded via AJAX) are not parsed.

  • Is this the page playnow3dgames.com/listing.php?genre=sports&order=date? Commented Feb 11, 2015 at 13:36
  • You mean you don’t get content that is only added to the page via JavaScript? Well of course not, because file_get_contents doesn’t “execute” JavaScript. You need something that emulates a browser for that (research keyword: headless browser) Commented Feb 11, 2015 at 13:41
  • Thank you. I'm looking forward to trying PhantomJS; I hope it works in my case. Commented Jan 2, 2017 at 4:23

1 Answer

Try this:

<?php
require_once '../library/Simple_HTML_DOM/simple_html_dom.php';

// Create DOM from URL or file
$html = file_get_html('http://www.playnow3dgames.com/listing.php?genre=sports&order=date');

// Find all images 
foreach($html->find('img') as $element){
    echo $element->src . '<br>';
}
?>

=== UPDATE ===

Actually, this is an iframe, not AJAX. The center of http://www.playnow3dgames.com/genre.php?id=sports is a frame that loads: http://www.playnow3dgames.com/listing.php?genre=sports&order=date
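If the center content really is an iframe, its address can be read from the outer page instead of guessed. The following is only a minimal sketch using PHP's built-in DOM extension rather than simple_html_dom; the function name `iframe_sources` and the sample markup are hypothetical, and the real page's markup may differ.

```php
<?php
// Sketch: collect the src attribute of every <iframe> in an HTML string.
// iframe_sources() is a hypothetical helper, not part of simple_html_dom.
function iframe_sources(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally; real-world markup is rarely well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($doc->getElementsByTagName('iframe') as $frame) {
        $src = $frame->getAttribute('src');
        if ($src !== '') {
            $sources[] = $src;
        }
    }
    return $sources;
}

// Made-up markup standing in for the genre page (an assumption, not the real source).
$sample = '<html><body><img src="top.png">'
        . '<iframe src="listing.php?genre=sports&amp;order=date"></iframe>'
        . '</body></html>';

print_r(iframe_sources($sample));
```

Each address found this way can then be passed to file_get_html, after resolving it against the page's URL if it is relative.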

Look at the structure of the frame URL:

http://www.playnow3dgames.com/listing.php?genre=sports&order=date

Note the parameter genre=sports.

The URL of the page itself is: http://www.playnow3dgames.com/genre.php?id=sports

You can see that id=sports matches genre=sports.

To get the frame for any genre page, you only need to change genre=genre_name. For example, for:

http://www.playnow3dgames.com/genre.php?id=strategy

the main frame will be:

http://www.playnow3dgames.com/listing.php?genre=strategy&order=date

If you want page 1, 2, 3, ..., you need to add page=page_number. For example, for page 2 of

http://www.playnow3dgames.com/genre.php?id=strategy

the URL will be:

http://www.playnow3dgames.com/listing.php?genre=strategy&page=2&order=date
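The URL pattern described above can be sketched as a small helper. This is only an illustration: `listing_url` is a hypothetical name, and it assumes the site accepts exactly the genre, page, and order parameters shown in the examples.

```php
<?php
// Hypothetical helper: builds the listing.php frame URL for a genre and an
// optional page number, following the pattern in the examples above.
function listing_url(string $genre, ?int $page = null, string $order = 'date'): string
{
    $params = ['genre' => $genre];
    if ($page !== null) {
        $params['page'] = $page;
    }
    $params['order'] = $order;

    // Pass '&' explicitly so the separator does not depend on php.ini settings.
    return 'http://www.playnow3dgames.com/listing.php?' . http_build_query($params, '', '&');
}

echo listing_url('strategy'), PHP_EOL;
echo listing_url('strategy', 2), PHP_EOL;
```

The returned string can be passed straight to file_get_html in place of the hard-coded URL.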

2 Comments

Thanks for the answer, @MrJerry. First, where does this listing.php come from? Second, it works, but will it work for every site that uses AJAX?
See my update for how to get every page.
