I am attempting to scrape a website using the PHP Simple HTML DOM Parser.

It works fine, except when I try to get the data from this link.

I want to scrape the following text from the tag below, but I find it impossible: 167/3 (48.0 ov)

<div class="team-1-name">
    India
    <span class="innings-1-score ">457</span> &amp;
    <span class="innings-1-score innings-current">167/3 (48.0 ov)</span>
</div>

I've tried numerous combinations like the below without success:

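// Assumes simple_html_dom.php sits next to this script.
include 'simple_html_dom.php';

$file_string = file_get_contents("http://www.espncricinfo.com/england-v-india-2014/engine/match/667711.html");
// Parse the downloaded markup into a Simple HTML DOM object.
$html = str_get_html($file_string);

foreach ($html->find('div[class=team-1-name]') as $team) {
    echo $team;
    // Look for the current-innings span inside the team block.
    foreach ($team->find('span[class=innings-1-score innings-current]') as $inn) {
        echo $inn;
    }
}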

echo $team works and gives me "India" as expected but echo $inn returns nothing.

What am I doing wrong? I have been wracking my brain about this for days - any help is much appreciated.

Thanks in advance.

  • Most of the data on that page is created dynamically by JavaScript reading a JSON file. You won't be able to parse what you want by reading the page source code; you need to read the JSON file. Commented Jul 13, 2014 at 1:39
  • Thanks. So the $team is not JSON but $inn is? Do you know how I might parse the JSON? Commented Jul 13, 2014 at 1:41
  • JavaScript reads JSON data from the server, so you have to do the same. You have to analyze all connections between the browser and the server (for example, using Firebug in Firefox) and find the URLs used by JavaScript to get the JSON data. Then you can get the JSON data from these URLs. Commented Jul 13, 2014 at 1:59
  • It's going to be hard to rely on just simple-html-dom, since the values inside the div are fed dynamically (live) by Ajax. Check the network tab in the developer console (most likely) and inspect the incoming requests. Commented Jul 13, 2014 at 2:12

1 Answer

JavaScript reads JSON data from the server, so you have to do the same. You have to analyze all connections between the browser and the server (for example, using Firebug in Firefox) and find the URLs used by JavaScript to get the JSON data. Then you can get the JSON data from these URLs.

For example, try this URL. It returns HTML (a fragment of the page), and you can open it in a browser:

http://www.espncricinfo.com/england-v-india-2014/engine/match/667711.html?view=scorecard;wrappertype=none;xhr=1
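
Once a fragment like that has been downloaded, pulling the score out of it could look roughly like the sketch below. It rests on a few assumptions: the fragment is taken to contain the same markup as the snippet in the question (not verified here), the function name extractCurrentInnings is made up for illustration, and PHP's bundled DOMDocument and DOMXPath are used in place of Simple HTML DOM so the example runs without extra files. The sample string stands in for the fetched response.

```php
<?php
// Sketch only: the markup shape is assumed to match the snippet in the question.

/**
 * Return the text of team 1's current-innings span, or null if it is absent.
 * (Hypothetical helper, not part of any library.)
 */
function extractCurrentInnings(string $html): ?string
{
    if (trim($html) === '') {
        return null; // nothing to parse
    }

    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them,
    // since real-world HTML is rarely strictly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match individual class tokens rather than the whole attribute string,
    // so extra class names and stray spaces do not matter.
    $query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' team-1-name ')]"
           . "//span[contains(concat(' ', normalize-space(@class), ' '), ' innings-current ')]";
    $nodes = $xpath->query($query);

    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

// Sample markup copied from the question, standing in for the fetched fragment.
$sample = '<div class="team-1-name"> India
    <span class="innings-1-score ">457</span> &amp;
    <span class="innings-1-score innings-current">167/3 (48.0 ov)</span>
</div>';

echo extractCurrentInnings($sample), PHP_EOL;
```

To try it against the live site, pass the result of file_get_contents() for the URL above instead of $sample, and treat a null return as a sign that the fragment does not have the assumed structure.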
