I am trying to get the following text "Hug­gies Pure Baby Wipes 4 x 64 per pack" shown in the code below.

<div class="offerList-item-description-title">
    <div id="result-title-5" class="offerList-item-description-title">
        <script type="text/javascript">
            document.write(getContents('wF8UD9Jj8:6D !FC6 q23J (:A6D c I ec A6C A24\<'));
        </script>Hug­gies Pure Baby Wipes 4 x 64 per pack
    </div>
</div>

I have tried using code such as:

foreach($element -> find('.offerList-item-description-title') as $title)
{
    foreach($element -> find('text') as $text){
        echo $text;
    }
}

But this just returns an empty string. Any suggestions?

Thanks.

  • I'm not familiar with this package, but I'd say that $element->find('text') is your problem: there is no text tag. Instead of the second foreach, I would think you'd want something like $title->innertext. Commented Mar 14, 2018 at 13:26

2 Answers

1

As you may be aware, the HTML returned by your scraper does not contain JavaScript-rendered content. In your case the text is generated by JavaScript, which is why you are getting an empty response. What you need is a headless browser such as PhantomJS; you can use the PHP wrapper for PhantomJS: http://jonnnnyw.github.io/php-phantomjs/.

This should solve your problem. It has the following features:

  • Load webpages through the PhantomJS headless browser
  • View detailed response data including page content, headers, status code etc.
  • Handle redirects
  • View javascript console errors
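
For this particular snippet there may be a lighter alternative. The argument passed to getContents looks like ROT47-encoded text: rotating each printable ASCII character by 47 positions turns it into the product title, with an &shy; entity inside "Huggies". The sketch below assumes that getContents, whose source is not shown in the question, does nothing more than this. The name rot47 is chosen here for illustration. It is written in JavaScript to match the page's own script, but the same arithmetic should port to PHP.

```javascript
// ROT47: rotate each printable ASCII character (codes 33..126) by 47 places.
// Assumption: the site's getContents() is a plain ROT47 decoder.
function rot47(input) {
  let out = "";
  for (const ch of input) {
    const code = ch.charCodeAt(0);
    if (code >= 33 && code <= 126) {
      out += String.fromCharCode(33 + ((code - 33 + 47) % 94));
    } else {
      out += ch; // spaces and anything outside the range pass through unchanged
    }
  }
  return out;
}

// Argument copied from the document.write(getContents('...')) call.
// Inside a JavaScript string literal, "\<" is simply an escaped "<".
const encoded = 'wF8UD9Jj8:6D !FC6 q23J (:A6D c I ec A6C A24\<';
const decoded = rot47(encoded);

// The decoded text carries an &shy; (soft hyphen) entity; dropping it here
// is a design choice to get a plain title.
const title = decoded.replace(/&shy;/g, "");

console.log(decoded); // Hug&shy;gies Pure Baby Wipes 4 x 64 per pack
console.log(title);   // Huggies Pure Baby Wipes 4 x 64 per pack
```

If the assumption holds for the other items on the page, the encoded argument could be read from the script element's text with the scraper and decoded without running a browser at all.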

Hope this helps.

0

I'm not sure what code you're using in your example (and I suspect the result of the getContents function gets in the way of your method for retrieving the text), but if you wrap the text you're after in a <span> like so:

<div class="offerList-item-description">
    <div id="result-title-5" class="offerList-item-description-title">
        <script type="text/javascript">
            document.write(getContents('wF8UD9Jj8:6D !FC6 q23J (:A6D c I ec A6C A24\<'));
        </script><span>Hug­gies Pure Baby Wipes 4 x 64 per pack</span>
    </div>
</div>

you can retrieve it using JavaScript:

<script>
    // Find every title container, then read the text of its first <span>.
    var $title = document.getElementsByClassName("offerList-item-description-title");
    for (var i = 0; i < $title.length; i++) {
        var span = $title[i].getElementsByTagName("span");
        if (span.length === 0) continue; // skip containers without a <span>
        var $text = span[0].innerText || span[0].textContent;
        console.log("==> " + $text);
    }
</script>

1 Comment

Unfortunately we can't add a <span> to the markup because it's scraped directly off the website, unless of course it is possible to add a <span> to the scraped <div>. We're using Simple HTML DOM Parser for scraping.
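
If the markup cannot be changed, the text can still be read from the rendered DOM without a wrapping element, by collecting the text nodes that sit directly inside the div. This is only a sketch: directText is a hypothetical helper name, and it assumes the code runs in a browser (or a headless browser) after the page's own script has executed, not inside Simple HTML DOM Parser.

```javascript
// Collect the text that sits directly inside an element, skipping child
// elements such as <script>. nodeType 3 is Node.TEXT_NODE in the DOM.
function directText(element) {
  return Array.from(element.childNodes)
    .filter((node) => node.nodeType === 3)
    .map((node) => node.nodeValue)
    .join("")
    .trim();
}

// Browser usage: print the direct text of every title container.
if (typeof document !== "undefined") {
  const titles = Array.from(
    document.getElementsByClassName("offerList-item-description-title")
  );
  for (const title of titles) {
    const text = directText(title);
    if (text !== "") {
      console.log("==> " + text); // containers holding only whitespace are skipped
    }
  }
}
```

Because the helper only looks at direct child nodes, the contents of the script element itself are never included in the result.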
