
I have the following table structure:

<table class="c_order u_list">
    <thead>
        <tr>
        </tr>
    </thead>
    <tbody>
            <tr>
            <td>
                11.04.2017<br/>
                18:20            </td>
            <td><a target="_blank" href="/personal/order/detail/457/">A-457</a></td>
            <td>+7 (917) 119-11-42</td>
            <td>1685.20</td>
            <td>
                <a target="_blank" href="/resn/i/zda_2_1/">УШКА</a><br/>с. холмский, ул. Фрунзе, д. 11<br/>3477740087            </td>
            <td>Принят</td>
        </tr>
                <tr>
            <td>
                11.04.2017<br/>
                16:47            </td>
            <td><a target="_blank" href="/personal/order/detail/47565/">A-47565</a></td>
            <td>+7 (909) 556-77-99</td>
            <td>2574.80</td>
            <td>
                <a target="_blank" href="/kir/a/an_10/">ООО &quot;План&quot;</a><br/>г. Омск, ул. 10-летия Победы, д. 3;<br/>8845701069            </td>
            <td>Доставлен</td>
        </tr>

            </tbody>
</table>

I am trying to parse it into an array with this PHP code:

$page = curl_exec ($ch);
curl_close ($ch);
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($page);
libxml_clear_errors();
$xpath = new DOMXpath($dom);
$data = array();
// get all table rows (the header row in thead is matched too)
$table_rows = $xpath->query('//tr');
foreach($table_rows as $row => $tr) {
    foreach($tr->childNodes as $td) {
        echo $td->nodeValue;
        $data[$row][] = preg_replace('~[\r\n]+~', '', trim($td->nodeValue));
    }
    $data[$row] = array_values(array_filter($data[$row]));
}
print_r($data);

But the resulting array is wrong: the tags inside the cells (for example the a href links) are stripped. I need the full content of each td element, tags included, something like this:

Array
(
    [0] => Array
    (
        [0] => 11.04.2017 18:20
        [1] => <a target="_blank" href="/personal/order/detail/457/">A-457</a>
        [2] => +7 (917) 119-11-42
        [3] => 1685.20
        [4] => <a target="_blank" href="/resn/i/zda_2_1/">УШКА</a><br/>с. холмский, ул. Фрунзе, д. 11<br/>3477740087
        [5] => Принят
    )

    [1] => Array
    (
        [0] => 11.04.2017 16:47
        [1] => <a target="_blank" href="/personal/order/detail/47565/">A-47565</a>
        [2] => +7 (909) 556-77-99
        [3] => 2574.80
        [4] => <a target="_blank" href="/kir/a/an_10/">ООО &quot;План&quot;</a><br/>г. Омск, ул. 10-летия Победы, д. 3;<br/>8845701069
        [5] => Доставлен
    )
)

Also, how can I give names to the array keys, so that I get ['time'] instead of [0]?

  • Make sure that you receive the data in the proper encoding. If not, send header('Content-type: text/plain; charset=utf-8'); from the origin file. Also check the encoding of your PHP file. Commented Apr 12, 2017 at 14:31
  • Didn't utf8_encode solve your problem? What about the encoding of your script file? Commented Apr 12, 2017 at 14:36
  • The encoding problem is not solved; the result is the same. Commented Apr 12, 2017 at 14:38
  • What is the encoding of the $page variable? Commented Apr 12, 2017 at 14:45

2 Answers


In the constructor for DOMDocument, specify the encoding as UTF-8:

$dom = new DOMDocument('1.0', 'UTF-8');

To make preg_replace() work safely with UTF-8 strings, you must use the u modifier:

$data[$row][] = preg_replace('~[\r\n]+~u', '', trim($td->nodeValue));
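
If the Cyrillic text still comes out garbled, it may help to know that, as far as I can tell, loadHTML() decides the input encoding from the markup itself and falls back to a single-byte encoding when no charset is declared, regardless of what was passed to the constructor. A minimal sketch of one common workaround follows, assuming the fetched page is UTF-8 but carries no charset declaration; the short $page string is a made-up stand-in for the cURL response.

```php
<?php
// Hypothetical sample standing in for the $page returned by curl_exec().
$page = '<table><tbody><tr><td>Принят</td><td>ООО &quot;План&quot;</td></tr></tbody></table>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
// Declare the charset before the markup so the parser reads the bytes as UTF-8.
$dom->loadHTML('<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $page);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$cells = array();
foreach ($xpath->query('//td') as $td) {
    // The u modifier makes the pattern treat the subject as UTF-8.
    $cells[] = preg_replace('~[\r\n]+~u', '', trim($td->nodeValue));
}
print_r($cells);
```

If the page already declares its charset in a meta tag, the prepended declaration should not be needed.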

// Select only the body rows, so the header row in thead is skipped.
$table_rows = $xpath->query('//table/tbody/tr');
$results = array();
foreach ($table_rows as $row) {
    $result = array();

    // Date and time: replace line breaks and whitespace runs with an underscore.
    $expression = './td[1]';
    $result['Name1'] = preg_replace('~[\r\n\s]+~u', '_', trim($xpath->query($expression, $row)->item(0)->nodeValue));

    // Order number (link text) and the URL it points to.
    $expression = './td[2]';
    $result['Name2'] = $xpath->query($expression, $row)->item(0)->nodeValue;
    $expression = './td[2]/a/@href';
    $result['NameURL'] = $xpath->query($expression, $row)->item(0)->nodeValue;

    $expression = './td[3]';
    $result['Phone'] = $xpath->query($expression, $row)->item(0)->nodeValue;
    $expression = './td[4]';
    $result['Price'] = $xpath->query($expression, $row)->item(0)->nodeValue;

    // URL of the link in the fifth cell.
    $expression = './td[5]/a/@href';
    $result['Name10'][] = $xpath->query($expression, $row)->item(0)->nodeValue;

    // Order status.
    $expression = './td[6]';
    $result['Name11'] = $xpath->query($expression, $row)->item(0)->nodeValue;

    $results[] = $result;
}

print_r($results);
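
For the original goal of keeping the tags inside each td and using named keys, here is a rough sketch rather than a definitive solution. It assumes every body row has exactly six cells; the key names (time, order, phone, total, customer, status) and the innerHtml() helper are made up for illustration, and the sample markup is a shortened copy of the table from the question. The idea is to serialize each child node of a cell with DOMDocument::saveHTML() instead of reading nodeValue, which drops the markup.

```php
<?php
// Shortened copy of the table from the question, used as sample input.
$page = <<<'HTML'
<table class="c_order u_list">
  <tbody>
    <tr>
      <td>
        11.04.2017<br/>
        18:20 </td>
      <td><a target="_blank" href="/personal/order/detail/457/">A-457</a></td>
      <td>+7 (917) 119-11-42</td>
      <td>1685.20</td>
      <td><a target="_blank" href="/resn/i/zda_2_1/">УШКА</a><br/>3477740087 </td>
      <td>Принят</td>
    </tr>
  </tbody>
</table>
HTML;

// Hypothetical helper: returns the markup inside a node, tags included.
function innerHtml(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        $html .= $node->ownerDocument->saveHTML($child);
    }
    // Collapse whitespace runs so the source indentation does not leak in.
    return trim(preg_replace('~\s+~u', ' ', $html));
}

$dom = new DOMDocument();
libxml_use_internal_errors(true);
// Charset declaration prepended on the assumption that the input is UTF-8.
$dom->loadHTML('<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $page);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// Assumed key names, one per column, in column order.
$keys = array('time', 'order', 'phone', 'total', 'customer', 'status');
$rows = array();
foreach ($xpath->query('//table/tbody/tr') as $tr) {
    $cells = array();
    foreach ($xpath->query('./td', $tr) as $td) {
        $cells[] = innerHtml($td);
    }
    // Skip rows whose cell count does not match the expected columns.
    if (count($cells) === count($keys)) {
        $rows[] = array_combine($keys, $cells);
    }
}
print_r($rows);
```

Note that saveHTML() writes HTML rather than XHTML, so a br/ from the source typically comes back as a plain br tag. Each row can then be read by name, for example $rows[0]['order'].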
