I am trying to automate a download from an HTML data sheet in order to generate a customized report. This is what I have been doing with cURL:
// init cURL HTTP Client
$header = array();
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: ";
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7');
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_COOKIEFILE, '/.cookies');
curl_setopt($ch, CURLOPT_COOKIEJAR, '/.cookies');
curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_FAILONERROR, TRUE);
curl_setopt($ch, CURLOPT_HEADER, TRUE);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 600);
curl_setopt($ch, CURLOPT_URL, 'https:// ... /signin.html');
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_POSTFIELDS, "username=".$login."&password=".$pass);
$response = curl_exec($ch);
The login works fine and I can fetch many pages without any problems. Now I try to get the data sheet with the following:
curl_setopt($ch, CURLOPT_URL, 'https:// ... /data.html');
curl_setopt($ch, CURLOPT_POST, FALSE);
curl_setopt($ch, CURLOPT_POSTFIELDS, '');
$response = curl_exec($ch);
But now I get the following response:
<html>
<head>
<script language='javascript'>function autoNavigate() {window.location="/data.html";}</script>
</head>
<body onload='autoNavigate()'></body>
</html>
The JavaScript call reloads the same page that I had just requested. In a browser this works fine, but when I load the same page again with "curl_exec($ch)" I get a 302 response.
Is there a way to refresh the page with cURL without a full reload? Or is there any other way to get the content of the page?
Thanks
Is CURLOPT_FOLLOWLOCATION set in the second curl call?
CURLOPT_FOLLOWLOCATION is set in the second call. Yes, 302 isn't an error, but it redirects to an error page.
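cURL does not execute JavaScript, so the window.location assignment in the response above is never followed automatically. One possible workaround, sketched below, is to read the target out of the returned HTML and request it on the same handle so the session cookies are reused. The helper names (extractJsRedirect, resolveAgainst, fetchFollowingJsRedirects) are hypothetical and not part of the original code, and the regular expression only covers the simple window.location = "..." form shown in the question. The sketch also switches back to GET with CURLOPT_HTTPGET, because libcurl documents that setting CURLOPT_POSTFIELDS implies a POST request. Whether this removes the 302 depends on what the server expects, so treat it as a starting point rather than a confirmed fix.

```php
<?php
// Hypothetical helper: return the target of a window.location assignment, or null.
function extractJsRedirect(string $html): ?string
{
    // Matches window.location = "..." and window.location.href = '...'
    $pattern = '/window\.location(?:\.href)?\s*=\s*(["\'])(.*?)\1/i';
    if (preg_match($pattern, $html, $m)) {
        return $m[2];
    }
    return null;
}

// Hypothetical helper: resolve a redirect target against the URL that was fetched.
function resolveAgainst(string $baseUrl, string $target): string
{
    if (preg_match('#^https?://#i', $target)) {
        return $target; // already absolute
    }
    $p = parse_url($baseUrl);
    $origin = $p['scheme'] . '://' . $p['host'] . (isset($p['port']) ? ':' . $p['port'] : '');
    if (strpos($target, '/') === 0) {
        return $origin . $target; // root-relative, as in the question
    }
    // Plain relative path: resolve against the directory of the base path.
    $path = $p['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $target;
}

// Hypothetical helper: fetch $url on an already configured handle (assumes
// CURLOPT_RETURNTRANSFER is set) and follow JavaScript redirects by hand.
function fetchFollowingJsRedirects($ch, string $url, int $maxHops = 3): string
{
    for ($hop = 0; $hop <= $maxHops; $hop++) {
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HTTPGET, true); // force GET after the login POST
        $response = curl_exec($ch);
        if ($response === false) {
            throw new RuntimeException(curl_error($ch));
        }
        $target = extractJsRedirect($response);
        if ($target === null) {
            return $response; // no JavaScript redirect found, this is the page
        }
        $current = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
        // Assumption: the server may expect the Referer a browser would send.
        curl_setopt($ch, CURLOPT_REFERER, $current);
        $url = resolveAgainst($current, $target);
    }
    throw new RuntimeException('Too many JavaScript redirects');
}
```

With the handle from the question, the call would look like $response = fetchFollowingJsRedirects($ch, 'https:// ... /data.html'); the hop limit guards against the page redirecting to itself indefinitely.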