I want to create a sitemap for my site. Before creating it, I want to know the status code of each URL. I have used cURL to determine the status code, but my site has more than 400 URLs and cURL takes a long time.

I only want to include URLs that return status code 200.

Could anyone suggest another way to determine each URL's status code?

This is the cURL code I have used:

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_POST, 1);  
curl_setopt($ch, CURLOPT_POSTFIELDS, $urlparam);  
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);  
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);  
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_TIMEOUT, 240);  

curl_exec($ch);
$curlcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
echo $curlcode;

  • If you are really going to check the availability of each page on every site reload, your website will be very slow. Commented Jul 6, 2012 at 15:13
  • Why not just use a sitemap builder? I'm pretty sure there are some decent tools for this. Commented Jul 6, 2012 at 15:15
  • @tradyblix, my site is multilingual and also runs on Drupal. Commented Jul 7, 2012 at 8:38

1 Answer

I ran into this same issue a few months ago. I found that using this code example to access my own page status codes was much faster:

<?php
//
// Checking the status of a web page - funmin.com
//

$server="www.YOUR_WEBSITE.com";
function sockAccess($page)
{
   $errno = "";
   $errstr = "";
   $fp = 0;
   global $server;
   $fp = fsockopen($server, 80, $errno, $errstr, 30);

   // fsockopen() returns false on failure, so compare against false, not 0
   if ($fp === false)
   {
      die("Error $errstr ($errno)");
   }
   $out = "GET /$page HTTP/1.1\r\n";
   $out .= "Host: $server\r\n";
   $out .= "Connection: Close\r\n\r\n";

   fwrite($fp,$out);
   $content = fgets($fp);
   $code = trim(substr($content,9,4));
   fclose($fp);
   return intval($code);
}
?>

Further documentation may be found here: http://www.forums.hscripts.com/viewtopic.php?f=11&t=4217
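
Since the question asks to keep only the URLs that return 200, the status check can be wrapped in a small filter. This is a minimal sketch, not part of the original answer: filterOkPages() and its $check parameter are hypothetical names, and $check is assumed to be any callable (such as sockAccess above) that takes a page path and returns an integer status code.

```php
<?php
// Hypothetical helper: keep only the pages whose status check returns 200.
// $check is assumed to be a callable taking a page path and returning an int.
function filterOkPages(array $pages, callable $check): array
{
    $ok = [];
    foreach ($pages as $page) {
        // Strict comparison: only an integer 200 counts as a success.
        if ($check($page) === 200) {
            $ok[] = $page;
        }
    }
    return $ok;
}
```

With the function above in scope, a call such as filterOkPages($pages, 'sockAccess') would return the subset of $pages to put in the sitemap. The checks still run one after another, so this does not make them any faster.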

2 Comments

Thanks, Hope :) But it also takes a long time: for more than 400 URLs, it takes more than 4 minutes.
How about using $out = "HEAD /$page HTTP/1.1\r\n";? Do you also take into account how slow or fast the websites you are querying are?
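
Building on the HEAD suggestion, much of the time is probably spent waiting on each request in turn, so running the requests concurrently may help more than changing the request method alone. The sketch below combines HEAD requests (CURLOPT_NOBODY) with PHP's curl_multi functions. It is an illustration under assumptions, not a tested fix for this site: fetchStatusCodes() is a hypothetical name, the 10-second timeout is arbitrary, and the server is assumed to answer HEAD with the same status it would give for GET.

```php
<?php
// Hypothetical helper: fetch the status codes of many URLs concurrently.
// Returns an array mapping each URL to its HTTP status code (0 on failure).
function fetchStatusCodes(array $urls, int $timeout = 10): array
{
    $mh = curl_multi_init();
    $handles = [];

    foreach (array_unique($urls) as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_NOBODY, true);          // send HEAD, skip the body
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); // report redirects as 3xx
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($mh, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none are still running.
    $running = 0;
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running) {
            curl_multi_select($mh, 1.0); // wait for activity instead of spinning
        }
    } while ($running && $status === CURLM_OK);

    $codes = [];
    foreach ($handles as $url => $ch) {
        $codes[$url] = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_multi_remove_handle($mh, $ch);
    }

    return $codes;
}
```

The URLs with status 200 can then be picked out with array_keys($codes, 200, true). For several hundred URLs it may be kinder to the server to pass them in batches (for example with array_chunk) rather than opening every connection at once.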
