
file: data.txt (11617 lines)

   user  datetime
   23   2015-03-01 08:04:15 
   15   2015-05-01 08:05:20 
  105   2015-05-01 08:07:10 
   15   2015-06-01 08:08:29 
  105   2015-06-01 08:12:48 

I only need the rows from 2015-06. I'm using fgets and checking each line's datetime, but it is really slow: more than 50s.

    $d='data.txt';
    import($d);
    function import($d){
        $handle = fopen($d, "r") or die("Couldn't get handle");
        if ($handle) {
            while (!feof($handle)) {
                $buffer = fgets($handle, 4096);
                $line=explode("\t",$buffer);
                if(date("Y-m",strtotime($line[1]))=="2015-06"){
                   mysql_query("INSERT INTO `table` ....");
                }
                else{
                   //break? when month>6
                }
            }
            fclose($handle);
        }
    }

SOLUTION: less than 2s! (thanks to Kevin P. and Dragon)

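            if(substr($line[1],0,7)=="2015-06"){
               // collect one "(...)" tuple per matching row; the parentheses make the comma prefix apply only from the second row on
               $sql.=(empty($sql)?"":",")."('".$line[1]."'.........)";
            }
            elseif(substr($line[1],0,7)>"2015-06"){
               break;// when month>6 (relies on the file being sorted by datetime)
            }
            // after the while loop: one multi-row INSERT instead of one query per line
            mysql_query("INSERT INTO `table` ....".$sql);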
  • For the date comparison, just use substr(): if( substr($line[1],0,7)== "2015-06"){ ... (Commented Aug 11, 2015 at 1:34)
  • @jpw: if it's a web application, making a user wait 50 seconds is huge. (Commented Aug 11, 2015 at 1:37)

3 Answers

2

Can't be helped, use something faster than PHP. For instance, you can use grep or awk to read the file and filter it quickly. For example:

    $lines = explode("\n", `awk '$2 ~ /^2015-06/ { print }' data.txt`);

EDIT: Also, fgets with a length argument is not guaranteed to give you whole lines. It stops at a newline or after 4095 bytes (length - 1), whichever comes first, so a line longer than that would be split in the middle. The fragment will then fail to match if you are lucky, or break your code due to missed assumptions (such as the length of the $line array when constructing the SQL) if not.*


*) Or vice versa: it would be better for it to break completely, since that is at least an obvious error yelling to be fixed, as opposed to silent data loss.
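
To illustrate the whole-line point: when fgets is called without a length argument, it reads up to the end of the line, so no line is split. The following is only a minimal sketch, assuming the file is tab-separated and sorted by datetime as in the question; the function name `readMonth` is hypothetical.

```php
<?php
// Hypothetical helper: collect the rows whose datetime falls in $month ("YYYY-MM").
// Assumes tab-separated columns (user, datetime) and rows sorted by datetime.
function readMonth(string $path, string $month): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Couldn't open $path");
    }
    $rows = [];
    // Without a length argument, fgets() reads up to and including the newline.
    while (($buffer = fgets($handle)) !== false) {
        $line = explode("\t", rtrim($buffer, "\r\n"));
        if (count($line) < 2) {
            continue; // skip blank or malformed lines
        }
        $prefix = substr(trim($line[1]), 0, 7);
        if ($prefix === $month) {
            $rows[] = [trim($line[0]), trim($line[1])];
        } elseif (preg_match('/^\d{4}-\d{2}$/', $prefix) && strcmp($prefix, $month) > 0) {
            break; // sorted input: no later row can match
        }
    }
    fclose($handle);
    return $rows;
}
```

The header row is skipped implicitly because its second column does not look like a date, and the early break only makes sense if the sorted-by-datetime assumption actually holds for the file.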


1

Maybe insert multiple entries into the DB at once instead of running a query every time you find a matching datetime?

In which case it's similar to this
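
As a rough illustration of the batching idea (not necessarily the approach in the linked example), the matching rows can be gathered first and sent as one multi-row INSERT. The sketch below assumes PDO with placeholders, since the mysql_* functions were removed in PHP 7; the table name `log`, its columns, and the helper name `buildBatchInsert` are hypothetical.

```php
<?php
// Hypothetical helper: build one multi-row INSERT with placeholders for all rows.
// $rows is a list of [user, datetime] pairs; table and column names are assumptions.
function buildBatchInsert(array $rows): array
{
    if ($rows === []) {
        throw new InvalidArgumentException('no rows to insert');
    }
    // One "(?, ?)" group per row, joined by commas.
    $groups = implode(',', array_fill(0, count($rows), '(?, ?)'));
    $sql = "INSERT INTO `log` (`user`, `datetime`) VALUES $groups";
    // Flatten [[u1, d1], [u2, d2]] into [u1, d1, u2, d2] to match the placeholders.
    $params = [];
    foreach ($rows as [$user, $datetime]) {
        $params[] = $user;
        $params[] = $datetime;
    }
    return [$sql, $params];
}
```

With an existing PDO connection in `$pdo` (assumed), `[$sql, $params] = buildBatchInsert($rows);` followed by `$pdo->prepare($sql)->execute($params);` would send all rows in a single round trip. Very large batches may need to be split into chunks.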

1 Comment

Much, much faster! Less than 2s.
0

Maybe you should use grep to filter out the lines you do not need.
