
I have a function that finds all files whose names start with a given string and returns an array of the matching files, but it is starting to get slow because I have around 20000 files in a particular directory. I need to optimize this function, but I just can't see how. This is the function:

function DetectPrefix($filePath, $prefix)
{
    $files = array(); // initialize so count() works even when nothing matches

    $dh = opendir($filePath);
    while (false !== ($filename = readdir($dh))) {
        // keep entries whose name starts with the prefix
        if (strpos($filename, $prefix) === 0) {
            $files[] = $filename;
        }
    }
    closedir($dh);

    if (count($files) > 0) {
        return $files;
    } else {
        return null;
    }
}

What more can I do?

Thanks

  • How about changing that directory structure? You could create subdirectories based on the first characters of the file names, which would make it even faster to find the relevant files. No need to look into a/a/aa* when you're looking for bb*. Commented Jul 8, 2009 at 14:23
  • glob may help for a time, but KM is right: in the long run you have to do something about your directory structure. Having that many files in one directory can lead to a lot of problems. Commented Jul 8, 2009 at 15:06
  • I just can't do anything about the huge number of files in that directory. Ancient software: must live with it ;) Thank you for all the answers! Commented Jul 9, 2009 at 9:47

5 Answers

11

http://php.net/glob

$files = glob('/file/path/prefix*');

Wikipedia breaks uploads up by the first couple of letters of their filenames, so excelfile.xls would go in a directory like /uploads/e/x while textfile.txt would go in /uploads/t/e.

Not only does this reduce the number of files glob (or any other approach) has to sort through, but it avoids the maximum files in a directory issue others have mentioned.
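
To illustrate, here is a minimal sketch of that kind of sharding. The helper name and the two-level layout are assumptions made up for the example, not Wikipedia's actual code:

// Hypothetical helper: map a filename to a two-level shard directory,
// e.g. "excelfile.xls" => "/uploads/e/x/excelfile.xls"
function shardedPath($baseDir, $filename)
{
    $name = strtolower($filename);
    return $baseDir . '/' . $name[0] . '/' . $name[1] . '/' . $filename;
}

// With that layout a prefix search only has to glob one small subtree
// (this assumes the prefix is at least two characters long):
$prefix = 'ex';
$files  = glob('/uploads/' . $prefix[0] . '/' . $prefix[1] . '/' . $prefix . '*');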



4

You could use scandir() to list the files in the directory, instead of iterating through them one-by-one using readdir(). scandir() returns an array of the files.
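
As a rough sketch, keeping the same prefix test as the original function (the function name here is just for illustration):

function DetectPrefixScandir($filePath, $prefix)
{
    // scandir() returns all directory entries as an array in one call
    $files = array();
    foreach (scandir($filePath) as $filename) {
        if (strpos($filename, $prefix) === 0) {
            $files[] = $filename;
        }
    }
    return count($files) > 0 ? $files : null;
}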

However, it'd be better if you could change your file system organization - do you really need to store 20000+ files in a single directory?


2

As the other answers mention, I'd look at glob(), scandir(), and/or the DirectoryIterator class; there is no need to reinvent the wheel.
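
For example, a rough sketch of the DirectoryIterator route, reusing $filePath and $prefix from the question (whether it actually beats glob() or scandir() here would need measuring):

$files = array();
foreach (new DirectoryIterator($filePath) as $fileInfo) {
    if ($fileInfo->isDot()) {
        continue; // skip "." and ".."
    }
    if (strpos($fileInfo->getFilename(), $prefix) === 0) {
        $files[] = $fileInfo->getFilename();
    }
}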

However, watch out: depending on your operating system and filesystem, there may be a limit on the maximum number of files in a single directory. If that is the case and you just keep adding files to the same directory, you will have some downtime, and some problems, when you reach the limit. The error will probably show up as a permissions or write failure rather than an obvious "you can't write more files in a single directory" message.


1

I'm not sure, but DirectoryIterator is probably a bit faster. Also add caching so that the list gets regenerated only when files are added or deleted.
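
A very rough sketch of the caching idea; the cache file location, the function name and the mtime-based invalidation are assumptions (on most filesystems a directory's mtime changes when entries are added or removed):

function cachedDirectoryListing($dir, $cacheFile = '/tmp/dir_listing.cache')
{
    // rebuild the cache only when the directory changed since the last scan
    if (!is_file($cacheFile) || filemtime($dir) > filemtime($cacheFile)) {
        $entries = scandir($dir);
        file_put_contents($cacheFile, serialize($entries));
        return $entries;
    }
    return unserialize(file_get_contents($cacheFile));
}

// the prefix filter then runs over the cached list instead of hitting the disk every time
$files = array();
foreach (cachedDirectoryListing($filePath) as $filename) {
    if (strpos($filename, $prefix) === 0) {
        $files[] = $filename;
    }
}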


0

You only need to compare the first strlen($prefix) characters of each filename, instead of searching the whole name with strpos(). So try this:

function DetectPrefix($filePath, $prefix) {
    $dh  = opendir($filePath);
    $len = strlen($prefix);
    $files = array();
    while (false !== ($filename = readdir($dh))) {
        // compare only the first $len characters instead of scanning the whole name
        if (substr($filename, 0, $len) === $prefix) {
            $files[] = $filename;
        }
    }
    closedir($dh);
    if (count($files)) {
        return $files;
    } else {
        return null;
    }
}

1 Comment

That's true, but IMHO this is more a matter of I/O than of in-memory processing.
