2

I need a regular expression that checks whether a filename string ends with .pdf, case-insensitively. It should match .pdf, .PDF, .Pdf, or any other mix of cases, since users may have messy filenames.

This will be used in PHP 5. I know I could write a separate test for each case, but I'm sure there's a more elegant way to do this.

1
  • Please be more specific. Is your string just one filename all by itself, or is it a longer string with one or more filenames along with some other text? Commented Apr 13, 2011 at 23:56

5 Answers

9

There is nothing wrong with a regex, but there is also a ready-made function for dissecting a path and extracting the extension from it:

echo pathinfo("/path/to/myfile.txt", PATHINFO_EXTENSION); // txt (no leading dot)

4 Comments

Except this isn't actually pointing towards a file. The "title" field of each PDF attachment defaults to the filename that was uploaded, and users can then override it (so a messy PDF title like REPORT_2008_PICNET.pdf can be turned into PICNET Financial Report (2008)). Sometimes they don't override it, in which case I need to figure out whether to add the .pdf to the download headers or not. Thanks though.
@jeffkee I'm not sure I'm following you: Do you need to make sure that the file ends in .pdf? Then fetching the extension using pathinfo() will work. If you compare the lowercase version against "pdf", you will know whether you need to add the extension or not. Whether what you use is a full path, or a pure file name, should be meaningless to pathinfo().
But doesn't the pathinfo() function only work on an actual file? These PDF "title" fields are fictitious, in a way. All PDFs are saved on the server in attachment[id].pdf format, and each has a corresponding title in the MySQL database. Hence the title string needs to be checked for the extension, not the actual file. All the files are automatically renamed to .pdf anyway.
@jeffkee print_r(pathinfo("somefile (2).test.txt.pdf")); works as expected for me. realpath() depends on the specified path actually existing; pathinfo() does not.
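
The approach discussed in these comments (lowercase the extension and compare it against "pdf") could be sketched roughly as follows. The helper name ensure_pdf_extension() and the sample titles are assumptions made for illustration and are not part of the original answer; the code sticks to PHP 5 compatible syntax.

```php
<?php
// Hypothetical helper: returns a download filename that always ends in ".pdf".
function ensure_pdf_extension($title)
{
    // pathinfo() only parses the string; the file does not need to exist.
    $ext = strtolower(pathinfo($title, PATHINFO_EXTENSION));

    // PATHINFO_EXTENSION yields the extension without the leading dot,
    // or an empty string when the title has no extension at all.
    return ($ext === 'pdf') ? $title : $title . '.pdf';
}

echo ensure_pdf_extension('REPORT_2008_PICNET.Pdf'), "\n";
echo ensure_pdf_extension('PICNET Financial Report (2008)'), "\n";
```

A title that already ends in any casing of .pdf is returned unchanged; anything else gets ".pdf" appended.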
0

How about this one? I don't know what language you are using, but here is a regex for matching anything ending in .pdf:

.+([.][Pp][Dd][Ff]){1}

My bad, I'm half asleep. PHP it is. I don't know PHP, but that regex should work.
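
For readers who want to try this pattern in PHP, here is a minimal sketch. It assumes the pattern is wrapped in / delimiters for preg_match(), drops the redundant {1} and the group, and adds a $ anchor that the pattern above does not have. The second pattern expresses the same idea with the case-insensitive i modifier instead of hand-written character classes. The variable names and sample filenames are made up for illustration.

```php
<?php
// Explicit character classes, with an end-of-string anchor added (assumption).
$explicit = '/.+[.][Pp][Dd][Ff]$/';

// Same idea using the case-insensitive "i" modifier.
$flagged = '/.+\.pdf$/i';

foreach (array('a.pdf', 'a.PDF', 'a.Pdf', 'a.txt') as $name) {
    // preg_match() returns 1 on a match and 0 otherwise.
    echo $name, ': ', preg_match($explicit, $name), ' ', preg_match($flagged, $name), "\n";
}
```

Note that the leading .+ requires at least one character before the dot, so the bare string ".pdf" does not match either pattern.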

0

Another possibility is to lowercase the extension before comparing it. Note that pathinfo() returns the extension without the leading dot:

strtolower(pathinfo("/path/file.pDf", PATHINFO_EXTENSION)) === "pdf"

0

As others have noted, extracting the extension would work; otherwise you can do something like this:

preg_match('/.*\.pdf/i', "match_me.pDf", $matches);

1 Comment

Close, but /.*\.pdf/i needs an end-of-string anchor. Otherwise it matches: file.pdflipingout.txt
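
The difference the anchor makes can be checked with a short sketch like the one below. The variable names are illustrative only; the filename is the one from the comment above.

```php
<?php
$unanchored = '/.*\.pdf/i';  // pattern from the answer
$anchored   = '/\.pdf$/i';   // same test with an end-of-string anchor

$name = 'file.pdflipingout.txt';

var_dump(preg_match($unanchored, $name)); // int(1): ".pdf" is found mid-string
var_dump(preg_match($anchored, $name));   // int(0): the string does not end in ".pdf"
```

In PCRE, $ also matches just before a single trailing newline; if that matters for your input, the D modifier or \z makes the anchor strict.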
0

If your string consists of a single filename, here is a simple regex solution:

if (preg_match('/\.pdf$/i', $filename)) {
   // It's a PDF file
}
