4

I am trying to validate an XML file path in JavaScript. My regex is:

var isValid = /^([a-zA-Z]:)?(\\{2}|\/)?([a-zA-Z0-9\\s_@-^!#$%&+={}\[\]]+(\\{2}|\/)?)+(\.xml+)?$/.test(str);

It returns true even when the path is wrong. These are valid paths:

D:/test.xml
D:\\folder\\test.xml
D:/folder/test.xml
D:\\folder/test.xml
D:\\test.xml
4
  • It would be helpful if you said which paths should not be valid. I assume it should always end with an XML file. But is the D:/ required? Or is a path starting with / also valid? Commented Apr 26, 2013 at 7:56
  • There can be any character instead of D. But it should end with .xml. Commented Apr 26, 2013 at 8:00
  • Sorry, I was unclear. Is the drive letter required in a valid path, or would /folder/test.xml also be valid? Commented Apr 26, 2013 at 8:02
  • Yes, the drive letter is required. Commented Apr 26, 2013 at 8:05

3 Answers 3

6

First, the obvious errors:

+ is a quantifier meaning "one or more", and it applies only to the token directly before it.
So (\.xml+) matches .xm followed by one or more l (it would also match .xmlllll). The ? means optional, so (\.xml+)? means the path may end in .xml, but does not have to.

The same goes for ([a-zA-Z]:)?, which makes the drive letter optional.
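To make the two quantifier problems concrete, here is a small sketch. The variable names (looseExt, strictExt, optionalExt) are made up for illustration and are not part of the question's code.

```javascript
// `+` applies only to the preceding token, here the single letter "l".
var looseExt = /\.xml+$/;
var strictExt = /\.xml$/;

console.log(looseExt.test("test.xmlllll"));  // true
console.log(strictExt.test("test.xmlllll")); // false

// A trailing `?` makes the whole group optional,
// so a name without the extension still matches.
var optionalExt = /^[a-z]+(\.xml)?$/;

console.log(optionalExt.test("test"));     // true
console.log(optionalExt.test("test.xml")); // true
```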

Now the not-so-obvious errors:

[a-zA-Z0-9\\s_@-^!#$%&+={}\[\]] defines the list of allowed characters. You wrote \\s, and I assume you want to allow whitespace, but inside the class this allows a literal \ and the letter s, so you need to change it to \s. Then there is @-^. I assume you want to allow @, - and ^, but - has a special meaning inside [ ]: it defines a range, so you allow every character from @ to ^ (that is @, A-Z, [, \, ] and ^), and not the hyphen itself. To allow - you need to escape it and write @\-^. You also need to be careful with ^: directly after the opening [ it negates the class.
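Both character class pitfalls can be checked in isolation. The following is only a sketch with made-up variable names; each class is anchored with ^ and $ so that exactly one character is tested.

```javascript
// Inside a regex literal, [\\s] is a class containing a backslash and the letter "s".
var wrongSpace = /^[\\s]$/;
var rightSpace = /^[\s]$/;

console.log(wrongSpace.test(" "));  // false
console.log(wrongSpace.test("s"));  // true
console.log(wrongSpace.test("\\")); // true (the string holds one backslash)
console.log(rightSpace.test(" "));  // true

// [@-^] is the range U+0040..U+005E: @, A-Z, [, \, ] and ^.
var asRange = /^[@-^]$/;
var asLiteral = /^[@\-^]$/;

console.log(asRange.test("-"));   // false, the hyphen is not in the range
console.log(asRange.test("Q"));   // true, Q lies between @ and ^
console.log(asLiteral.test("-")); // true
console.log(asLiteral.test("Q")); // false
```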

Your regex should contain the following parts:

  • ^[a-z]: starts (^) with the drive letter and a colon
  • ((\\|\/)[a-z0-9\s_@\-^!#$%&+={}\[\]]+)+ followed by one or more path parts, each starting with either \ or / and followed by a name made of one or more of your allowed characters (a-z0-9\s_@\-^!#$%&+={}\[\])
  • \.xml$ ends ($) with .xml

Therefore your final regex should look like this:
/^[a-z]:((\\|\/)[a-z0-9\s_@\-^!#$%&+={}\[\]]+)+\.xml$/i.test(str)
(assuming you match case-insensitively using the i flag)

EDIT:

var path1 = "D:/test.xml";               // D:/test.xml
var path2 = "D:\\folder\\test.xml";      // D:\folder\test.xml
var path3 = "D:/folder/test.xml";        // D:/folder/test.xml
var path4 = "D:\\folder/test.xml";       // D:\folder/test.xml
var path5 = "D:\\test.xml";              // D:\test.xml

console.log( /^[a-z]:((\\|\/)[a-z0-9\s_@\-^!#$%&+={}\[\]]+)+\.xml$/i.test(path1) );
console.log( /^[a-z]:((\\|\/)[a-z0-9\s_@\-^!#$%&+={}\[\]]+)+\.xml$/i.test(path2) );
console.log( /^[a-z]:((\\|\/)[a-z0-9\s_@\-^!#$%&+={}\[\]]+)+\.xml$/i.test(path3) );
console.log( /^[a-z]:((\\|\/)[a-z0-9\s_@\-^!#$%&+={}\[\]]+)+\.xml$/i.test(path4) );
console.log( /^[a-z]:((\\|\/)[a-z0-9\s_@\-^!#$%&+={}\[\]]+)+\.xml$/i.test(path5) );
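Since the original problem was that wrong paths returned true, it may help to run a few paths that should be rejected as well. This is a sketch: the variable name xmlPath and the sample paths are my own choices, not taken from the question.

```javascript
// Same pattern as above, stored once so it can be reused.
var xmlPath = /^[a-z]:((\\|\/)[a-z0-9\s_@\-^!#$%&+={}\[\]]+)+\.xml$/i;

console.log(xmlPath.test("/folder/test.xml"));      // false, drive letter missing
console.log(xmlPath.test("D:/folder/test"));        // false, extension missing
console.log(xmlPath.test("D:/test.xmlllll"));       // false, wrong extension
console.log(xmlPath.test("D:test.xml"));            // false, no separator after the colon
console.log(xmlPath.test("D:/my.folder/test.xml")); // false, "." is not in the allowed characters
```

The last case is a limitation rather than a feature: add . to the character class if names containing dots should be accepted.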

UPDATE:

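Take care with / and \: whether and how you need to escape them depends on whether you build the regex from a string, with new RegExp(' ... the regex ... ',"i") or new RegExp(" ... the regex ... ","i"), or write it as a literal / ... the regex ... /i. In a literal, / must be written as \/ and a literal backslash as \\. In a string, / needs no escape, but every backslash must be doubled, so a literal backslash becomes \\\\.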
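As a short sketch of that difference, the two expressions below describe the same pattern, once as a literal and once as a string passed to the constructor. Only the drive letter and the first separator are matched here to keep it readable, and the variable names are illustrative.

```javascript
// Literal: "/" is escaped as \/ and a backslash as \\.
var fromLiteral = /^[a-z]:(\\|\/)/i;

// String: "/" needs no escape, but each regex backslash is doubled,
// so the escaped backslash \\ becomes \\\\ in the source code.
var fromString = new RegExp("^[a-z]:(\\\\|/)", "i");

console.log(fromLiteral.test("D:\\test.xml")); // true
console.log(fromString.test("D:\\test.xml"));  // true
console.log(fromLiteral.test("D:/test.xml"));  // true
console.log(fromString.test("D:/test.xml"));   // true
console.log(fromString.test("D:test.xml"));    // false
```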

For further information about regular expressions, take a look at e.g. www.regular-expressions.info


5 Comments

Please update the regex for this path too: var path4 = "D:\folder\folder/test.xml";
@ImranTariq "D:\folder\folder/test.xml" is not D:\folder\folder/test.xml. \f is a special character like \n or \r. Check console.log("\f".length): it prints 1, because "\f" is a single form feed character and not \ followed by f.
I mean the path may contain a single backslash \, i.e. D:\folder\folder/test.xml
@ImranTariq One backslash: "D:\\name.xml". Two backslashes: "D:\\\\name.xml". No backslash, but a newline (\n): "D:\name.xml". (You need to escape backslashes in strings with a backslash!)
The regex fails if a folder name contains dots. Fixed regex: /^[a-z]:((\\|\/)[a-z0-9\s_@\-^!.#$%&+={}\[\]]+)+\.xml$/i
0

This could work for you:

var str = 'D:/test.xml';
var str2 = 'D:\\folder\\test.xml';
var str3 = 'D:/folder/test.xml';
var str4 = 'D:\\folder/test.xml';
var str5 = 'D:\\test\\test\\test\\test.xml';

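// In a string literal every regex backslash is doubled: "\\." stays an escaped dot, while "\." collapses to "." (any character).
var regex = new RegExp('^[a-z]:((\\\\|/)[a-zA-Z0-9_ \\-]+)+\\.xml$', 'i');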
regex.test(str5);

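The reason for writing \\\\ in the RegExp string to match a single \ in the path is that JavaScript string literals also use \ as the escape character, e.g. \n for a newline or \b for a backspace. The string literal turns \\\\ into \\, and the regex engine then reads \\ as one literal backslash. So to get a literal \ in a string, write \\. It also allows you to have different rules for file names and folder names.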
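The two layers of escaping are easier to see by looking at string lengths. This is a minimal sketch; lostEscape and keptEscape are hypothetical names used only here.

```javascript
// What the string literal actually contains:
console.log("\\".length);         // 1, a single backslash
console.log("\\\\".length);       // 2, two backslashes
console.log("D:\\folder".length); // 9, D : \ f o l d e r
console.log("\f".length);         // 1, a form feed, not \ followed by f

// An unknown escape such as "\." silently loses its backslash,
// so the regex receives a bare "." that matches any character.
var lostEscape = new RegExp("\.xml$");
var keptEscape = new RegExp("\\.xml$");

console.log(lostEscape.test("testxml"));  // true
console.log(keptEscape.test("testxml"));  // false
console.log(keptEscape.test("test.xml")); // true
```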

Update

[a-zA-Z0-9_ \-]+ is the part of the regex that matches a file or folder name. To allow more characters in file or folder names, just add them to this class, e.g. to allow a * in names make it [a-zA-Z0-9_ \-*]+

Update 2

To add to the answer, the following RegExp adds another check to the validation: it rejects paths that mix / and \ as separators.

var str6 = 'D:/This is folder/test @ file.xml';
var str7 = 'D:/This is invalid\\path.xml'
var regex2 = new RegExp('^[a-z]:(\/|\\\\)([a-zA-Z0-9_ \-]+\\1)*[a-zA-Z0-9_ @\-]+\.xml?', 'gi');

regex2 will match all paths but str7

Update

My apologies for mistyping a ? instead of $ in regex2. Below is the corrected and intended version:

// "\\." keeps the dot escaped; "\\1" is the backreference \1 to the first separator.
var regex2 = new RegExp('^[a-z]:(/|\\\\)([a-zA-Z0-9_ \\-]+\\1)*[a-zA-Z0-9_ @\\-]+\\.xml$', 'i');
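The same idea written as a regex literal may be easier to read, because no string escaping is involved. Treat it as a sketch: the name sameSeparator is made up. Note that it deliberately rejects D:\folder/test.xml, which the question lists as valid, so use it only if mixed separators should count as an error.

```javascript
// Group 1 captures the first separator; \1 requires every later separator to be the same.
var sameSeparator = /^[a-z]:(\/|\\)([a-z0-9_ \-]+\1)*[a-z0-9_ @\-]+\.xml$/i;

console.log(sameSeparator.test("D:/This is folder/test @ file.xml")); // true
console.log(sameSeparator.test("D:\\folder\\test.xml"));              // true
console.log(sameSeparator.test("D:/This is invalid\\path.xml"));      // false, / then \
console.log(sameSeparator.test("D:\\folder/test.xml"));               // false, \ then /
console.log(sameSeparator.test("D:/testxml"));                        // false, no dot before xml
```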

1 Comment

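This also validates 'D:/testxml', i.e. no dot before xml, when the dot is written as \. inside the string literal: JavaScript drops that backslash, so the regex sees a bare . that matches any character.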
0

Tested using Scratchpad.

var regex = /^[a-z]:((\/|(\\?))[\w .]+)+\.xml$/i;

Prints true in Web Console: (Ctrl+Shift+K on Firefox)

console.log(regex.test("D:/test.xml"));
console.log(regex.test("D:\\folder\\test.xml"));
console.log(regex.test("D:/folder/test.xml"));
console.log(regex.test("D:\\folder/test.xml"));
console.log(regex.test("D:\\test.xml"));
console.log(regex.test("D:\\te st_1.3.xml")); // spaces, dots allowed

Or, using Alert boxes:

alert(regex.test("D:/test.xml"));
alert(regex.test("D:\\folder\\test.xml"));
alert(regex.test("D:/folder/test.xml"));
alert(regex.test("D:\\folder/test.xml"));
alert(regex.test("D:\\test.xml"));
alert(regex.test("D:\\te st_1.3.xml"));

Invalid file paths:

alert(regex.test("AD:/test.xml")); // invalid drive letter
alert(regex.test("D:\\\folder\\test.xml")); // "\\\f" is one backslash plus a form feed (\f), not three backslashes
alert(regex.test("/folder/test.xml")); // drive letter missing
alert(regex.test("D:\\folder/test.xmlfile")); // invalid extension
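One thing to watch with this pattern: (\\?) makes the separator optional, so a path with no separator after the colon is accepted too. The sketch below compares it with a variant that requires exactly one \ or / before each name. requiredSep is my own suggestion, not part of the answer above, and the empty separator may also cause heavy backtracking on long non-matching input.

```javascript
var optionalSep = /^[a-z]:((\/|(\\?))[\w .]+)+\.xml$/i;
// Assumed variant: [\\\/] demands one backslash or one slash before every name.
var requiredSep = /^[a-z]:([\\\/][\w .]+)+\.xml$/i;

console.log(optionalSep.test("D:test.xml"));               // true, separator silently skipped
console.log(requiredSep.test("D:test.xml"));               // false
console.log(requiredSep.test("D:\\folder/te st_1.3.xml")); // true
console.log(requiredSep.test("D:\\\\folder\\test.xml"));   // false, two backslashes in a row
```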

2 Comments

@ImranTariq Rewrote regex using JavaScript style case-insensitive flag.
Is your input one single file path or a bunch of them each on a new line?
