Checking whether a string contains some characters in Python

Question

I want to check whether a string contains only A-Z, a-z, 0-9, underscore, and dash (_ -).

It should not contain any other special characters, such as !"#\%.

How can I write the regular expression, and should I use match or something else?

My strings look like these: QOIWU_W QWLJ2-1

Some programmer dude · Accepted Answer · 2011-12-07 15:12:00Z

9

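Yes, re.match seems like a good match (pardon the pun). As for the regular expression, how about something like this: '[A-Za-z0-9-_]*$'? The trailing $ matters: re.match anchors only at the start of the string, so without it the pattern would succeed on any input by matching an empty prefix.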
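A minimal sketch of this check, assuming Python 3.4 or later, where re.fullmatch is available. The helper name is_valid and the failing sample string are illustrative and not from the original answer. The character class is the same set as above, with the hyphen moved to the end so it cannot be read as part of a range.

```python
import re

# Hypothetical helper; same allowed characters as the class suggested above.
ALLOWED = re.compile(r'[A-Za-z0-9_-]*')

def is_valid(s):
    """Return True if s consists only of A-Z, a-z, 0-9, '_' and '-'."""
    # fullmatch requires the whole string to match, so no ^ or $ is needed.
    return ALLOWED.fullmatch(s) is not None

print(is_valid('QOIWU_W'))   # True
print(is_valid('QWLJ2-1'))   # True
print(is_valid('bad!char'))  # False
```

Because of the *, the empty string also counts as valid here; replacing * with + would reject it.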

answered Dec 7, 2011 at 15:12


1 Comment

pcalcao Over a year ago

You can also use [\w-] instead of [A-Za-z0-9-_]
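One caveat worth checking before adopting \w: on Python 3, \w in a str pattern is Unicode-aware by default, so it accepts more than A-Z, a-z, 0-9 and underscore. The sketch below, which assumes Python 3.4+ and uses an illustrative input string, shows the difference the re.ASCII flag makes.

```python
import re

word = 'café'  # illustrative input containing a non-ASCII letter

# Default for str patterns in Python 3: \w is Unicode-aware, so 'é' is accepted.
unicode_result = re.fullmatch(r'[\w-]*', word) is not None

# With re.ASCII, \w is restricted to [a-zA-Z0-9_], so 'é' is rejected.
ascii_result = re.fullmatch(r'[\w-]*', word, flags=re.ASCII) is not None

print(unicode_result)  # True
print(ascii_result)    # False
```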

mac · Accepted Answer · 2011-12-07 15:44:16Z

9

Using re does no harm, but out of curiosity, here is another approach that doesn't go through re at all and uses sets instead:

>>> valid = set('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_ ')
>>> def test(s):
...    return set(s).issubset(valid)
... 
>>> test('ThiS iS 4n example_sentence that should-pass')
True
>>> test('ThiS iS 4n example_sentence that should fail!!')
False

For conciseness, the testing function could also be written:

>>> def test(s):
...    return set(s) <= valid

EDIT: A bit of timing for the sake of curiosity (times are in seconds; each test implementation is timed over three repeats). T is presumably timeit.Timer; the session does not show how T, s, or y were defined:

>>> T(lambda : re.match(r'^[a-zA-Z0-9-_]*$', s)).repeat()
[1.8856699466705322, 1.8666279315948486, 1.8670001029968262]
>>> T(lambda : set(y) <= valid).repeat()
[3.595816135406494, 3.568570852279663, 3.564558982849121]
>>> T(lambda : all([c in valid for c in y])).repeat()
[6.224508047103882, 6.2116711139678955, 6.209425926208496]
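For anyone who wants to rerun a comparison like this, here is a rough sketch of a harness. It assumes T above stands for timeit.Timer; the sample string, the candidates dictionary, and the iteration count are my own choices, so the numbers it prints will not match the ones quoted.

```python
import re
import string
import timeit

# Assumed setup: the original session does not show these definitions.
valid = set(string.ascii_letters + string.digits + '-_ ')
sample = 'ThiS iS 4n example_sentence that should-pass'
pattern = re.compile(r'^[a-zA-Z0-9 _-]*$')

candidates = {
    'regex': lambda: pattern.match(sample),
    'set': lambda: set(sample) <= valid,
    'all': lambda: all(c in valid for c in sample),
}

results = {}
for name, func in candidates.items():
    # Three repeats of 10,000 calls each; the minimum is usually the least noisy.
    results[name] = timeit.Timer(func).repeat(repeat=3, number=10000)
    print(name, min(results[name]))
```

Absolute timings depend heavily on the Python version, the machine, and the length of the input, so treat any ranking as specific to the setup it was measured on.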

edited Dec 7, 2011 at 15:44

answered Dec 7, 2011 at 15:23


2 Comments

Michael J. Barber Over a year ago

You don't need the list calls to get the sets of characters.

mac Over a year ago

@MichaelJ.Barber - Thank you, fixed (and it took off 1 sec from the timings...)

Oliver · Accepted Answer · 2011-12-07 15:10:40Z

1

You can use the regular expression module.

import re
if re.match(r'^[a-zA-Z0-9-_]*$', testString):
    pass  # successful match

answered Dec 7, 2011 at 15:10


3 Comments

Some programmer dude Over a year ago

What kind of Python has that syntax?

manxing Over a year ago

@Oliver thank you, but I guess, ^ and $ are required in PHP, not in python.

Oliver Over a year ago

@manxing Not so. ^ and $ mark the start and end of the string.
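To make the role of the anchors concrete, here is a small sketch with illustrative inputs. With re.match the leading ^ is redundant, since match always starts at the beginning of the string, but the end anchor changes the result. It also shows that $ tolerates a single trailing newline, whereas \Z does not.

```python
import re

bad = 'abc!def'  # illustrative input with one disallowed character

# Without an end anchor, re.match is satisfied by the valid prefix 'abc'.
prefix = re.match(r'[a-zA-Z0-9_-]*', bad)
print(prefix.group())                        # abc

# With $, the whole string has to be made of allowed characters.
anchored = re.match(r'[a-zA-Z0-9_-]*$', bad)
print(anchored)                              # None

# $ also matches just before a trailing newline; \Z matches only at the very end.
dollar_newline = re.match(r'[a-zA-Z0-9_-]*$', 'abc\n') is not None
z_newline = re.match(r'[a-zA-Z0-9_-]*\Z', 'abc\n') is not None
print(dollar_newline)                        # True
print(z_newline)                             # False
```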

Fredrik Pihl · Accepted Answer · 2011-12-07 15:29:34Z

0

No need to go regexp.

import string

# Build a string containing all valid characters.
match = string.ascii_letters + string.digits + '_' + '-' + ' '
print(repr(match))
# 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_- '

test = 'QOIWU_W QWLJ2-'
print(all(c in match for c in test))   # True

test2 = 'abc ;'
print(all(c in match for c in test2))  # False

answered Dec 7, 2011 at 15:29


1 Comment

Fredrik Pihl Over a year ago

The time for in on a string is linear in the length of the string being searched, so it wasn't a major surprise. Thanks for the benchmark though!
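The linear lookup cost mentioned here can be sidestepped by storing the allowed characters in a set, where membership tests take constant time on average. A minimal sketch, with the names ALLOWED and only_allowed chosen for illustration:

```python
import string

# frozenset gives average O(1) membership tests, unlike `c in some_string`.
ALLOWED = frozenset(string.ascii_letters + string.digits + '_- ')

def only_allowed(s):
    """Return True if every character of s is in ALLOWED."""
    # The generator lets all() stop at the first disallowed character.
    return all(c in ALLOWED for c in s)

print(only_allowed('QOIWU_W QWLJ2-'))  # True
print(only_allowed('abc ;'))           # False
```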

U-DON · Accepted Answer · 2011-12-07 15:42:59Z

-1

import re
re.search('[^a-zA-Z0-9-_]+', your_string) is None

re.search() returns a match object if it finds one or more characters outside the allowed set (letters, digits, dash, and underscore) and None otherwise. So the expression above is True exactly when the string is safe.
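A side benefit of searching for the disallowed characters is that the match object tells you which character caused the failure. A small sketch along those lines, with a hypothetical helper name and illustrative inputs:

```python
import re

# Negated class: matches any single character that is NOT allowed.
DISALLOWED = re.compile(r'[^a-zA-Z0-9_-]')

def first_bad_char(s):
    """Return the first disallowed character in s, or None if s is clean."""
    m = DISALLOWED.search(s)
    return m.group() if m else None

print(first_bad_char('QWLJ2-1'))  # None
print(first_bad_char('QOIWU!W'))  # !
```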

edited Dec 7, 2011 at 15:42

answered Dec 7, 2011 at 15:14

