18

I'm trying to split a string using a regular expression.

Friday 1Friday 11 JAN 11

The output I want to achieve is

['Friday 1', 'Friday 11', ' JAN 11']

My snippet so far is not producing the desired results:

>>> import re
>>> p = re.compile(r'(Sunday|Monday|Tuesday|Wednesday|Thursday|Friday|Saturday)\s*\d{1,2}')
>>> list(filter(None, p.split('Friday 1Friday 11 JAN 11')))
['Friday', 'Friday', ' JAN 11']

What am I doing wrong with my regex?

0

3 Answers

23

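The problem is the capturing parentheses. When the pattern contains a capturing group, re.split also returns the text matched by that group, and here the group covers only the day name. The syntax (?:...) makes a group non-capturing, so make the alternation non-capturing and put a single capturing group around the whole pattern. Try: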

p = re.compile(r'((?:Friday|Saturday)\s*\d{1,2})')
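To see the difference side by side, here is a minimal sketch that runs both patterns over the string from the question. The names `days`, `inner` and `outer` are only illustrative, and the full list of day names is taken from the question rather than from the shortened pattern above.

```python
import re

text = 'Friday 1Friday 11 JAN 11'
days = r'Sunday|Monday|Tuesday|Wednesday|Thursday|Friday|Saturday'

# Capturing group around the day name only:
# split() returns just the captured day name for each match.
inner = re.compile(r'(' + days + r')\s*\d{1,2}')

# Non-capturing alternation inside one outer capturing group:
# split() returns the whole match (day name plus number).
outer = re.compile(r'((?:' + days + r')\s*\d{1,2})')

# Drop the empty strings that split() produces between adjacent matches.
inner_parts = [s for s in inner.split(text) if s]
outer_parts = [s for s in outer.split(text) if s]

print(inner_parts)  # ['Friday', 'Friday', ' JAN 11']
print(outer_parts)  # ['Friday 1', 'Friday 11', ' JAN 11']
```

The list comprehension does the same job as filter(None, ...) in the question and returns a list on both Python 2 and Python 3.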

2 Comments

That's exactly what I was after! I knew it was something small. Thanks
I was getting close with p = re.compile(r'((Friday|Saturday)\s*\d{1,2})') but didn't understand why I was getting 2 results for each group. Makes complete sense now though, it was producing the result + the group name back reference.
6

You can also use the re.findall function.

>>> import re
>>> val = 'Friday 1Friday 11 JAN 11 '
>>> pat = re.compile(r'(\w+\s*\d*)')
>>> m = re.findall(pat, val)
>>> m
['Friday 1', 'Friday 11', 'JAN 11']
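One thing to keep in mind is that \w+ matches any word, not only day names, and that findall returns only the matches, so the leading space of ' JAN 11' is not kept. The sketch below compares the generic pattern with a day-restricted variant. The day-restricted pattern is an assumption added for illustration and is not part of this answer.

```python
import re

val = 'Friday 1Friday 11 JAN 11 '

# Generic pattern from the answer: any word, optional whitespace, optional digits.
generic = re.findall(r'\w+\s*\d*', val)

# Day-restricted variant (hypothetical, for comparison): only the listed
# day names match, so the trailing 'JAN 11' is not returned at all.
days_only = re.findall(r'(?:Friday|Saturday)\s*\d{1,2}', val)

print(generic)    # ['Friday 1', 'Friday 11', 'JAN 11']
print(days_only)  # ['Friday 1', 'Friday 11']
```

If the remainder of the string is needed exactly as it appears, including its leading space, re.split with a capturing group is probably the better fit.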


0
p = re.compile(r'(Friday\s\d+|Saturday)')
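As a rough check of this pattern, the sketch below splits the string from the question and a second, made-up input ('Saturday 5', not from the question). The second alternative has no digit part, so for Saturday only the day name is captured.

```python
import re

p = re.compile(r'(Friday\s\d+|Saturday)')

# The question's input: each 'Friday <number>' is captured as a whole.
parts = [s for s in p.split('Friday 1Friday 11 JAN 11') if s]
print(parts)  # ['Friday 1', 'Friday 11', ' JAN 11']

# Hypothetical input: 'Saturday' matches without its number,
# so the number is left over in the next piece.
sat = [s for s in p.split('Saturday 5') if s]
print(sat)  # ['Saturday', ' 5']
```

Note also that \s requires exactly one whitespace character and \d+ accepts any number of digits, which is looser than the \s*\d{1,2} used in the question.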
