Extracting number from string in Python with regex

Question

I want to extract and print a variable number '-34.99' from the string:

myString = "Test1 [cm]:     -35.00/-34.99/-34.00/0.09"

The values in the string will change. How can I do this with a regular expression in Python?

Thanks in advance

Do you always want the second number in the /-separated list?
– João Silva, Commented Aug 26, 2012 at 14:08
If the string always looks like that, regex is overkill: myString.split()[-1].split("/")[1]
– DSM, Commented Aug 26, 2012 at 14:32
DSM, in my code, myString is a large number of rows. I already use regex and re.compile to find other strings. Thanks for your useful answer.
– dmaster, Commented Aug 26, 2012 at 14:57

Ωmega · Accepted Answer · 2012-08-26 14:33:12Z

14

A non-regex solution is:

myString = "Test1 [cm]:     -35.00/-34.99/-34.00/0.09"
# Split on "/" and take the second field (index 1)
print(myString.split("/")[1])

One regex solution would be:

import re
myString = "Test1 [cm]:     -35.00/-34.99/-34.00/0.09"
# search() returns the first match, i.e. the first number preceded by "/"
print(re.search(r'(?<=\/)[+-]?\d+(?:\.\d+)?', myString).group())

Explanation:

(?<=\/)[+-]?\d+(?:\.\d+)?
└──┬──┘└─┬─┘└┬┘└───┬────┘
   │     │   │     │
   │     │   │     └ optional period with one or more trailing digits
   │     │   │
   │     │   └ one or more digits
   │     │
   │     └ optional + or -
   │
   └ leading slash before match
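
Since the asker mentions using re.compile over many rows, here is a minimal sketch of how the pattern above might be compiled once and reused. The sample rows and the helper name second_value are assumptions made for illustration; they do not come from the question.

```python
import re

# Same pattern as above, compiled once for reuse across many rows.
# "/" needs no escaping in Python, so (?<=/) is equivalent to (?<=\/).
NUMBER_AFTER_SLASH = re.compile(r'(?<=/)[+-]?\d+(?:\.\d+)?')

# Hypothetical sample rows, modelled on the string in the question.
rows = [
    "Test1 [cm]:     -35.00/-34.99/-34.00/0.09",
    "Test2 [cm]:     12.50/13.75/14.00/0.10",
]

def second_value(row):
    """Return the first number preceded by '/', or None if there is none."""
    match = NUMBER_AFTER_SLASH.search(row)
    return match.group() if match else None

results = [second_value(row) for row in rows]
print(results)  # ['-34.99', '13.75']
```

Checking for None before calling .group() avoids an AttributeError on rows that contain no slash-separated number.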

edited Aug 26, 2012 at 14:33

answered Aug 26, 2012 at 14:28

7 Comments

dmaster Over a year ago

Great! It works! Thanks to all! Omega, your answer is the best!

jon Over a year ago

You might consider .split('/')[-3]. Taking the third-from-last field would protect against descriptions containing a "/" character.

user648852 Over a year ago

Your regex only works for fields beyond the first, i.e., -35.00 is not matched since there is no leading '/'.

Ωmega Over a year ago

@Jon - I believe the slash is a separator, so there should be no such character in the description.

user648852 Over a year ago

Chill, mate. I pointed this out for the benefit of the OP (or others reading this answer) in case he wanted to change the field he was looking for to the first.

the wolf · Answer · 2012-08-26 15:01:50Z

1

For something like this, re.findall works great:

>>> import re
>>> myString = "Test1 [cm]:     -35.00/-34.99/-34.00/0.09"
>>> re.findall(r'([+-]?\d+\.\d+)',myString)
['-35.00', '-34.99', '-34.00', '0.09']

You can get the floats directly with a list comprehension:

>>> [float(f) for f in re.findall(r'([+-]?\d+\.\d+)',myString)]
[-35.0, -34.99, -34.0, 0.09]

Or the second one like this:

>>> re.findall(r'([+-]?\d+\.\d+)',myString)[1]
'-34.99'

The question is how wide a range of textual floating-point formats you need to accept. Numbers with no decimal point? Exponents?

>>> myString = "Test1 [cm]:     -35.00/-34.99/-34.00/0.09/5/1.0e6/1e-6"

Ouch! -- this is getting harder with a regex.
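
To make the difficulty concrete, here is a rough sketch of a pattern that would cover those forms. The pattern and the name NUMBER are my own assumptions, not something from the question, and it is only checked against the sample string above.

```python
import re

myString = "Test1 [cm]:     -35.00/-34.99/-34.00/0.09/5/1.0e6/1e-6"

# Assumed pattern, for illustration only:
#   (?<![\w.])            not preceded by a word character or "." (skips the 1 in "Test1")
#   [+-]?                 optional sign
#   (?:\d+\.?\d*|\.\d+)   digits with an optional fraction, or a bare fraction
#   (?:[eE][+-]?\d+)?     optional exponent
NUMBER = re.compile(r'(?<![\w.])[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?')

values = NUMBER.findall(myString)
print(values)
print([float(v) for v in values])
```

The lookbehind is needed because, once the decimal point becomes optional, the digit in the label "Test1" would otherwise count as a number.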

You actually may be better off just using Python's string ops:

>>> ''.join([s for s in myString.split() if '/' in s]).split('/')
['-35.00', '-34.99', '-34.00', '0.09', '5', '1.0e6', '1e-6']

You can get the nth one the same way (n is zero-based):

>>> n=2
>>> ''.join([s for s in myString.split() if '/' in s]).split('/')[n]
'-34.00'

Then all the weird cases work without a harder regex:

>>> list(map(float,''.join([s for s in myString.split() if '/' in s]).split('/')))
[-35.0, -34.99, -34.0, 0.09, 5.0, 1000000.0, 1e-06]

edited Aug 26, 2012 at 15:01

answered Aug 26, 2012 at 14:37

3 Comments

DSM Over a year ago

I was a little worried about the units, though: if "[cm]" were "[cm/s]", then the first few terms would be weird.

user648852 Over a year ago

@DSM: yes, I guess you could validate the split pieces as having digits or use a try block. The OP only seemed concerned about the second match since the chosen regex only works for matches beyond the first field.

DSM Over a year ago

That's why I went with myString.split()[-1].split("/")[1], although it will fail on other cases, too. Maybe map(float, myString[myString.find(':')+1:].split("/"))? With only one example to go from, it's hard to know how general to be, or what invariants we can rely on. I do like the idea of using the separator instead of writing a complicated regex to handle all the cases, though.
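
The two ideas in these comments (slicing after the colon and guarding the conversion with a try block) could be combined roughly as follows. This is a sketch under the assumption that the values always follow the first ':' on the line; the function name parse_values is hypothetical.

```python
def parse_values(line):
    """Return the '/'-separated numbers after the first ':' as floats."""
    # Keep only the text after the first ':' so that units such as
    # "[cm/s]" in the label are not split on "/".
    _, _, tail = line.partition(':')
    values = []
    for piece in tail.split('/'):
        try:
            values.append(float(piece))  # float() ignores surrounding whitespace
        except ValueError:
            continue  # skip pieces that are not numbers
    return values

line = "Test1 [cm/s]:     -35.00/-34.99/-34.00/0.09/5/1.0e6/1e-6"
print(parse_values(line))
print(parse_values(line)[1])  # -34.99
```

Note that silently skipping a bad piece shifts the positions of the fields after it, so raising an error instead may be the safer choice when the field index matters.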
