I want to extract and print a number that varies, such as '-34.99', from the string:

myString = "Test1 [cm]:     -35.00/-34.99/-34.00/0.09"

The values in the string will change. How can I do this with a regular expression in Python?

Thanks in advance

  • You always want the second number in the / separated list? Commented Aug 26, 2012 at 14:08
  • If the string always looks like that, regex is overkill: myString.split()[-1].split("/")[1] Commented Aug 26, 2012 at 14:32
  • DSM, in my code myString spans a large number of rows. I use regex and re.compile to find other strings. Thanks for your useful answer. Commented Aug 26, 2012 at 14:57

2 Answers

A non-regex solution is:

myString = "Test1 [cm]:     -35.00/-34.99/-34.00/0.09"
print(myString.split("/")[1])

One regex solution would be:

import re
myString = "Test1 [cm]:     -35.00/-34.99/-34.00/0.09"
print(re.search(r'(?<=\/)[+-]?\d+(?:\.\d+)?', myString).group())

Explanation:

(?<=\/)[+-]?\d+(?:\.\d+)?
└──┬──┘└─┬─┘└┬┘└───┬────┘
   │     │   │     │
   │     │   │     └ optional period with one or more trailing digits
   │     │   │
   │     │   └ one or more digits
   │     │
   │     └ optional + or -
   │
   └ leading slash before match 
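
Because the lookbehind requires a leading slash, the pattern above can never match the first field (-35.00). Below is a minimal sketch of one way around that. It assumes the values always come after the first colon and are separated by slashes. The names NUMBER_RE and nth_field are made up for illustration and are not part of the original answer.

```python
import re

# Same number pattern as above, minus the lookbehind (hypothetical name).
NUMBER_RE = re.compile(r'[+-]?\d+(?:\.\d+)?')

def nth_field(line, n):
    """Return the n-th (0-based) number after the first colon, as a string."""
    # Drop the label part so the digit in "Test1" is not picked up.
    values = line.partition(':')[2]
    return NUMBER_RE.findall(values)[n]

myString = "Test1 [cm]:     -35.00/-34.99/-34.00/0.09"
print(nth_field(myString, 0))  # first field
print(nth_field(myString, 1))  # second field, the one asked about
```

Compiling the pattern once fits the case where many rows are scanned with re.compile, as mentioned in the question's comments.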

7 Comments

Great! It works! Thanks to all!. Omega, your answer is the best!
You might consider .split('/')[-3]. Taking the third-from-last field would protect against descriptions containing a "/" character.
Your regex only works for fields beyond the first, i.e., -35.00 is not matched since there is no leading '/'.
@Jon - I believe the slash is a separator, so there should be no such character in the description.
Chill mate. I pointed this out for the benefit of the OP (or others reading this answer) in case he wanted to change the field he was looking for to the first.

For something like this, re.findall works great:

>>> import re
>>> myString = "Test1 [cm]:     -35.00/-34.99/-34.00/0.09"
>>> re.findall(r'([+-]?\d+\.\d+)',myString)
['-35.00', '-34.99', '-34.00', '0.09']

You can get the floats directly with a list comprehension:

>>> [float(f) for f in re.findall(r'([+-]?\d+\.\d+)',myString)]
[-35.0, -34.99, -34.0, 0.09]

Or the second one like this:

>>> re.findall(r'([+-]?\d+\.\d+)',myString)[1]
'-34.99'

The question is how wide a range of textual floating-point formats you will accept. Some with no decimal point? Exponents?

>>> myString = "Test1 [cm]:     -35.00/-34.99/-34.00/0.09/5/1.0e6/1e-6"  

Ouch! -- this is getting harder with a regex.
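
For comparison, a pattern that covers those extra forms might look roughly like the sketch below. It is only an illustration: FLOAT_RE is a made-up name, the pattern is not claimed to cover every string float() accepts, and the label is cut off at the first colon so that the digit in "Test1" is not matched.

```python
import re

# Optional sign, then digits with an optional fraction (or a bare fraction),
# then an optional exponent. Hypothetical name, illustrative pattern.
FLOAT_RE = re.compile(r'[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?')

myString = "Test1 [cm]:     -35.00/-34.99/-34.00/0.09/5/1.0e6/1e-6"
values = myString.partition(':')[2]  # assumes the values follow the first colon
fields = FLOAT_RE.findall(values)
print(fields)
```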

You actually may be better off just using Python's string ops:

>>> ''.join([s for s in myString.split() if '/' in s]).split('/')
['-35.00', '-34.99', '-34.00', '0.09', '5', '1.0e6', '1e-6']

You can get the nth one the same way:

>>> n=2
>>> ''.join([s for s in myString.split() if '/' in s]).split('/')[n]
'-34.00'

Then all the weird cases work without a harder regex:

>>> [float(s) for s in ''.join([s for s in myString.split() if '/' in s]).split('/')]
[-35.0, -34.99, -34.0, 0.09, 5.0, 1000000.0, 1e-06]
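
One case the split above does not handle is a label that itself contains a slash, such as a unit written "[cm/s]". A possible sketch, assuming the values always follow the first colon, is to split only that part and let float() decide what counts as a number. The function name parse_fields is made up for illustration.

```python
def parse_fields(line):
    """Split the text after the first colon on '/' and keep what float() accepts."""
    values = line.partition(':')[2]  # empty string if there is no colon
    numbers = []
    for piece in values.split('/'):
        try:
            numbers.append(float(piece))  # float() ignores surrounding whitespace
        except ValueError:
            pass  # not a number: skip it
    return numbers

print(parse_fields("Test1 [cm/s]:     -35.00/-34.99/-34.00/0.09/5/1.0e6/1e-6"))
```

Silently skipping bad pieces shifts the positions of the remaining values, so if the field index matters it may be safer to raise an error instead.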

3 Comments

I was a little worried about the units, though: if "[cm]" were "[cm/s]", then the first few terms would be weird.
@DSM: yes, I guess you could validate the split pieces as having digits, or use a try block. The OP only seemed concerned about the second match, since the chosen regex only works for matches beyond the first field.
That's why I went with myString.split()[-1].split("/")[1], although it will fail on other cases, too. Maybe map(float, myString[myString.find(':')+1:].split("/"))? With only one example to go from, it's hard to know how general to be, or what invariants we can rely on. I do like the idea of using the separator instead of writing a complicated regex to handle all the cases, though.
