1

I have data containing strings like 'ms2p5', 'ms3', 'ms10', from which I need to extract the digits and convert them to numbers as follows.

'ms2p5' => 2.5
'ms3' => 3
'ms10' => 10

I tried the regex below, and it finds the match. The problem is values with a character in the middle of the matched string, like '2p5'. What is the right way to write a generic function that handles all these cases and converts them to numeric values?

import re
s = 'ms2p5'  # renamed from str to avoid shadowing the built-in
re.search(r'\d+[p]*\d*', s).group()  # '2p5'
2
  • What's wrong with the regex you currently have? Commented Jan 21, 2020 at 3:58
  • 3
    What about ".".join(re.findall(r'\d+', string))? Commented Jan 21, 2020 at 4:02

3 Answers

2

Use str.join with re.findall:

import re

los = ['ms2p5', 'ms3', 'ms10']
# Raw string so that \d is not treated as an invalid string escape
print([float('.'.join(re.findall(r'\d+', i))) for i in los])

Output:

[2.5, 3.0, 10.0]
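
One caveat: joining every digit run with '.' assumes each string has one or two runs of digits. A string with three runs would produce something like '1.2.3', and one with no digits would produce an empty string, and float() rejects both. A minimal sketch of a guarded variant, where to_number is a hypothetical helper name and not part of the original answer:

```python
import re

def to_number(s):
    # Collect every run of digits, e.g. 'ms2p5' -> ['2', '5'].
    parts = re.findall(r'\d+', s)
    # One run is an integer part only; two runs are integer and fraction.
    # Any other count cannot be read as a single number.
    if len(parts) not in (1, 2):
        raise ValueError(f'cannot parse a number from {s!r}')
    return float('.'.join(parts))

print([to_number(s) for s in ['ms2p5', 'ms3', 'ms10']])
```

Whether to raise or return a default for malformed input depends on your data; raising is just the choice made here.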

2

You could write an extraction function that searches for a numeric value (with or without a p for the decimal point), replaces the p with a . and then converts the result to float. For example:

import re

def extract_num(s):
    return float(re.search(r'\d+p?\d*', s).group().replace('p', '.'))

strs = ['ms2p5', 'ms3', 'ms10']
print([extract_num(s) for s in strs])

Output:

[2.5, 3.0, 10.0]
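
Note that re.search returns None when a string contains no digits, so extract_num would raise an AttributeError on such input. If that can happen in your data, one possible variant is sketched below; the name extract_num_or_none and the choice to return None are assumptions for illustration only:

```python
import re

# Digits, optionally followed by 'p' and more digits: '2p5', '3', '10'.
NUM_RE = re.compile(r'(\d+)(?:p(\d+))?')

def extract_num_or_none(s):
    m = NUM_RE.search(s)
    if m is None:
        # No digits anywhere in the string.
        return None
    whole, frac = m.groups()
    # frac is None when there is no 'p<digits>' part.
    return float(f'{whole}.{frac}') if frac else float(whole)

print([extract_num_or_none(s) for s in ['ms2p5', 'ms3', 'ms10', 'ms']])
```

Using capture groups instead of replace also means a trailing p with no digits after it (as in 'ms3p') is simply ignored.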


1

If the strings all follow the examples you provide, I'd probably just do:

x = 'ms2p5'
float(x[2:].replace('p', '.'))
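
The slice x[2:] silently assumes a two-character prefix. If you want that assumption checked, a small sketch could look like this, where strip_prefix_num and its prefix parameter are hypothetical names:

```python
def strip_prefix_num(s, prefix='ms'):
    # Fail loudly if the string does not start with the expected prefix.
    if not s.startswith(prefix):
        raise ValueError(f'{s!r} does not start with {prefix!r}')
    # Drop the prefix, then treat 'p' as the decimal point.
    return float(s[len(prefix):].replace('p', '.'))

print([strip_prefix_num(s) for s in ['ms2p5', 'ms3', 'ms10']])
```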
