I want to build a regex that captures every occurrence in a string where an integer or a floating-point number appears before a unit of measurement (ml, mg, kg, etc.). My current regex only considers integers and breaks when there's a space between the number and the unit. I want to handle both cases in my code.
import re
p = re.compile('[0-9](?:mg|kg|ml|q.s.|ui|M|g|µg)')
x = '0.9mg is the approximate dosage'
z = p.findall(x)
print(z)
This doesn't work for decimals and also breaks when there's a space between the number and the unit.
Expected patterns to be captured are:
Examples: 0.9 mg, 9 mg, 9mg, 0.9mg
Any help with this would be appreciated.
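One possible approach, offered as a sketch rather than a definitive answer: allow an optional decimal part after the digits and optional whitespace before the unit. The name `DOSAGE_RE` and the trailing `(?!\w)` check are assumptions added for illustration; the unit list is the one from the question, with the dots in `q.s.` escaped so they match literal periods.

```python
import re

# Hypothetical pattern name; the units are taken from the question.
DOSAGE_RE = re.compile(
    r"\d+(?:\.\d+)?"                  # integer, optionally followed by a decimal part
    r"\s*"                            # optional whitespace between number and unit
    r"(?:mg|kg|ml|q\.s\.|ui|M|g|µg)"  # unit alternatives
    r"(?!\w)"                         # the unit must not run into another word character
)

samples = "0.9 mg, 9 mg, 9mg, 0.9mg"
matches = DOSAGE_RE.findall(samples)
print(matches)  # ['0.9 mg', '9 mg', '9mg', '0.9mg']
```

The negative lookahead is used instead of `\b` because `q.s.` ends in a period, and `\b` would not behave as expected after a non-word character.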
Using the regex in the code:
mg = []
newregex = r"[0-9\.\s]+(?:mg|kg|ml|q\.s\.|ui|M|g|µg)"
for s in zz:
    for e in extracteddata:
        # assumes extracteddata is an iterable of strings; search each item e
        v = re.search(newregex, e, flags=re.IGNORECASE | re.MULTILINE)
        if v:
            mg.append(v.group(0))
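A sketch of how a loop like this might collect every match rather than only the first one per string. It assumes `extracteddata` is a list of strings (the question does not show its definition), so the sample list and the name `unit_regex` are made up for illustration. A character class such as `[0-9\.\s]+` can also swallow leading whitespace (for example, it returns `' 0.9 mg'` from `'take 0.9 mg'`), which is why this version uses an explicit number pattern.

```python
import re

# Assumed sample input; the real extracteddata is not shown in the question.
extracteddata = [
    "0.9mg is the approximate dosage",
    "Take 9 mg in the morning and 0.5 ml at night",
    "No dosage mentioned here",
]

# Hypothetical name; compiled once and reused for every string.
unit_regex = re.compile(r"\d+(?:\.\d+)?\s*(?:mg|kg|ml|q\.s\.|ui|M|g|µg)(?!\w)")

mg = []
for e in extracteddata:
    # findall returns every non-overlapping match in the string
    mg.extend(unit_regex.findall(e))

print(mg)  # ['0.9mg', '9 mg', '0.5 ml']
```

`re.IGNORECASE` is left out here so that `M` stays distinct from `m`; add it back if uppercase spellings such as `MG` should also match.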