I am trying to extract a particular "float" from a string that contains multiple formatted "integers", "floats" and dates. The "float" in question is preceded by some standardized text.
String sample
my_string = """03/14/2019 07:07 AM
💵Soles in mDm : 2864.35⬇
🔶BTC purchase in mdm: 11,202,782.0⬇
"""
I have been able to extract the desired float, 2864.35, from my_string, but if this float changes in format, or another float with the same format shows up, my script won't return the desired result:
import re

# Matches any run of digits, a dot, then more digits
regex = r"(\d+\.\d+)"
matches = re.findall(regex, my_string)
for match in matches:
    print(match)
- It might truncate the desired float because of inconsistent numerical formatting
- It might print two floats because the numerical pattern of an undesired float is too similar to be filtered out by the current regular expression
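To make the two failure modes above concrete, here is a minimal sketch that runs the current pattern against the sample string and against one comma-formatted variant. The variable names `naive`, `all_matches` and `truncated` are only illustrative.

```python
import re

my_string = """03/14/2019 07:07 AM
💵Soles in mDm : 2864.35⬇
🔶BTC purchase in mdm: 11,202,782.0⬇
"""

# The current pattern: digits, a dot, digits (no awareness of commas or context)
naive = r"(\d+\.\d+)"

# Two matches come back instead of one, and the BTC value is cut down
# to the digits after its last comma.
all_matches = re.findall(naive, my_string)
print(all_matches)

# When the Soles value carries a thousands separator, its integer part
# is truncated at the comma.
truncated = re.findall(naive, "soles MDM: 2,864.35")
print(truncated)
```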
Desired return from regular expression regex
- float with a flexible integer part: sometimes the comma is omitted, e.g. 45000.50, other times it is present, e.g. 45,000.50
- unique line identifier: Soles (it could be upper or lower case)
- float prefix: :
- it should only return one float
Some variations of the desired float (second line of the string only)
What you see below are three examples of the same line, the second line in my_string. The regex should return the float from that line only, despite variations such as soles or Soles
- 💵Soles in mDm : 2864.35⬇
- soles MDM: 2,864.35
- Soles in mdm :2,864.355
Any assistance in editing or rewriting the current regular expression regex is greatly appreciated.
Try `soles.*?(\d[\d,]*\.\d+)` with the `re.I` flag.
Is the `:` preceded by a space, e.g. ` : `, or is it just `:`?
`[Ss]oles.*?(\d[\d,]*\.\d+)` or `(?i)soles.*?(\d[\d,]*\.\d+)`
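As a rough sketch of how the suggested pattern could be applied, the snippet below compiles it with `re.IGNORECASE` (the long form of `re.I`) and uses `re.search` so that at most one float is returned. The helper name `extract_soles` and the comma-stripping conversion to `float` are assumptions added for illustration, not part of the suggestion itself.

```python
import re

# "soles" in any case, a lazy skip to the first number, then a number
# that may contain commas in its integer part and must have a decimal part.
SOLES_RE = re.compile(r"soles.*?(\d[\d,]*\.\d+)", re.IGNORECASE)

def extract_soles(text):
    """Return the Soles amount as a float, or None if no Soles line matches."""
    m = SOLES_RE.search(text)  # search stops at the first match only
    if m is None:
        return None
    # Drop thousands separators before converting
    return float(m.group(1).replace(",", ""))

my_string = """03/14/2019 07:07 AM
💵Soles in mDm : 2864.35⬇
🔶BTC purchase in mdm: 11,202,782.0⬇
"""

samples = [
    "💵Soles in mDm : 2864.35⬇",
    "soles MDM: 2,864.35",
    "Soles in mdm :2,864.355",
]

results = [extract_soles(s) for s in samples]
print(results)
print(extract_soles(my_string))
```

Because `.` does not match a newline by default, the lazy `.*?` cannot run past the end of the Soles line, so the BTC value on the following line is never picked up.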