I am new to Python and have been using regex for matching. I have run into a small issue: given the string "vans>=20.09 and stands == 'four'", I want to extract the values that follow the comparison operators. My pattern extracts integers and strings correctly, but it truncates float values at the decimal point. What pattern would extract all kinds of values (int, float, string)?

My code:

import re
# 'text' is used instead of 'str' so the built-in str type is not shadowed
text = "vans>=20.09 and stands == 'four'"
rx = re.compile(r"""(?P<key>\w+)\s*[<>=]+\s*'?(?P<value>\w+)'?""")
result = {m.group('key'): m.group('value') for m in rx.finditer(text)}
print(result)

which gives:

{'vans': '20', 'stands': 'four'}

Expected Output:

{'vans': '20.09', 'stands': 'four'}

Comments

  • I think you need rx = re.compile(r"""(?P<key>\w+)\s*[<>=]+\s*'?(?P<value>\d*\.?\d+|\w+)'?"""); see the Python demo. Commented Feb 25, 2021 at 10:29
  • And since \d does not include \., you might want to make it [\d.]* or similar if you don't want to match more than one \. Commented Feb 25, 2021 at 10:31
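
As a quick check of the pattern suggested in the first comment, the sketch below runs it against the question's string and against a value with a leading dot. The variable names `text`, `result`, and `leading_dot` are chosen here for illustration only.

```python
import re

# Pattern from the first comment: try a decimal number first, then fall back to \w+.
rx = re.compile(r"""(?P<key>\w+)\s*[<>=]+\s*'?(?P<value>\d*\.?\d+|\w+)'?""")

text = "vans>=20.09 and stands == 'four'"
result = {m.group('key'): m.group('value') for m in rx.finditer(text)}
print(result)  # {'vans': '20.09', 'stands': 'four'}

# Because \d* may match nothing, a value such as .5 is also accepted.
leading_dot = {m.group('key'): m.group('value') for m in rx.finditer("ratio >= .5")}
print(leading_dot)  # {'ratio': '.5'}
```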

2 Answers

You can replace the \w in the value group with a character class that also includes the dot:

rx = re.compile(r"""(?P<key>\w+)\s*[<>=]+\s*'?(?P<value>[\w\.]+)'?""")

This should work for the example, but note that the class is permissive: strings like 12.34.56 will also be matched as a value.
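
To make that trade-off concrete, here is a minimal sketch that runs the pattern above against the question's string and against a value containing several dots. The names `text`, `versioned`, and `loose` are illustrative and not part of the original answer.

```python
import re

# Pattern from the answer above: the value class accepts word characters and dots.
rx = re.compile(r"""(?P<key>\w+)\s*[<>=]+\s*'?(?P<value>[\w\.]+)'?""")

text = "vans>=20.09 and stands == 'four'"
result = {m.group('key'): m.group('value') for m in rx.finditer(text)}
print(result)  # {'vans': '20.09', 'stands': 'four'}

# The same class also accepts values with more than one dot.
versioned = "release == 12.34.56"
loose = {m.group('key'): m.group('value') for m in rx.finditer(versioned)}
print(loose)  # {'release': '12.34.56'}
```

If multi-dot values should be rejected, a stricter number alternative is needed instead of a single character class.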

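There is a problem in identifying the comparison operators as well: [<>=]+ accepts any run of those characters and does not allow !=. The following pattern should cover the common cases. Note that [<>!=]= only matches two-character operators (==, !=, <=, >=), so a bare < or > will not match. There is a further caveat: for numbers with no digit following the decimal point, only the part before the decimal point will be selected.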

rx = re.compile(r"""(?P<key>\w+)\s*[<>!=]=\s*'?(?P<value>(\w|[+-]?\d*(\.\d)?\d*)+)'?""")
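
As an alternative sketch, not taken from either answer, the operators can be listed explicitly and the value split into three named alternatives. The group names `op`, `quoted`, `number`, and `word`, and the helper `parse_conditions`, are hypothetical names introduced here; the sketch assumes values are single-quoted strings, plain decimal numbers, or bare words.

```python
import re

# Hypothetical stricter pattern: explicit operators, then one of three value forms.
rx = re.compile(
    r"""(?P<key>\w+)\s*
        (?P<op>==|!=|<=|>=|<|>)\s*
        (?:'(?P<quoted>[^']*)'              # single-quoted string
          |(?P<number>[+-]?\d+(?:\.\d+)?)   # int or float with optional sign
          |(?P<word>\w+))                   # unquoted word
    """,
    re.VERBOSE,
)


def parse_conditions(text):
    """Map each key to the raw string value found after its operator."""
    found = {}
    for m in rx.finditer(text):
        # Exactly one of the three value groups participates in a match.
        for name in ("quoted", "number", "word"):
            if m.group(name) is not None:
                found[m.group("key")] = m.group(name)
                break
    return found


parsed = parse_conditions("vans>=20.09 and stands == 'four' and n != -3 and k < 7")
print(parsed)  # {'vans': '20.09', 'stands': 'four', 'n': '-3', 'k': '7'}
```

Values are returned as strings, matching the expected output in the question; converting them with int() or float() would be a separate step.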
