0

String:

  1. "Roaming Calls, 1.5 GB/Day 100 SMS/Day"
  2. "Unlimited Loc/STD/Roaming Calls, 1GB/Day"

I want to extract "1.5" and "1" with a regex.

I use r'.*([0-9.]+)(gb|GB| gb| GB)', but for case 1 the group only captures "5".

2
  • Can you post your full code here? Commented Feb 7, 2018 at 5:48
  • .* is greedy: it consumes as much as it can and gives back only what the rest of the pattern strictly needs, which here is a single character for the number. From what I see in your question it is not needed at all. You should skip it and use only (\d+(?:\.\d+)?)\s?(gb|GB). Commented Feb 7, 2018 at 5:56
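
A minimal sketch of the pattern suggested in the comment above, run against both sample strings. The helper name `data_amount` is made up for illustration, and using `re.search` (rather than `re.match`) is an assumption, since the question does not show the call being made.

```python
import re

# Pattern from the comment: digits, optional fractional part, optional space, unit.
PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s?(gb|GB)")

samples = [
    "Roaming Calls, 1.5 GB/Day 100 SMS/Day",
    "Unlimited Loc/STD/Roaming Calls, 1GB/Day",
]

def data_amount(text):
    """Return the number in front of GB/gb as a string, or None if absent."""
    m = PATTERN.search(text)
    return m.group(1) if m else None

results = [data_amount(s) for s in samples]
print(results)  # ['1.5', '1']
```

Because the pattern is not anchored to the start of the string, there is no leading `.*` left to swallow part of the number.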

5 Answers

2

Use a lookahead after the number to match the float that comes right before GB/Day (with or without a leading space, in upper or lower case), for example (?= GB/Day):

[\d.]+(?= GB/Day|GB/Day| gb/day|gb/day)
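
One possible way to use this from Python is sketched below. It is a variant rather than the exact pattern above: `\s?` and `re.IGNORECASE` stand in for the four spelled-out alternatives, which also lets mixed-case spellings such as "Gb/day" through. The name `gb_per_day` is made up for the example.

```python
import re

# Assumed variant: \s? and re.IGNORECASE replace the listed alternatives.
LOOKAHEAD = re.compile(r"[\d.]+(?=\s?GB/Day)", re.IGNORECASE)

def gb_per_day(text):
    """Return the number directly before GB/Day, or None if there is none."""
    m = LOOKAHEAD.search(text)
    return m.group(0) if m else None

print(gb_per_day("Roaming Calls, 1.5 GB/Day 100 SMS/Day"))     # 1.5
print(gb_per_day("Unlimited Loc/STD/Roaming Calls, 1GB/Day"))  # 1
```

Since the unit sits inside a lookahead, it is checked but not consumed, so the whole match is just the number.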



0

Please try this regex. It matches the run of non-space characters before GB:

r'\S+(?=\s*(GB|gb|Gb|gB))'
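
A short sketch of how this pattern behaves in Python, with two things worth knowing before relying on it. The variable names are made up for the example.

```python
import re

pattern = re.compile(r"\S+(?=\s*(GB|gb|Gb|gB))")
text = "Roaming Calls, 1.5 GB/Day 100 SMS/Day"

# The whole match (group 0) is the token in front of the unit.
whole = pattern.search(text).group(0)

# findall returns the capture group when one exists, i.e. the unit, not the number.
via_findall = pattern.findall(text)

# \S+ is not limited to digits, so any word before the unit matches too.
not_a_number = pattern.search("data in GB").group(0)

print(whole)         # 1.5
print(via_findall)   # ['GB']
print(not_a_number)  # in
```

Making the inner group non-capturing, `(?:GB|gb|Gb|gB)`, would let `findall` return the matched tokens instead.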


0

Here is a fix to the immediate problem with your pattern:

import re

text = "Roaming Calls, 1.5 GB/Day 100 SMS/Day"
# Lazy .*? stops as early as possible, so the group captures the whole number
m0 = re.match(r'.*?([0-9.]+)?(gb|GB| gb| GB)', text)
if m0:
    print("match:", m0.group(1))

Just make the .* that appears right before the capture group for the number lazy, i.e. .*?.
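
To see the difference side by side, here is a small comparison of the question's pattern with a greedy and a lazy leading dot. This is only a sketch: the number group is kept mandatory as in the question, and the variable names are made up.

```python
import re

text = "Roaming Calls, 1.5 GB/Day 100 SMS/Day"

# Greedy: .* grabs everything, then backtracks just far enough for one digit.
greedy = re.match(r".*([0-9.]+)(gb|GB| gb| GB)", text)

# Lazy: .*? grows one character at a time, so the number group starts at "1".
lazy = re.match(r".*?([0-9.]+)(gb|GB| gb| GB)", text)

print(greedy.group(1))  # 5
print(lazy.group(1))    # 1.5
```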


0

For both floats and integers, you can try this:

import re
k = "Roaming Calls, 1.5 GB/Day 100 SMS/Day"
print(re.findall(r"[-+]?\d*\.\d+|\d+",k))

If you want to find only float values, use this:

import re
k = "Roaming Calls, 1.5 GB/Day 100 SMS/Day"
print(re.findall(r"[-+]?\d*\.\d+",k))

It returns a list of the float values found in the string, as strings:

['1.5']
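
As a rough illustration of what these two patterns return, including on the second string from the question, consider the sketch below. The variable names are made up, and converting with `float()` is an extra step that the snippets above do not perform.

```python
import re

k = "Roaming Calls, 1.5 GB/Day 100 SMS/Day"
k2 = "Unlimited Loc/STD/Roaming Calls, 1GB/Day"

all_numbers = re.findall(r"[-+]?\d*\.\d+|\d+", k)
floats_only = re.findall(r"[-+]?\d*\.\d+", k)

# findall yields strings; convert explicitly if numeric values are needed.
as_floats = [float(n) for n in floats_only]

# The float-only pattern needs a decimal point, so "1GB" yields nothing.
floats_in_k2 = re.findall(r"[-+]?\d*\.\d+", k2)

print(all_numbers)   # ['1.5', '100']
print(as_floats)     # [1.5]
print(floats_in_k2)  # []
```

Note that neither pattern looks at the unit, so the first one also picks up the 100 from "100 SMS/Day".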


0

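The issue is that .* matches as much as it can and leaves only one character for [0-9.]+. You can replace it with .? (at most one character) so it is not that greedy; this works when the pattern is applied as a search rather than anchored at the start of the string: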

.?([0-9.]+)(gb|GB| gb| GB)
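
A quick check of this pattern in Python, as a sketch with made-up variable names. It shows that the result depends on whether the pattern is searched for anywhere in the string or matched from the start.

```python
import re

pattern = re.compile(r".?([0-9.]+)(gb|GB| gb| GB)")
text = "Roaming Calls, 1.5 GB/Day 100 SMS/Day"

# search scans the string, so .? only has to cover the space before "1.5".
found = pattern.search(text)

# match is anchored at position 0, where .? cannot skip "Roaming Calls, ".
anchored = pattern.match(text)

print(found.group(1))  # 1.5
print(anchored)        # None
```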
