Find inside a string in Python

Question

I have a string that contains both digits and other characters.
I need to find every complete number in that string that contains the digit sequence 467033, e.g. 1.467033777777777.

Thanks

Could you post a few lines of real data, so that we can give a more relevant answer?
– Tony Veijalainen, Commented Mar 22, 2011 at 23:00

samplebias · Accepted Answer · 2011-03-22 22:40:16Z

2

Try this:

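import re

# Matches decimal numbers such as "135.13508" or ".5"; plain integers are skipped.
RE_NUM = re.compile(r'\d*\.\d+')

text = 'eghwodugo83o135.13508yegn1.4670337777777773u87208t'
for num in RE_NUM.findall(text):
    # Keep only the numbers whose text contains the wanted digits.
    if '467033' in num:
        print(num)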

Prints:

1.4670337777777773

Generalized / optimized in response to comment:

def find(text, numbers):
    # One alternative per wanted digit string; re.escape keeps each one literal.
    pattern = '|'.join(r'[\d.]*%s[\d.]*' % re.escape(n) for n in numbers)
    re_num = re.compile(pattern)
    return [m.group() for m in re_num.finditer(text)]

print(find(text, ['467033', '13']))

Prints:

['135.13508', '1.4670337777777773']

edited Mar 22, 2011 at 22:40

answered Mar 22, 2011 at 22:17

samplebias


4 Comments

jathanism Over a year ago

This is a decent example but possibly costly depending on the dataset. Since you're already scanning the string with re, you might as well bake the sentinel value into the pattern.

samplebias Over a year ago

Yep, I went for simplicity over performance - just updated the answer with something more general, to search for multiple numbers in one pass.

eyquem Over a year ago

I don't understand jathanism's remark. In the first snippet, I would write: if '467033' in num.replace('.','') The second snippet makes that impossible. But that may not be needed, depending on what the asker wants.
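
To illustrate eyquem's suggestion, here is a minimal sketch of the first snippet with the decimal point stripped before the membership test, so that digits on both sides of the point are treated as one run. The helper name find_ignoring_point and the sample string are made up for this illustration; they are not from the answer.

```python
import re

# Hypothetical helper (not part of the original answer).
def find_ignoring_point(text, digits):
    results = []
    # Same number pattern as the first snippet: decimal numbers only.
    for num in re.findall(r'\d*\.\d+', text):
        # Drop the point so '1467.033777' is tested as '1467033777'.
        if digits in num.replace('.', ''):
            results.append(num)
    return results

# Made-up sample: the first number has the point inside the wanted digits.
sample = 'abc1467.033777xyz1.4670337777777773u87208t'
print(find_ignoring_point(sample, '467033'))
# ['1467.033777', '1.4670337777777773']
```

Whether 1467.033777 should count as a hit depends on what the asker actually needs, as the comment says.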

jathanism Over a year ago

What I meant is that it could be costly because of the overhead of 1) parsing the text with the regex and then 2) iterating over the matches to run an "in" membership test for the substring. Whether or not this matters depends on the size of the dataset you are working with.
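
A minimal sketch of the two approaches being compared, using the answer's sample string. The variable names are illustrative only, and no timing is claimed here; which variant is faster would have to be measured on real data.

```python
import re

text = 'eghwodugo83o135.13508yegn1.4670337777777773u87208t'
needle = '467033'

# Two steps: extract every decimal number, then filter with `in`.
two_step = [n for n in re.findall(r'\d*\.\d+', text) if needle in n]

# One pass: the wanted digits are part of the pattern itself.
one_pass = re.findall(r'[\d.]*' + re.escape(needle) + r'[\d.]*', text)

print(two_step)  # ['1.4670337777777773']
print(one_pass)  # ['1.4670337777777773']
```

The two patterns are not equivalent in general: the one-pass pattern also accepts integers without a decimal point, which the two-step pattern skips.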

Daniel DiPaolo · Accepted Answer · 2011-03-22 21:53:26Z

1

If you're just searching for a substring within another string, you can use in:

>>> sub_num = "467033"
>>> my_num = "1.467033777777777"
>>> sub_num in my_num
True

However, I suspect there's more to your problem than just searching strings, and that doing it this way might not be optimal. Can you be more specific about what you're trying to do?

answered Mar 22, 2011 at 21:53

Daniel DiPaolo


2 Comments

Nick Over a year ago

I have a lot of data from physics computations. I noticed something odd: numbers that contain 467033 might be very informative to me.

Nick Over a year ago

So: given a string, find all the numbers in it that contain a given digit sequence, e.g. 7.

highBandWidth · Accepted Answer · 2011-03-22 22:16:27Z

1

>>> import re
>>> a = 'e.g. 1.467033777777777\nand also 576575567467033546.90 Thanks '
>>> r = re.compile('[0-9.]*467033[0-9.]*')
>>> r.findall(a)
['1.467033777777777', '576575567467033546.90']

answered Mar 22, 2011 at 22:16

highBandWidth


1 Comment

highBandWidth Over a year ago

Well, I was assuming that you needed it either before or after the decimal point. The easiest thing would be to write a number of parallel regexes, and Python should compile them optimally anyway.
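
One possible reading of "parallel regexes" is a single pattern with one alternative per case, joined with |. The sketch below assumes that reading; the names before_point and after_point are invented for the example, and it makes no claim about how well the re module optimizes the alternation.

```python
import re

needle = re.escape('467033')

# Case 1: the wanted digits sit in the integer part (before the point).
before_point = r'\d*' + needle + r'\d*(?:\.\d*)?'
# Case 2: the wanted digits sit in the fractional part (after the point).
after_point = r'\d*\.\d*' + needle + r'\d*'

# Only non-capturing groups are used, so findall returns whole matches.
pattern = re.compile(after_point + '|' + before_point)

a = 'e.g. 1.467033777777777\nand also 576575567467033546.90 Thanks '
print(pattern.findall(a))
# ['1.467033777777777', '576575567467033546.90']
```

Unlike the character class [0-9.]*, each alternative allows at most one decimal point per number.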
