How to find a string using regex in Python 3?

Question

textfile.txt

21/02/2018
23/02/2018
yes/2s20/2620 A/RB2
417 A/FOüR COT

Python code

import re
with open('textfile.txt','r') as f:
    input_file = f.readlines()
b_list = []
for i in input_file:
    s = re.findall(r'^(?=.*/F)(?:[^/\n]*/){1,}[^/\n]*$|^(?=.*A/RB2$)(?:[^/\n]*/){3,}[^/\n]*$',i)
    if len(s) > 0:
        print(s)
        b_list.append(s)
print(b_list,"***********")

Expected Output:

yes/2s20/2620 A/RB2
417 A/FOüR COT

The regex itself is correct, but when I iterate over the lines of the text file and test each one, it goes wrong, and I don't know why. If, instead of reading the lines from the file, I assign input_file = "yes/2s20/2620 A/RB2","417 A/FOüR COT" directly, it runs correctly.
– pythoncoder, Commented Apr 3, 2019 at 3:34

CasualDemon · Accepted Answer · 2019-04-03 03:47:29Z

All cleaned up:

import re

b_list = []
match_string = re.compile(r'^(?=.*/F)(?:[^/\n]*/){1,}[^/\n]*$|^(?=.*A/RB2$)(?:[^/\n]*/){3,}[^/\n]*$')
with open('textfile.txt') as f:
    for i in f:
        match = match_string.match(i)
        if match:
            print(match.group(0))
            b_list.append(match.group(0)) #  Unsure what you need in b_list, this will only add the found string
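
To try the cleaned-up loop without creating textfile.txt, here is a minimal sketch that feeds the same four sample lines through io.StringIO. The StringIO stand-in and the variable name sample are my assumptions, not part of the original code. Each line still ends in a newline, just as it would when read from a real file:

```python
import io
import re

match_string = re.compile(r'^(?=.*/F)(?:[^/\n]*/){1,}[^/\n]*$|^(?=.*A/RB2$)(?:[^/\n]*/){3,}[^/\n]*$')

# Assumption: an in-memory stand-in for open('textfile.txt'),
# holding the four lines shown in the question.
sample = io.StringIO(
    "21/02/2018\n"
    "23/02/2018\n"
    "yes/2s20/2620 A/RB2\n"
    "417 A/FOüR COT\n"
)

b_list = []
for line in sample:
    # match() anchors at the start of the line; $ also matches just
    # before the trailing newline, so no strip() is needed here.
    match = match_string.match(line)
    if match:
        b_list.append(match.group(0))

print(b_list)  # ['yes/2s20/2620 A/RB2', '417 A/FOüR COT']
```

If this prints both expected lines but the real file does not, the difference is probably in the file's contents rather than in the regex.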

Original answer:

Try putting the for loop under the with statement, which removes the need for readlines:

import re
with open('textfile.txt','r') as f:
    b_list = []
    for i in f:
        s = re.match(r'^(?=.*/F)(?:[^/\n]*/){1,}[^/\n]*$|^(?=.*A/RB2$)(?:[^/\n]*/){3,}[^/\n]*$',i)
        if s:
            print(s.group(0))
            b_list.append(s)  # stores the match object; use s.group(0) to store the string

You can also still use findall; I just wanted to make it clear that only a single item is matched per line. Using your original code:

    s = re.findall(r'^(?=.*/F)(?:[^/\n]*/){1,}[^/\n]*$|^(?=.*A/RB2$)(?:[^/\n]*/){3,}[^/\n]*$',i)
    if len(s) > 0:
        print(s[0])
        b_list.append(s)  # s is a list, so b_list becomes a list of lists
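
To make the difference between the two calls concrete, here is a small sketch on a single hard-coded line. The names found, matched, nested and flat are only for illustration. Because the pattern has no capturing groups, findall returns a list of whole-match strings, while match returns a match object or None:

```python
import re

pattern = r'^(?=.*/F)(?:[^/\n]*/){1,}[^/\n]*$|^(?=.*A/RB2$)(?:[^/\n]*/){3,}[^/\n]*$'
line = "yes/2s20/2620 A/RB2\n"  # trailing newline, as when read from a file

found = re.findall(pattern, line)  # list of matched strings
matched = re.match(pattern, line)  # match object, or None if no match

nested = []
nested.append(found)  # append keeps the inner list: a list of lists

flat = []
flat.extend(found)    # extend adds the strings themselves

print(found)             # ['yes/2s20/2620 A/RB2']
print(matched.group(0))  # yes/2s20/2620 A/RB2
print(nested)            # [['yes/2s20/2620 A/RB2']]
print(flat)              # ['yes/2s20/2620 A/RB2']
```

So if you want b_list to hold plain strings, either extend it with the findall result or append s[0].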

user11303158 · Accepted Answer · 2019-04-03 04:14:01Z

This answer addresses your original question, before the edit, but I think it is similar enough that you can likely still use it.

import re
d = """
"21/02/2018","23/02/2018","yes/2s20/2620 A/RB2","417 A/FOüR COT"
"""

regexpr1=r'\d\d\/\d\d/\d\d\d\d\"\,\"\d\d\/\d\d\/\d\d\d\d\",\"(.*?)\"'

s = re.findall(regexpr1, d)

print("Results for regexpr1 are")
print(s)

regexpr2=r'\"\,\"(.*?)\"'

s = re.findall(regexpr2, d)

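# Drop the date-like entries by building a new list; calling s.remove()
# while iterating over s can skip elements.
regexpr=r'\d\d\/\d\d/\d\d\d\d'
s = [x for x in s if not re.findall(regexpr, x)]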

print("Results for regexpr2 are")
print(s)

Output

Results for regexpr1 are
['yes/2s20/2620 A/RB2']
Results for regexpr2 are
['417 A/FOüR COT']

