
How to find a string using regex in Python 3?

textfile.txt

21/02/2018
23/02/2018
yes/2s20/2620 A/RB2
417 A/FOüR COT

Python code

import re
with open('textfile.txt', 'r') as f:
    input_file = f.readlines()
b_list = []
for i in input_file:
    s = re.findall(r'^(?=.*/F)(?:[^/\n]*/){1,}[^/\n]*$|^(?=.*A/RB2$)(?:[^/\n]*/){3,}[^/\n]*$', i)
    if len(s) > 0:
        print(s)
        b_list.append(s)
print(b_list, "***********")

Expected Output:

yes/2s20/2620 A/RB2
417 A/FOüR COT

  • What are you searching for? Commented Apr 3, 2019 at 3:27
  • The regex condition is correct, but when I iterate over the lines of the text file and check the condition, it goes wrong, and I don't know why. If, instead of reading the lines from the text file, I assign them directly, like input_file = "yes/2s20/2620 A/RB2","417 A/FOüR COT", it runs correctly. Commented Apr 3, 2019 at 3:34

2 Answers

1

All cleaned up:

import re

b_list = []
match_string = re.compile(r'^(?=.*/F)(?:[^/\n]*/){1,}[^/\n]*$|^(?=.*A/RB2$)(?:[^/\n]*/){3,}[^/\n]*$')
with open('textfile.txt') as f:
    for i in f:
        match = match_string.match(i)
        if match:
            print(match.group(0))
            b_list.append(match.group(0)) #  Unsure what you need in b_list, this will only add the found string
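
To check the pattern without depending on the file, here is a minimal sketch that runs the same regex over the four sample lines held in a list. The names PATTERN, sample_lines, and matched are made up for this illustration. Each sample line keeps its trailing newline, as it would when read from a file; `$` also matches just before a trailing newline, so the newline by itself should not stop the pattern from matching.

```python
import re

# Same pattern as above, split over two string literals for readability.
PATTERN = re.compile(
    r'^(?=.*/F)(?:[^/\n]*/){1,}[^/\n]*$'
    r'|^(?=.*A/RB2$)(?:[^/\n]*/){3,}[^/\n]*$'
)

# Hypothetical stand-in for the lines of textfile.txt, newlines included.
sample_lines = [
    "21/02/2018\n",
    "23/02/2018\n",
    "yes/2s20/2620 A/RB2\n",
    "417 A/FOüR COT\n",
]

matched = []
for line in sample_lines:
    m = PATTERN.match(line)
    if m:
        # group(0) is the matched text, without the trailing newline
        matched.append(m.group(0))

for text in matched:
    print(text)
```

If this behaves as expected but the file-based version does not, the next thing I would inspect is the file content itself, for example by printing repr(i) for each line.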

Original answer:

Try putting the for loop inside the with block and iterating over the file object directly, which removes the need for readlines():

import re
with open('textfile.txt', 'r') as f:
    b_list = []
    for i in f:
        s = re.match(r'^(?=.*/F)(?:[^/\n]*/){1,}[^/\n]*$|^(?=.*A/RB2$)(?:[^/\n]*/){3,}[^/\n]*$', i)
        if s:
            print(s.group(0))
            b_list.append(s.group(0))  # store the matched text, not the match object

You can also still use findall; I just wanted to make it clear that only a single item is matched per line. Using your original code:

    s = re.findall(r'^(?=.*/F)(?:[^/\n]*/){1,}[^/\n]*$|^(?=.*A/RB2$)(?:[^/\n]*/){3,}[^/\n]*$', i)
    if len(s) > 0:
        print(s[0])
        b_list.append(s[0])  # findall returns a list; keep only the matched string
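
To make the difference between the two calls concrete, here is a small sketch using only the first alternative of the pattern on one sample line. The variable names are hypothetical. Because the pattern has no capturing groups, findall returns a list of whole matches, while match returns a match object (or None) whose text is read with group(0).

```python
import re

# First alternative of the pattern only, to keep the example short.
pattern = r'^(?=.*/F)(?:[^/\n]*/){1,}[^/\n]*$'
line = "417 A/FOüR COT\n"
date_line = "21/02/2018\n"

found = re.findall(pattern, line)  # list of matched strings
m = re.match(pattern, line)        # match object, or None if no match

print(found)        # a list with one string
print(m.group(0))   # the matched string itself

# A non-matching line gives an empty list or None respectively.
no_found = re.findall(pattern, date_line)
no_match = re.match(pattern, date_line)
print(no_found, no_match)
```

This is why printing s shows brackets and quotes around the result, while s[0] or group(0) prints the bare string.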

0

This answer addresses your original question, before the edit, but I think it is similar enough that you can likely still use it.

import re
d = """
"21/02/2018","23/02/2018","yes/2s20/2620 A/RB2","417 A/FOüR COT"
"""

regexpr1=r'\d\d\/\d\d/\d\d\d\d\"\,\"\d\d\/\d\d\/\d\d\d\d\",\"(.*?)\"'

s = re.findall(regexpr1, d)

print("Results for regexpr1 are")
print(s)

regexpr2=r'\"\,\"(.*?)\"'

s = re.findall(regexpr2, d)

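# Drop the date entries. Build a new list instead of removing items
# from s while iterating over it, which can skip elements.
date_regexpr = r'\d\d/\d\d/\d\d\d\d'
s = [x for x in s if not re.findall(date_regexpr, x)]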

print("Results for regexpr2 are")
print(s)

Output

Results for regexpr1 are
['yes/2s20/2620 A/RB2']
Results for regexpr2 are
['417 A/FOüR COT']
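
As an alternative sketch, assuming the data really is a single comma-separated line of double-quoted fields as in d above, the standard csv module can split the fields and a date pattern can filter them. This is a different technique from the regexes in this answer, and the names DATE, fields, and non_dates are hypothetical.

```python
import csv
import io
import re

# Same sample data as above, as one quoted, comma-separated line.
d = '"21/02/2018","23/02/2018","yes/2s20/2620 A/RB2","417 A/FOüR COT"\n'

# A field counts as a date only if the whole field is dd/dd/dddd.
DATE = re.compile(r'\d\d/\d\d/\d\d\d\d')

# csv.reader handles the quoting; take the first (and only) row.
fields = next(csv.reader(io.StringIO(d)))

# Keep every field that is not a date.
non_dates = [field for field in fields if not DATE.fullmatch(field)]
print(non_dates)
```

Unlike regexpr2 above, this keeps both non-date fields in one list.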
