I am testing a small Python script, part of which I will use in a larger script. Basically I am trying to look up a field in a CSV file (the field contains a regex) and use it in a regex test. The reason is part of a very weird use case: it will allow easier maintenance, since I can edit the CSV file instead of the script. Is there something I am missing with the following?

test.csv:

field0,field1,field2
foo,bar,"\d+\.\d+"
bar,foo,"\w+"

test.py (extra print statements used for testing):

import sys
import re
import csv

input = sys.argv[1]
print input

reader = csv.reader(open('test.csv','rb'), delimiter=',', quotechar="\"")
for row in reader:
        print row
        value = row[0]
        print value
        if value in input:
                regex = row[2]
                print regex

                pat = re.compile(regex)
                test = re.match(pat,input)
                out = test.group(1)
                print out

If I pass a value like "foo blah 38902462986.328946239846" to the script, I would expect this to pick up that it contains foo and then use the regex, \d+\.\d+, to extract 38902462986.328946239846. However when I run the script I get the following:

foo blah 0920390239.90239029
['field0', 'field1', 'field2']
field0
['foo', 'bar', '\\d+\\.\\d+']
foo
\d+\.\d+
Traceback (most recent call last):
  File "reg.py", line 19, in <module>
    out = test.group(1)
AttributeError: 'NoneType' object has no attribute 'group'

Not sure what's going on really.

P.S. Python is a big world and I'm still learning.

    Your code seems incorrectly indented. If test is None then re.match failed (that's what it returns on failure). And this might be because re.match expects a string as the first parameter, not a compiled pattern. Commented Oct 17, 2012 at 11:37

1 Answer

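According to the docs, re.match only matches at the beginning of the input string. Your input starts with "foo", not with digits, so re.match returns None. You need to use re.search, which scans the whole string for the first match. Also, there's no need to compile the pattern if you don't reuse it afterwards. Just say test = re.search(regex, input).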
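To see the difference on the input from the question, here is a minimal sketch. It is written for Python 3 (the question uses Python 2 print statements), and the variable names are only for illustration. It also passes a compiled pattern to both functions, which suggests the failure comes from where re.match anchors and not from compiling.

```python
import re

text = 'foo blah 38902462986.328946239846'
regex = r'\d+\.\d+'

# re.match only tries position 0, where the text starts with "foo".
match_result = re.match(regex, text)

# re.search scans the string and returns the first match it finds.
search_result = re.search(regex, text)

# A compiled pattern is accepted by both functions and behaves the same way.
compiled = re.compile(regex)
compiled_match = re.match(compiled, text)
compiled_search = re.search(compiled, text)

print(match_result)               # None
print(search_result.group(0))     # 38902462986.328946239846
print(compiled_match)             # None
print(compiled_search.group(0))   # 38902462986.328946239846
```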

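The regular expressions in your example don't have any capture groups, so test.group(1) is going to fail with an IndexError even if there's a match in the input. Use test.group(0) (or slice the input with test.start() and test.end()) to get the whole match.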
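A short sketch of the group numbering, again assuming Python 3. The second pattern, with parentheses around the integer part, is a hypothetical variation and not one of the patterns in test.csv.

```python
import re

text = 'foo blah 38902462986.328946239846'

# No parentheses in the pattern: only group(0), the whole match, exists.
no_group = re.search(r'\d+\.\d+', text)
whole = no_group.group(0)

# Asking for group(1) raises IndexError because the pattern defines no group 1.
try:
    no_group.group(1)
    group1_error = None
except IndexError as exc:
    group1_error = str(exc)

# Hypothetical variation: parentheses create capture group 1.
with_group = re.search(r'(\d+)\.\d+', text)
integer_part = with_group.group(1)

print(whole)
print(group1_error)
print(integer_part)
```

If the CSV patterns are meant to be used with group(1), each pattern in the file would need its own parentheses; otherwise group(0) is the safer choice.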

import sys
import re
import csv

input = 'foo blah 38902462986.328946239846'

reader = csv.reader(open('test.csv','rb'), delimiter=',', quotechar="\"")
for row in reader:
    value = row[0]
    if value in input:
        regex = row[2]
        test = re.search(regex, input)
        print input[test.start():test.end()]

Prints:

38902462986.328946239846
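For reference, a Python 3 sketch of the same loop that also guards against a pattern that doesn't match (in that case re.search returns None and calling a method on it would raise the AttributeError from the question). The function name extract_matches is hypothetical, and the CSV content is inlined through io.StringIO so the example is self-contained. With a real file in Python 3 you would open it in text mode, for example open('test.csv', newline=''), not 'rb'.

```python
import csv
import io
import re

# Same content as test.csv in the question, inlined for the example.
CSV_TEXT = r'''field0,field1,field2
foo,bar,"\d+\.\d+"
bar,foo,"\w+"
'''


def extract_matches(text, csv_file):
    """Return the matched text for every row whose first field occurs in text."""
    results = []
    reader = csv.reader(csv_file, delimiter=',', quotechar='"')
    for row in reader:
        keyword, regex = row[0], row[2]
        if keyword in text:
            found = re.search(regex, text)
            # re.search returns None when nothing matches, so check first.
            if found is not None:
                results.append(found.group(0))
    return results


matches = extract_matches('foo blah 38902462986.328946239846',
                          io.StringIO(CSV_TEXT))
print(matches)  # ['38902462986.328946239846']
```

The header row is not skipped explicitly here; it is simply ignored because "field0" does not occur in the input, which mirrors the behavior of the script above.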
1 Comment

Thanks, that did the trick, it's probably because I'd used match previously so that stuck in my mind.
