For my Information Retrieval class I have to build an index of terms from a group of files. A valid term contains at least one alphabetic character, so to test for that I wrote a simple function and call it from an if statement. So far I have:
ALPHA = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
         'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']


def content_test(term):
    # A term is valid if it contains at least one lowercase letter.
    for a in ALPHA:
        if a in term:
            return True
    return False


class FileRead():
    def __init__(self, filename):
        # Close the file as soon as its contents have been read.
        with open(filename, 'r') as f:
            content = f.read()
        self.terms = content.split()

    def clean(self):
        # Iterate over a copy so that removing items does not skip terms.
        for term in self.terms[:]:
            if not content_test(term):
                self.terms.remove(term)
This all works fine (I think...), but I've been trying to learn higher-level Python, and I can't help thinking there is a more Pythonic way of checking term validity (maybe using map() or a lambda function?).
Am I correct, or am I just overthinking it?
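You could replace the hand-typed list with import string; ALPHA = string.lowercase (Python 2 only; string.ascii_lowercase works in both Python 2 and 3).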
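Building on that, here is one possible sketch of a more Pythonic check. It swaps the explicit loop for any() with a generator expression, so neither map() nor a lambda is needed. The standalone clean function is a hypothetical stand-in for the FileRead.clean method, and like the code in the question it only recognizes lowercase ASCII letters.

```python
import string

# Letters that make a term valid; ascii_lowercase exists in Python 2 and 3.
# A set gives constant-time membership tests.
ALPHA = set(string.ascii_lowercase)


def content_test(term):
    """Return True if term contains at least one lowercase ASCII letter."""
    return any(ch in ALPHA for ch in term)


def clean(terms):
    """Hypothetical helper: keep only the terms that pass content_test."""
    return [term for term in terms if content_test(term)]


print(clean(['abc', '123', 'a1', '...', 'x-ray']))
# prints ['abc', 'a1', 'x-ray']
```

Building a new list with a comprehension also avoids removing items from a list while iterating over it. If uppercase or non-ASCII letters should count too, any(ch.isalpha() for ch in term) is an alternative worth considering.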