
I have a list of keywords, id = ['pop','ppp','cre'], and I am going through a bunch of files/large strings. If any of these keywords are in these files, then I have to be able to do something...

like:

id = ['pop', 'ppp', 'cre']
if id in dataset:
    print id

But I think that, as written, all three of these (or later maybe more) have to be in the dataset, and not just one.

2
  • Do you want to match any of the keywords in id, or all of the keywords in id? Commented Aug 18, 2011 at 11:36
  • 1
    @Jasper: You shouldn't use id as a variable name, since it shadows the builtin function id(). Commented Aug 18, 2011 at 12:15

3 Answers


You can use all to make sure all the values in your id list are in the dataset:

id = ['pop', 'ppp', 'cre']
if all(i in dataset for i in id):
    print id
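As a minimal sketch, assuming dataset is a single string (the question does not say exactly what type it is), the check behaves as follows. The name keywords stands in for id to avoid shadowing the builtin, and the sample text is made up for illustration:

```python
# Hypothetical sample data; `keywords` replaces the question's `id` list.
keywords = ['pop', 'ppp', 'cre']
dataset = 'a pop song with a ppp marking and a crescendo'

# True only when every keyword occurs somewhere in the string.
all_present = all(word in dataset for word in keywords)
print(all_present)

# Dropping one keyword ('ppp') from the text makes the check fail.
partial = 'a pop song with a crescendo'
all_missing_one = all(word in partial for word in keywords)
print(all_missing_one)
```

Because all() takes a generator expression here, it stops at the first keyword that is missing.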

5 Comments

The question specifies that any, not all, should be in the dataset
"But i think now all of these 3 or later maybe more have to be in the dataset and not just only one."
"if any of these keywords are in these files than I have to be able to do something". I think that quote refers to what he believes his current code is doing.
... yes, which is in fact the problem that needs solving.
I'll ask him to clarify. We are clearly understanding his problem statement differently.

Since you mentioned that you need to check whether any word is in dataset, I think the built-in any() function will help:

if any(word in dataset for word in id):
    # do something

Or:

if [word for word in id if word in dataset]:
    # do something

Or, in Python 2 (where filter returns a list):

if filter(lambda word: word in dataset, id):
    # do something
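To illustrate, here is a small sketch assuming dataset is one string; keywords (in place of id) and the sample text are hypothetical. It also collects which keywords matched, which the list-comprehension form gives you directly. Note that in Python 3, filter returns a lazy iterator that is always truthy, so the filter variant needs wrapping in list() there:

```python
# Hypothetical sample data; `keywords` replaces the question's `id` list.
keywords = ['pop', 'ppp', 'cre']
dataset = 'some large string that mentions pop music'

# any() stops at the first keyword found in the string.
has_any = any(word in dataset for word in keywords)

# The list forms also tell you which keywords were found.
matched = [word for word in keywords if word in dataset]
filtered = list(filter(lambda word: word in dataset, keywords))

if has_any:
    print(matched)
```

If you only need a yes/no answer, any() is the cheapest of the three because it does not build the full list of matches.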



Your code as it stands will actually look through dataset for the entire list "['pop', 'ppp', 'cre']". Why don't you try something like this:

for item in id:
    if item in dataset:
        print id
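What `id in dataset` actually does depends on the type of dataset, which the question leaves open. A small sketch with made-up data covering both readings (keywords stands in for id):

```python
keywords = ['pop', 'ppp', 'cre']

# Case 1: dataset is a list of strings. `in` compares the whole keyword
# list against each element, so it never matches an individual string.
dataset_list = ['pop', 'rock']
whole_list_in_list = keywords in dataset_list

# Case 2: dataset is a single string. A list on the left-hand side of
# `in` is not allowed for strings, so Python raises TypeError.
try:
    keywords in 'pop music'
    raised = False
except TypeError:
    raised = True

# Checking the keywords one at a time works in both cases.
matches = [item for item in keywords if item in 'pop music']
print(matches)
```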

Edit:

This will probably be more efficient:

for item in dataset:
    if item in id:
        print id

This assumes |dataset| > |id| and that you break out of the loop as soon as you find a match.
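A rough sketch of that idea with the early exit made explicit. It assumes dataset is an iterable of whole words (not one large string), so it only finds exact word matches, not substrings. The function name first_keyword_hit and the sample data are hypothetical:

```python
def first_keyword_hit(words, keywords):
    """Return the first word that is also a keyword, or None if there is none."""
    keyword_set = set(keywords)  # set membership is O(1) on average
    for word in words:
        if word in keyword_set:
            return word  # stop scanning at the first match
    return None


keywords = ['pop', 'ppp', 'cre']
hit = first_keyword_hit('we like pop and cre'.split(), keywords)
miss = first_keyword_hit(['rock', 'jazz'], keywords)
print(hit)
print(miss)
```

Converting the keywords to a set keeps each membership test cheap even if the keyword list grows later.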

6 Comments

He says dataset is a bunch of files/large strings. I'm not sure that iterating over it will be helpful.
Look at the documentation for all(): docs.python.org/library/functions.html?highlight=any#all , that's exactly what the function does. How would you propose to check if an element is in dataset without iterating over it?
Iterating over a string returns the values one character at a time. So no, that's not going to work.
It's iterating over a "bunch of files/large strings", as you said, not a string.
OK, so if it's a list of strings how is that better? You're checking if the whole large string is in id?