2

Building off this answer, is there a way to filter a Pandas dataframe by a list of substrings?

Say I want to find all rows where df['menu_item'] contains 'fresh' or 'spaghetti'.

Without something like this:

df[df['menu_item'].str.contains('fresh') | df['menu_item'].str.contains('spaghetti')]

1
  • Would you consider using a custom function with a map? Commented Nov 16, 2016 at 20:41

2 Answers

5

The str.contains method you're using treats its pattern as a regular expression by default, so use the regex alternation operator | as "or":

df[df['menu_item'].str.contains('fresh|spaghetti')]

Example Input:

          menu_item
0        fresh fish
1      fresher fish
2           lasagna
3     spaghetti o's
4  something edible

Example Output:

       menu_item
0     fresh fish
1   fresher fish
3  spaghetti o's
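
For anyone who wants to run this, here is a minimal, self-contained sketch that rebuilds the example input above and applies the same pattern. The names mask and result are only for illustration and are not part of the original answer.

```python
import pandas as pd

# Rebuild the example input shown above
df = pd.DataFrame({
    "menu_item": [
        "fresh fish",
        "fresher fish",
        "lasagna",
        "spaghetti o's",
        "something edible",
    ]
})

# str.contains interprets the pattern as a regex by default,
# so "|" matches rows containing either substring
mask = df["menu_item"].str.contains("fresh|spaghetti")
result = df[mask]

print(result)
```

Note that "fresher fish" is kept too, because str.contains looks for the pattern anywhere in the string rather than matching whole words.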

3 Comments

winner winner spaghetti dinner
Cool solution :-)
Can it be used for a list where substrings are members of the list?
0
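import pandas as pd

# Sample data, in the same order as the example input below
sample_df = pd.DataFrame({'menu_item': ['fresh fish', 'fresher fish', 'lasagna', "spaghetti o's", 'something edible']})

filter_list = ['fresh', 'spaghetti']

# Join the substrings into one regex alternation: 'fresh|spaghetti'
# na=False treats missing values as non-matches; case=False ignores case
filter_df = sample_df[sample_df['menu_item'].str.contains('|'.join(filter_list), na=False, case=False)]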

Example Input:

          menu_item
0        fresh fish
1      fresher fish
2           lasagna
3     spaghetti o's
4  something edible

Output:

       menu_item
0     fresh fish
1   fresher fish
3  spaghetti o's
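
One caveat with joining a list this way: each item is interpreted as a regex, so substrings containing characters such as ( or + may misbehave or raise an error. A possible workaround, sketched here with made-up menu items and a hypothetical filter list, is to pass each item through re.escape before joining.

```python
import re
import pandas as pd

# Hypothetical data: some items contain regex metacharacters, one is missing
sample_df = pd.DataFrame({
    "menu_item": ["fresh fish (daily)", "lasagna", "spaghetti o's", "fish + chips", None]
})

# Hypothetical substrings that should be matched literally
filter_list = ["(daily)", "+ chips"]

# Escape each substring so it is matched as plain text, then join with "|"
pattern = "|".join(re.escape(s) for s in filter_list)

# na=False treats the missing value as a non-match
filter_df = sample_df[sample_df["menu_item"].str.contains(pattern, na=False, case=False)]

print(filter_df)
```

Also worth keeping in mind: if filter_list is empty, the joined pattern is an empty string, which matches every non-missing row, so you may want to handle that case separately.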
