I'm not sure whether this is a 'filtering with pandas' question or a text analysis one, but:
Given a df,
import pandas as pd

d = {
    "item": ["a", "b", "c", "d"],
    "report": [
        "john rode the subway through new york",
        "sally says she no longer wanted any fish, but",
        "was not submitted",
        "the doctor proceeded to call washington and new york",
    ],
}
df = pd.DataFrame(data=d)
df
Resulting in
item, report
a, "john rode the subway through new york"
b, "sally says she no longer wanted any fish, but"
c, "was not submitted"
d, "the doctor proceeded to call washington and new york"
And a list of terms to match:
terms = ["new york", "fish"]
How would you reduce the df to the following rows, keeping a row only if some substring from terms is found in column report, and so that item is preserved?
item, report
a, "john rode the subway through new york"
b, "sally says she no longer wanted any fish, but"
d, "the doctor proceeded to call washington and new york"
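
For reference, one possible approach (a minimal sketch, not necessarily the most efficient one) is to join the terms into a single regular expression and build a boolean mask with Series.str.contains. The names pattern, mask, and result are only illustrative, and re.escape is used on the assumption that the terms should be matched literally rather than as regex syntax.

```python
import re

import pandas as pd

df = pd.DataFrame(
    {
        "item": ["a", "b", "c", "d"],
        "report": [
            "john rode the subway through new york",
            "sally says she no longer wanted any fish, but",
            "was not submitted",
            "the doctor proceeded to call washington and new york",
        ],
    }
)
terms = ["new york", "fish"]

# Combine the terms into one alternation pattern, e.g. "new\ york|fish".
# re.escape keeps each term literal (assumption: terms are plain substrings).
pattern = "|".join(re.escape(t) for t in terms)

# True for rows whose report contains at least one term; na=False treats
# missing reports as non-matches.
mask = df["report"].str.contains(pattern, regex=True, na=False)

# Boolean indexing keeps whole rows, so item is preserved alongside report.
result = df[mask]
print(result)
```

Because the mask is applied to the whole DataFrame, every column of a matching row is kept, and the original index labels are retained unless reset_index is called afterwards.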