My objective is to extract the sentences from a text file that contain any word in my list of keywords. My script cleans up the text file and uses NLTK to split it into sentences and remove stopwords. That part of the script works and produces output that looks correct:

['affirming updated 2020 range guidance long-term earnings dividend growth outlooks provided earlier month', 'finally look forward increasing engagement existing prospective investors months come', 'turn']

The part of the script that I wrote to extract the sentences containing a keyword does not work the way I want. It extracts the keywords but not the sentences in which they occur. The output looks like this:

[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'impact', 'zone']
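import nltk
from nltk.tokenize import word_tokenize

# fileinB, allinstops and keywords are defined earlier in the script
fileinC = nltk.sent_tokenize(fileinB)
fileinD = []
for sent in fileinC:
    # rebuild each sentence without its stopwords
    fileinD.append(' '.join(w for w in word_tokenize(sent) if w not in allinstops))
fileinE = [sent.replace('\n', ' ') for sent in fileinD]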
# extract sentences containing keywords
fileinF = []
for sent in fileinE:
    fileinF.append(' '.join(w for w in word_tokenize(sent) if w in keywords))
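For comparison, the behaviour I am after could be sketched roughly as below. The last loop above joins only the words that are keywords, so each sentence is reduced to its matching words; the sketch instead tests whether a sentence has at least one keyword and, if so, keeps the whole sentence. The function name `sentences_with_keywords` and the sample keyword list are hypothetical, the case-insensitive comparison is an assumption, and `str.split()` stands in for `word_tokenize` only because the cleaned sentences are already space-joined tokens.

```python
def sentences_with_keywords(sentences, keywords):
    """Return every sentence that contains at least one keyword as a whole word."""
    # Assumption: matching should ignore case, so compare lowercased forms.
    keyword_set = {k.lower() for k in keywords}
    matches = []
    for sent in sentences:
        # The cleaned sentences are tokens joined by single spaces,
        # so splitting on whitespace recovers the tokens.
        tokens = sent.lower().split()
        if any(tok in keyword_set for tok in tokens):
            matches.append(sent)  # keep the full sentence, not just the hits
    return matches


# Cleaned sentences taken from the output shown in the question.
sentences = [
    'affirming updated 2020 range guidance long-term earnings dividend '
    'growth outlooks provided earlier month',
    'finally look forward increasing engagement existing prospective '
    'investors months come',
    'turn',
]
# Hypothetical keyword list, for illustration only.
keywords = ['dividend', 'investors']

for sent in sentences_with_keywords(sentences, keywords):
    print(sent)
```

With these sample keywords the first two sentences are kept and 'turn' is dropped, because it contains none of the keywords.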