I am learning Python for my job so that I can manipulate statistical data. I already know C# and JavaScript and can solve this problem in those languages, but I'm having difficulty translating the solution to Python.
THE ISSUE
Count all unique four-letter words in a .txt file. Any word containing an apostrophe should be ignored. Ignore the case of the word (i.e. Tool and tool should be counted as one word). Print out the number of unique four-letter words so that the user can see it.
Divide up the four-letter words based on the last two letters of the word (the word ending). Count how many words you have for each of these endings.
Print out a list of word endings and the number of words you found for each ending.
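To make the first requirement concrete, here is a minimal sketch of the word-extraction step. The function name unique_four_letter_words and the sample sentence are made up for illustration, and the sketch assumes words consist of ASCII letters and that apostrophes are the straight ' character.

```python
import re

def unique_four_letter_words(text):
    """Return the set of distinct four-letter words in text, lower-cased."""
    # Treat runs of letters and apostrophes as one token, so "can't" stays
    # whole and can be rejected instead of being split into "can" and "t".
    tokens = re.findall(r"[a-z']+", text.lower())
    # Keep only tokens that are exactly four characters and apostrophe-free.
    return {t for t in tokens if len(t) == 4 and "'" not in t}

sample = "Tool tool can't it's Moon soon, card!"  # hypothetical input
words = unique_four_letter_words(sample)
print(len(words))  # 4 (tool, moon, soon, card)
```

Using a set means duplicates such as Tool and tool collapse automatically once the text is lower-cased.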
I have solved this issue in JavaScript below:
var listOfWords = ['card','alma','soon','bard','moon','dare'];
var groupings = {};
for(var i = 0; i < listOfWords.length; i++)
{
    // The ending is the last two letters of the four-letter word
    var ending = listOfWords[i].substring(2,4);
    // First time this ending is seen: create its entry
    if(groupings[ending] === undefined)
    {
        groupings[ending] = {};
        groupings[ending].words = [];
        groupings[ending].count = 0;
    }
    groupings[ending].words.push(listOfWords[i]);
    groupings[ending].count++;
}
console.debug(groupings);
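For comparison, a near line-for-line Python translation of the JavaScript above might look like the sketch below. It keeps the same words/count structure; the snake_case variable names are just Python convention, not something from the original code.

```python
list_of_words = ['card', 'alma', 'soon', 'bard', 'moon', 'dare']
groupings = {}

# Python's for loop iterates over the items directly; no index is needed.
for word in list_of_words:
    ending = word[2:4]  # same as substring(2, 4) in JavaScript
    # First time this ending is seen: create its entry.
    if ending not in groupings:
        groupings[ending] = {"words": [], "count": 0}
    groupings[ending]["words"].append(word)
    groupings[ending]["count"] += 1

print(groupings)
```

Here a missing key is tested with not in rather than by comparing against undefined, since looking up a missing dict key in Python raises KeyError.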
Here is what I have so far in Python:
import re
text = open("words.txt")
regex = re.compile(r'\b\w{4}\b')
allFours = []
groupings = []
for line in text:
    four_letter_words = regex.findall(line)
    for word in four_letter_words:
        allFours.append(word)
mylist = list(dict.fromkeys(allFours))
uniqueWordCount = len(mylist)
print(uniqueWordCount)
for i = 0; i < mylist.length; i++:
    var ending = mylist[i]
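The last two lines are where the JavaScript habits break down: Python has no var keyword and no three-part for loop. A minimal sketch of how that final step could be written is shown below, assuming mylist already holds the unique four-letter words. The list literal is a stand-in for the real data, and collections.Counter is swapped in for the hand-built groupings.

```python
from collections import Counter

# Stand-in for the list built from words.txt in the attempt above.
mylist = ["card", "alma", "soon", "bard", "moon", "dare"]

# word[-2:] is the last two letters; Counter tallies each ending.
ending_counts = Counter(word[-2:] for word in mylist)

# Print each ending with the number of words that share it.
for ending, count in sorted(ending_counts.items()):
    print(ending, count)
```

With the stand-in list this prints ma 1, on 2, rd 2 and re 1, one pair per line.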
I hope I have explained everything clearly; if you have any questions, just ask. All help is greatly appreciated, thank you.
(Python has no var keyword; its for loop syntax is different.) What actually is your question?