I'm trying to create a nested dictionary in Python so that, given a list of strings, the dictionary records how many times that sequence of strings occurs.
For example, if the string list is:
["hey", "my", "name", "is"]
I would want the nested dictionary to look like:
{"hey": {"my": {"name": {"is": 1}}}}
I know I could probably use the whole list as the key, but I specifically want to separate the strings in the dictionary.
I would also like to solve this with a defaultdict rather than a plain dict, preferably a recursively defined defaultdict.
This is what I tried:
from collections import defaultdict
nested_dict = lambda: defaultdict(nested_dict)
# Initialize ngrams as a nested defaultdict
ngrams = nested_dict()
# Function to update the nested defaultdict with the list of words
def update_ngrams(ngrams, words):
    current_dict = ngrams
    for word in words[:-1]:
        current_dict = current_dict[word]
    current_dict[words[-1]] += 1
# Example usage
update_ngrams(ngrams, ["My", "big", "cat"])
update_ngrams(ngrams, ["My", "big", "dog"])
but it gives me this error:
TypeError: unsupported operand type(s) for +=: 'collections.defaultdict' and 'int'
The expected output should be a map like this:
{"My": {"big": {"cat": 1, "dog": 1}}}
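defaultdict alone doesn't really work for this because, as your error suggests, you want the leaves to be integers, not defaultdicts themselves. You can't unconditionally do current_dict[words[-1]] += 1: when the key is missing, the default factory creates a new defaultdict, and += 1 then fails on it. Also, what should happen if you later call update_ngrams(ngrams, ["My", "big", "cat", "Garfield"])?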
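One possible workaround, as a minimal sketch rather than a definitive fix: keep the recursive defaultdict for the inner levels, but read the leaf with dict.get(), which does not trigger the default factory, so a missing leaf starts from 0 instead of from a new defaultdict. This assumes no word sequence is a prefix of another one (for example, all n-grams have the same length); the "Garfield" case above is deliberately not handled and would raise an error.

```python
from collections import defaultdict


def nested_dict():
    # Recursive factory: every missing key yields another nested defaultdict.
    return defaultdict(nested_dict)


def update_ngrams(ngrams, words):
    current_dict = ngrams
    # Walk (and create on demand) the inner levels for all but the last word.
    for word in words[:-1]:
        current_dict = current_dict[word]
    last = words[-1]
    # .get() bypasses the default factory, so an unseen leaf counts from 0.
    # Assumption: no sequence is a prefix of another; that case is not handled.
    current_dict[last] = current_dict.get(last, 0) + 1


ngrams = nested_dict()
update_ngrams(ngrams, ["My", "big", "cat"])
update_ngrams(ngrams, ["My", "big", "dog"])

# defaultdict is a dict subclass, so it compares equal to a plain nested dict.
print(ngrams == {"My": {"big": {"cat": 1, "dog": 1}}})  # True
```

If prefixes do need to be counted as well, one option would be to store each count under a reserved key at every level instead of using a bare integer as the leaf, but that changes the shape of the output you asked for.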