Using a dictionary to count the items in a list

Question

Suppose I have a list of items, like:

['apple', 'red', 'apple', 'red', 'red', 'pear']

I want a dictionary that counts how many times each item appears in the list. So for the list above the result should be:

{'apple': 2, 'red': 3, 'pear': 1}

How can I do this simply in Python?

(If you are only interested in counting instances of a single element in a list, see "How do I count the occurrences of a list item?".)

You can get inspiration here: stackoverflow.com/questions/2870466/python-histogram-one-liner
– mykhal, Commented Aug 16, 2010 at 19:23

Daniel Walker · Accepted Answer · 2022-07-18 03:17:24Z

414

In Python 2.7 and 3.1 and later, collections.Counter (a dict subclass) exists for exactly this purpose.

>>> from collections import Counter
>>> Counter(['apple','red','apple','red','red','pear'])
Counter({'red': 3, 'apple': 2, 'pear': 1})
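As a minimal sketch of what can be done with the result (the variable names below are illustrative and not part of the answer): a Counter compares equal to a plain dict with the same contents, dict() converts it, and most_common() lists items by descending count.

```python
from collections import Counter

items = ['apple', 'red', 'apple', 'red', 'red', 'pear']

counts = Counter(items)

# A Counter is a dict subclass, so it compares equal to a plain dict
# with the same keys and values.
assert counts == {'apple': 2, 'red': 3, 'pear': 1}

# Convert to a plain dict if the exact type matters.
plain = dict(counts)

# most_common(n) returns the n highest counts as (item, count) pairs.
top_two = counts.most_common(2)
print(top_two)

# Missing keys report a count of 0 instead of raising KeyError.
print(counts['banana'])
```

If only a plain dictionary is needed, dict(Counter(items)) is enough; the other calls are optional extras.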

edited Jul 18, 2022 at 3:17

Daniel Walker

answered Aug 16, 2010 at 20:00

Odomontois

8 Comments

Muhammad Alkarouri Over a year ago

The official line, or rather standing joke, is that Guido has a time machine ..

awesomo Over a year ago

@Glenn Maynard Counter is just an implementation of a multiset which is not an uncommon data structure IMO. In fact, C++ has an implementation in the STL called std::multiset (also std::tr1::unordered_multiset) so Guido is not alone in his opinion of its importance.

Glenn Maynard Over a year ago

@awesomo: No, it's not comparable to std::multiset. std::multiset allows storing multiple distinct but comparatively equal values, which is what makes it so useful. (For example, you can compare a list of locations by their temperature, and use a multiset to look up all locations at a specific temperature or temperature range, while getting the fast insertions of a set.) Counter merely counts repetitions; distinct values are lost. That's much less useful--it's nothing more than a wrapped dict. I question calling that a multiset at all.

awesomo Over a year ago

@GlennMaynard You're right, I overlooked the additional (extremely useful) features of std::multiset.

Radio Controlled Over a year ago

Counting might be a narrow task, but one that is required very often.

mmmdreg · Accepted Answer · 2013-05-22 06:41:38Z

360

I like:

counts = dict()
for i in items:
  counts[i] = counts.get(i, 0) + 1

.get allows you to specify a default value if the key does not exist.
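The same loop can be wrapped in a small helper. This is only a sketch; the function name count_items is made up for illustration and does not come from the answer.

```python
def count_items(items):
    """Return a dict mapping each item to the number of times it appears."""
    counts = {}
    for item in items:
        # get() returns 0 when the item has not been seen yet, so the
        # first occurrence stores 0 + 1 = 1.
        counts[item] = counts.get(item, 0) + 1
    return counts


result = count_items(['apple', 'red', 'apple', 'red', 'red', 'pear'])
print(result)
```

Because the helper only iterates over its argument, it should work on any iterable of hashable items, not just lists.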

edited May 22, 2013 at 6:41

answered Jul 5, 2011 at 12:44

mmmdreg

9 Comments

curiousMonkey Over a year ago

For those new to python. This answer is better in terms of time complexity.

SherylHohman Over a year ago

This answer works even on a list of floating point numbers, where some of the numbers may be '0'

Hayden Over a year ago

This answer also does not require any extra imports. +1

Jonas Palačionis Over a year ago

I don't understand what the +1 part does. Could someone explain?

Peter Cordes Over a year ago

@JonasPalačionis: It increments the counter for that key, before assigning back to the value for that key. i.e. it's a histogram aka frequency-count.

jfMR · Accepted Answer · 2018-08-21 13:20:00Z

78

Simply use the list method count:

i = ['apple','red','apple','red','red','pear']
d = {x:i.count(x) for x in i}
print(d)

Output (key order may differ between Python versions):

{'pear': 1, 'apple': 2, 'red': 3}

edited Aug 21, 2018 at 13:20

jfMR

answered Mar 29, 2016 at 12:24

Ashish Kumar Verma

3 Comments

Ouroborus Over a year ago

You're applying count against the array as many times as there are array items. Your solution is O(n^2) where the better trivial solution is O(n). See comments on riviera's answer versus comments on mmmdreg's answer.

Xenia Ioannidou Over a year ago

Maybe you could do d = {x:i.count(x) for x in set(i)}

Peter Cordes Over a year ago

@XeniaIoannidou: That does O(n * unique_elements) work; not much better unless you have many repeats. And still bad; building a set() is basically adding elements to a hash table without a count. Almost as much work as just adding them to a Dictionary of counts and incrementing the count if already present, and that's just for making the set. What I described for adding to a Dictionary is already a full solution to the histogram problem, and you're done there without any time spent scanning the original array for each unique element.

mechanical_meat · Accepted Answer · 2010-08-16 19:22:15Z

67

>>> L = ['apple','red','apple','red','red','pear']
>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> for i in L:
...   d[i] += 1
>>> d
defaultdict(<type 'int'>, {'pear': 1, 'apple': 2, 'red': 3})

answered Aug 16, 2010 at 19:22

mechanical_meat

2 Comments

Shadow Over a year ago

@NickT It's more cluttered than itertools.Counter - and I'd be surprised if it was faster...

Intrastellar Explorer Over a year ago

By itertools.Counter I think @Shadow meant collections.Counter

Stefano Palazzo · Accepted Answer · 2010-08-17 12:25:12Z

34

I always thought that for a task this trivial, I wouldn't want to import anything. But I may be wrong, depending on whether collections.Counter is faster or not.

items = "Whats the simpliest way to add the list items to a dictionary "

stats = {}
for i in items:
    if i in stats:
        stats[i] += 1
    else:
        stats[i] = 1

# bonus
for i in sorted(stats, key=stats.get):
    print("%d×'%s'" % (stats[i], i))

I think this may be preferable to using count(), because it will only go over the iterable once, whereas count may search the entire thing on every iteration. I used this method to parse many megabytes of statistical data and it always was reasonably fast.
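The difference between the two approaches can be sketched as follows. The function names and the test data are hypothetical, chosen only for illustration; the timings depend on the machine, so none are claimed here.

```python
import timeit


def count_single_pass(items):
    """Visit each element once: O(n) dictionary updates."""
    stats = {}
    for item in items:
        if item in stats:
            stats[item] += 1
        else:
            stats[item] = 1
    return stats


def count_with_list_count(items):
    """Call list.count() per element: each call rescans the whole list."""
    return {item: items.count(item) for item in items}


# 2,000 elements drawn from 100 distinct values (arbitrary test data).
data = [i % 100 for i in range(2000)]

# Both approaches produce the same dictionary.
assert count_single_pass(data) == count_with_list_count(data)

# Time five runs of each approach on the same data.
t_single = timeit.timeit(lambda: count_single_pass(data), number=5)
t_count = timeit.timeit(lambda: count_with_list_count(data), number=5)
print("single pass: %.4fs, list.count: %.4fs" % (t_single, t_count))
```

Increasing the size of data should make the gap grow, since the list.count version does work proportional to the square of the list length.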

answered Aug 17, 2010 at 12:25

Stefano Palazzo

3 Comments

ntk4 Over a year ago

Your answer deserves more credit for its simplicity. I was struggling over this for a while, getting bewildered with the silliness of some of the other users suggesting to import new libraries etc.

merhoo Over a year ago

you could simplify it with a default value like this d[key] = d.get(key, 0) + 1

MadhaviJ Over a year ago

The simplicity of this answer is so underrated! Sometimes there is no need to import libraries and over-engineer simple tasks.

Nick T · Accepted Answer · 2010-08-17 21:25:09Z

6

L = ['apple','red','apple','red','red','pear']
d = {}
[d.__setitem__(item,1+d.get(item,0)) for item in L]
print(d)

Gives {'pear': 1, 'apple': 2, 'red': 3}

edited Aug 17, 2010 at 21:25

answered Aug 16, 2010 at 19:24

Nick T

1 Comment

Karl Knechtel Over a year ago

Please don't abuse list comprehensions for side effects like this. The imperative loop is much clearer, and does not create a useless temporary list of Nones.

Karl Knechtel · Accepted Answer · 2022-07-30 21:33:34Z

If you use Numpy, the unique function can tell you how many times each value appeared by passing return_counts=True:

>>> import numpy as np
>>> data = ['apple', 'red', 'apple', 'red', 'red', 'pear']
>>> np.unique(data, return_counts=True)
(array(['apple', 'pear', 'red'], dtype='<U5'), array([2, 1, 3]))

The counts are in the same order as the distinct elements that were found; thus we can use the usual trick to create the desired dictionary (passing the two elements as separate arguments to zip):

>>> dict(zip(*np.unique(data, return_counts=True)))
{'apple': 2, 'pear': 1, 'red': 3}

If you specifically have a large input Numpy array of small integers, you may get better performance from bincount:

>>> data = np.random.randint(10, size=100)
>>> data
array([1, 0, 0, 3, 3, 4, 2, 4, 4, 0, 4, 8, 7, 4, 4, 8, 7, 0, 0, 2, 4, 2,
       0, 9, 0, 2, 7, 0, 7, 7, 5, 6, 6, 8, 4, 2, 7, 6, 0, 3, 6, 3, 0, 4,
       8, 8, 9, 5, 2, 2, 5, 1, 1, 1, 9, 9, 5, 0, 1, 1, 9, 5, 4, 9, 5, 2,
       7, 3, 9, 0, 1, 4, 9, 1, 1, 5, 4, 7, 5, 0, 3, 5, 1, 9, 4, 8, 8, 9,
       7, 7, 7, 5, 6, 3, 2, 4, 3, 9, 6, 0])
>>> np.bincount(data)
array([14, 10,  9,  8, 14, 10,  6, 11,  7, 11])

The nth value in the output array indicates the number of times that n appeared, so we can create the dictionary if desired using enumerate:

>>> dict(enumerate(np.bincount(data)))
{0: 14, 1: 10, 2: 9, 3: 8, 4: 14, 5: 10, 6: 6, 7: 11, 8: 7, 9: 11}

Leo negao · Accepted Answer · 2023-08-02 12:26:53Z

1

That is an easy answer m8!

def equalizeArray(arr):
    # Counting the frequency of each element in the array
    freq = {}
    for i in arr:
        if i not in freq:
            freq[i] = 1
        else:
            freq[i] += 1
    # Finding the highest frequency
    max_freq = max(freq.values())
    # Reporting every element that occurs with the highest frequency
    for key,value in freq.items():
        if value == max_freq:
            print(key,"has been repeated:",value,"times")

edited Aug 2, 2023 at 12:26

user4136999

answered Jul 28, 2023 at 18:54

Leo negao

1 Comment

Community Over a year ago

Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.

Alez · Accepted Answer · 2023-12-07 12:11:40Z

First, create a list of elements to count:

elements = [1, 2, 3, 2, 1, 3, 2, 1, 1, 4, 5, 4, 4]

Make an empty dictionary (the same way you would start with an empty list):

counts = {}

Create a for loop that counts the occurrences of each key. If we have seen the element before, we add 1 to its count:

for element in elements:
    if element in counts:
        counts[element] += 1

If we have not seen the element before, the else branch adds it to the dictionary as a new key with a count of 1:

    else:
        counts[element] = 1

Now print the counts using items(), which yields the key-value pairs:

for element, count in counts.items():
    print(element, ":", count)

Here items() gives us each key-value pair in turn; the loop unpacks the key into element and its count into count.

CODE WITH NO COMMENTARY:

    elements = [1, 2, 3, 2, 1, 3, 2, 1, 1, 4, 5, 4, 4]
    counts = {}
    for element in elements:
        if element in counts:
            counts[element] += 1
        else:
            counts[element] = 1
    
    for element, count in counts.items():
        print(element, ":", count)

OUTPUT:
    1 : 4
    2 : 3
    3 : 2
    4 : 3
    5 : 1

Harry Rashid · Accepted Answer · 2023-02-07 20:28:49Z

-1

mylist = [1,2,1,5,1,1,6,'a','a','b']
result = {}
for i in mylist:
    result[i] = mylist.count(i)
print(result)

edited Feb 7, 2023 at 20:28

answered Feb 7, 2023 at 20:25

Harry Rashid

2 Comments

General Grievance Over a year ago

No, not a good idea. Runtime complexity is O(n^2) which pretty much defeats the point of using the dictionary in the first place. Same problem as this answer: stackoverflow.com/a/36284223

Stephen Ostermiller Over a year ago

A code-only answer is not high quality. While this code may be useful, you can improve it by saying why it works, how it works, when it should be used, and what its limitations are. Please edit your answer to include explanation and link to relevant documentation.
