How to count identical string elements in a Ruby array

Question

I have the following Array = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]

How do I produce a count for each identical element?

Where:
"Jason" = 2, "Judah" = 3, "Allison" = 1, "Teresa" = 1, "Michelle" = 1?

or produce a hash Where:

Where: hash = { "Jason" => 2, "Judah" => 3, "Allison" => 1, "Teresa" => 1, "Michelle" => 1 }

As of Ruby 2.7 you can use Enumerable#tally. More info here. — SRack
– SRack, Commented Jun 7, 2019 at 11:37

Tom Lord · Accepted Answer · 2023-09-26 10:58:31Z

204

Ruby v2.7+ (latest)

As of ruby v2.7.0 (released December 2019), the core language now includes Enumerable#tally - a new method, designed specifically for this problem:

names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]

names.tally
#=> {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}

Ruby v2.4+ (EOL ruby version, but still valid code on newer versions)

The following code was not possible in standard ruby when this question was first asked (February 2011), as it uses:

Object#itself, which was added to Ruby v2.2.0 (released December 2014).
Hash#transform_values, which was added to Ruby v2.4.0 (released December 2016).

These modern additions to Ruby enable the following implementation:

names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]

names.group_by(&:itself).transform_values(&:count)
#=> {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}

Ruby v2.2+ (EOL ruby version, but still valid code on newer versions)

For even older ruby versions, without access to the above mentioned Hash#transform_values method, you could instead use Array#to_h, which was added to Ruby v2.1.0 (released December 2013):

names.group_by(&:itself).map { |k,v| [k, v.length] }.to_h
#=> {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}

For even older ruby versions (<= 2.1), there are several ways to solve this, but (in my opinion) there is no clear-cut "best" way. See the other answers to this post.

edited Sep 26, 2023 at 10:58

answered Jan 1, 2018 at 22:24

Tom Lord

28.4k5 gold badges54 silver badges86 bronze badges

Sign up to request clarification or add additional context in comments.

4 Comments

Sagar Pandya Over a year ago

I was about to post :P. Is there any discernible difference between using count instead of size/length?

Tom Lord Over a year ago

@SagarPandya No, there is no difference. Unlike Array#size and Array#length, Array#count can take an optional argument or block; but if used with neither then its implementation is identical. More specifically, all three methods call LONG2NUM(RARRAY_LEN(ary)) under the hood: count / length

Abram Over a year ago

Extra credit! Sort by count .group_by(&:itself).transform_values(&:count).sort_by{|k, v| v}.reverse

Sony Santos Over a year ago

@Abram you can sort_by{ |k, v| -v}, no reverse needed! ;-)

Mauricio · Accepted Answer · 2011-02-26 16:55:13Z

133

names.inject(Hash.new(0)) { |total, e| total[e] += 1 ;total}

gives you

{"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}

answered Feb 26, 2011 at 16:55

Mauricio

5,8602 gold badges30 silver badges34 bronze badges

4 Comments

user166390 Over a year ago

+1 Like the selected answer, but I prefer the use of inject and no "external" variable.

mfilej Over a year ago

If you use each_with_object instead of inject you don't have to return (;total) at the block.

Gon Zifroni Over a year ago

For posterity, this is what @mfilej means: array.each_with_object(Hash.new(0)){|string, hash| hash[string] += 1}

Hallgeir Wilhelmsen Over a year ago

From Ruby 2.7, you can simply do: names.tally.

Dylan Markow · Accepted Answer · 2011-02-26 16:44:40Z

95

names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
counts = Hash.new(0)
names.each { |name| counts[name] += 1 }
# => {"Jason" => 2, "Teresa" => 1, ....

answered Feb 26, 2011 at 16:44

Dylan Markow

125k27 gold badges287 silver badges202 bronze badges

Comments

notapatch · Accepted Answer · 2016-04-14 18:18:02Z

28

Now using Ruby 2.2.0 you can leverage the itself method.

names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
counts = {}
names.group_by(&:itself).each { |k,v| counts[k] = v.length }
# counts > {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}

edited Apr 14, 2016 at 18:18

notapatch

7,3136 gold badges44 silver badges54 bronze badges

answered Jan 2, 2016 at 12:02

Ahmed Fahmy

4865 silver badges10 bronze badges

3 Comments

Andy Day Over a year ago

Agree, but I slightly prefer names.group_by(&:itself).map{|k,v| [k, v.count]}.to_h so that you don't have to ever declare a hash object

Tom Lord Over a year ago

@andrewkday Taking this one step further, ruby v2.4 added the method: Hash#transform_values which allows us to simplify your code even more: names.group_by(&:itself).transform_values(&:count)

Tom Lord Over a year ago

Also, this is a very subtle point (which is likely no longer relevant to future readers!), but note that your code also uses Array#to_h - which was added to Ruby v2.1.0 (released December 2013 - i.e. almost 3 years after the original question was asked!)

Tomerikoo · Accepted Answer · 2021-12-03 11:35:10Z

26

Ruby 2.7+

Ruby 2.7 is introducing Enumerable#tally for this exact purpose. There's a good summary here.

In this use case:

array.tally
# => { "Jason" => 2, "Judah" => 3, "Allison" => 1, "Teresa" => 1, "Michelle" => 1 }

Docs on the features being released are here.

edited Dec 3, 2021 at 11:35

Tomerikoo

19.6k16 gold badges57 silver badges68 bronze badges

answered Jun 7, 2019 at 11:36

SRack

12.4k7 gold badges55 silver badges66 bronze badges

1 Comment

tadman Over a year ago

Fantastic news!

Jörg W Mittag · Accepted Answer · 2011-02-26 17:51:36Z

17

There's actually a data structure which does this: MultiSet.

Unfortunately, there is no MultiSet implementation in the Ruby core library or standard library, but there are a couple of implementations floating around the web.

This is a great example of how the choice of a data structure can simplify an algorithm. In fact, in this particular example, the algorithm even completely goes away. It's literally just:

Multiset.new(*names)

And that's it. Example, using https://GitHub.Com/Josh/Multimap/:

require 'multiset'

names = %w[Jason Jason Teresa Judah Michelle Judah Judah Allison]

histogram = Multiset.new(*names)
# => #<Multiset: {"Jason", "Jason", "Teresa", "Judah", "Judah", "Judah", "Michelle", "Allison"}>

histogram.multiplicity('Judah')
# => 3

Example, using http://maraigue.hhiro.net/multiset/index-en.php:

require 'multiset'

names = %w[Jason Jason Teresa Judah Michelle Judah Judah Allison]

histogram = Multiset[*names]
# => #<Multiset:#2 'Jason', #1 'Teresa', #3 'Judah', #1 'Michelle', #1 'Allison'>

answered Feb 26, 2011 at 17:51

Jörg W Mittag

371k79 gold badges457 silver badges666 bronze badges

4 Comments

Andrew Grimm Over a year ago

Does the MultiSet concept originate from mathematics, or another programming language?

Jörg W Mittag Over a year ago

@Andrew Grimm: Both he word "multiset" (de Bruijn, 1970s) and the concept (Dedekind 1888) originated in mathematics. Multiset is governed by strict mathematical rules and supports the typical set operations (union, intersection, complement, ...) in a way that is mostly consistent with the axioms, laws and theorems of "normal" mathematical set theory, although some important laws do not hold when you try to generalize them to multisets. But that's way beyond my understanding of the matter. I use them as a programming data structure, not as a mathematical concept.

Tom Lord Over a year ago

To expand a little on that point: "... in a way that is mostly consistent with the axioms ...": "Normal" sets are usually formally defined by a set of axioms (assumptions) called "Zermelo-Frankel set theory". However, one of these axioms: the axiom of extensionality states that a set is defined precisely by its members - e.g. {A, A, B} = {A, B}. This is clearly a violation of the very definition of multi-sets!

Tom Lord Over a year ago

...However, without going into too much detail (as this is a software forum, not advanced maths!), one can formally define multi-sets mathematically via axioms for Crisp sets, the Peano axioms and other MultiSet-specific axioms.

Lev Lukomskyi · Accepted Answer · 2018-07-24 14:03:29Z

14

Enumberable#each_with_object saves you from returning the final hash.

names.each_with_object(Hash.new(0)) { |name, hash| hash[name] += 1 }

Returns:

=> {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}

edited Jul 24, 2018 at 14:03

Lev Lukomskyi

6,7574 gold badges39 silver badges29 bronze badges

answered Apr 14, 2015 at 2:37

Anconia

4,0686 gold badges40 silver badges66 bronze badges

1 Comment

Lev Lukomskyi Over a year ago

Agree, each_with_object variant is more readable to me than inject

Shreyas · Accepted Answer · 2011-02-27 06:29:05Z

6

This works.

arr = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
result = {}
arr.uniq.each{|element| result[element] = arr.count(element)}

edited Feb 27, 2011 at 6:29

answered Feb 26, 2011 at 16:55

Shreyas

8,7777 gold badges46 silver badges56 bronze badges

2 Comments

user166390 Over a year ago

+1 For a different approach -- although this has worse theoretical complexity -- O(n^2) (which will matter for some values of n) and does extra work (it has to count for "Judah" 3x, for instance)!. I would also suggest each instead of map (the map result is being discarded)

Shreyas Over a year ago

Thanks for that! I've changed the map to each.Also, I've uniq'ed the array before going through it. Maybe now the complexity issue is solved?

Andrew Grimm · Accepted Answer · 2011-02-27 22:22:44Z

6

The following is a slightly more functional programming style:

array_with_lower_case_a = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
hash_grouped_by_name = array_with_lower_case_a.group_by {|name| name}
hash_grouped_by_name.map{|name, names| [name, names.length]}
=> [["Jason", 2], ["Teresa", 1], ["Judah", 3], ["Michelle", 1], ["Allison", 1]]

One advantage of group_by is that you can use it to group equivalent but not exactly identical items:

another_array_with_lower_case_a = ["Jason", "jason", "Teresa", "Judah", "Michelle", "Judah Ben-Hur", "JUDAH", "Allison"]
hash_grouped_by_first_name = another_array_with_lower_case_a.group_by {|name| name.split(" ").first.capitalize}
hash_grouped_by_first_name.map{|first_name, names| [first_name, names.length]}
=> [["Jason", 2], ["Teresa", 1], ["Judah", 3], ["Michelle", 1], ["Allison", 1]]

answered Feb 27, 2011 at 22:22

Andrew Grimm

82.3k58 gold badges210 silver badges355 bronze badges

1 Comment

tokland Over a year ago

Did I hear functional programming? +1 :-) This is definitely the best way, although it can be argued that is not memory-efficient. Notice also that Facets has a Enumerable#frequency.

Arup Rakshit · Accepted Answer · 2013-06-18 20:28:18Z

5

names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
Hash[names.group_by{|i| i }.map{|k,v| [k,v.size]}]
# => {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}

answered Jun 18, 2013 at 20:28

Arup Rakshit

119k30 gold badges270 silver badges328 bronze badges

Comments

narzero · Accepted Answer · 2014-08-14 12:02:28Z

5

a = [1, 2, 3, 2, 5, 6, 7, 5, 5]
a.each_with_object(Hash.new(0)) { |o, h| h[o] += 1 }

# => {1=>1, 2=>2, 3=>1, 5=>3, 6=>1, 7=>1}

Credit Frank Wambutt

answered Aug 14, 2014 at 12:02

narzero

2,3095 gold badges42 silver badges78 bronze badges

Comments

Sami Birnbaum · Accepted Answer · 2017-11-21 15:51:40Z

2

Lots of great implementations here.

But as a beginner I would consider this the easiest to read and implement

names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]

name_frequency_hash = {}

names.each do |name|
  count = names.count(name)
  name_frequency_hash[name] = count  
end
#=> {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}

The steps we took:

we created the hash
we looped over the names array
we counted how many times each name appeared in the names array
we created a key using the name and a value using the count

It may be slightly more verbose (and performance wise you will be doing some unnecessary work with overriding keys), but in my opinion easier to read and understand for what you want to achieve

answered Nov 21, 2017 at 15:51

Sami Birnbaum

86110 silver badges22 bronze badges

3 Comments

Tom Lord Over a year ago

I don't see how that's any easier to read than the accepted answer, and it's clearly a worse design (doing lots of unnecessary work).

Sami Birnbaum Over a year ago

@Tom Lord - I do agree with you on performance (I even mentioned that in my answer) - but as a beginner trying to understand the actual code and steps required, I find it helps to be more verbose and then one can refactor to improve performance and make code more declarative

Adamantish Over a year ago

I agree somewhat with @SamiBirnbaum. This is the only one that uses almost no special ruby knowledge like Hash.new(0). The closest to pseudocode. That can be a good thing for readability but also doing unnecessary work can harm readability for readers who notice it because in more complex cases they will spend a little time thinking they're going crazy trying to figure out why it's done.

Tomerikoo · Accepted Answer · 2021-12-03 11:34:25Z

2

With ruby 2.6 you can do:

names.to_h{ |name| [name, names.count(name)] }

gives you:

{"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}

edited Dec 3, 2021 at 11:34

Tomerikoo

19.6k16 gold badges57 silver badges68 bronze badges

answered Apr 25, 2021 at 13:38

Akash Kinwad

8152 gold badges8 silver badges25 bronze badges

Comments

Andrew Grimm · Accepted Answer · 2011-02-27 22:09:01Z

This is more a comment than an answer, but a comment wouldn't do it justice. If you do Array = foo, you crash at least one implementation of IRB:

C:\Documents and Settings\a.grimm>irb
irb(main):001:0> Array = nil
(irb):1: warning: already initialized constant Array
=> nil
C:/Ruby19/lib/ruby/site_ruby/1.9.1/rbreadline.rb:3177:in `rl_redisplay': undefined method `new' for nil:NilClass (NoMethodError)
        from C:/Ruby19/lib/ruby/site_ruby/1.9.1/rbreadline.rb:3873:in `readline_internal_setup'
        from C:/Ruby19/lib/ruby/site_ruby/1.9.1/rbreadline.rb:4704:in `readline_internal'
        from C:/Ruby19/lib/ruby/site_ruby/1.9.1/rbreadline.rb:4727:in `readline'
        from C:/Ruby19/lib/ruby/site_ruby/1.9.1/readline.rb:40:in `readline'
        from C:/Ruby19/lib/ruby/1.9.1/irb/input-method.rb:115:in `gets'
        from C:/Ruby19/lib/ruby/1.9.1/irb.rb:139:in `block (2 levels) in eval_input'
        from C:/Ruby19/lib/ruby/1.9.1/irb.rb:271:in `signal_status'
        from C:/Ruby19/lib/ruby/1.9.1/irb.rb:138:in `block in eval_input'
        from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:189:in `call'
        from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:189:in `buf_input'
        from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:103:in `getc'
        from C:/Ruby19/lib/ruby/1.9.1/irb/slex.rb:205:in `match_io'
        from C:/Ruby19/lib/ruby/1.9.1/irb/slex.rb:75:in `match'
        from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:287:in `token'
        from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:263:in `lex'
        from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:234:in `block (2 levels) in each_top_level_statement'
        from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:230:in `loop'
        from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:230:in `block in each_top_level_statement'
        from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:229:in `catch'
        from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:229:in `each_top_level_statement'
        from C:/Ruby19/lib/ruby/1.9.1/irb.rb:153:in `eval_input'
        from C:/Ruby19/lib/ruby/1.9.1/irb.rb:70:in `block in start'
        from C:/Ruby19/lib/ruby/1.9.1/irb.rb:69:in `catch'
        from C:/Ruby19/lib/ruby/1.9.1/irb.rb:69:in `start'
        from C:/Ruby19/bin/irb:12:in `<main>'

C:\Documents and Settings\a.grimm>

That's because Array is a class.

jszobody · Accepted Answer · 2013-12-05 22:24:38Z

1

arr = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]

arr.uniq.inject({}) {|a, e| a.merge({e => arr.count(e)})}

Time elapsed 0.028 milliseconds

interestingly, stupidgeek's implementation benchmarked:

Time elapsed 0.041 milliseconds

and the winning answer:

Time elapsed 0.011 milliseconds

:)

edited Dec 5, 2013 at 22:24

jszobody

29k6 gold badges66 silver badges76 bronze badges

answered Dec 5, 2013 at 22:05

Alex Moore-Niemi

3,5022 gold badges28 silver badges26 bronze badges

Collectives™ on Stack Overflow

How to count identical string elements in a Ruby array

15 Answers 15

Ruby v2.7+ (latest)

Ruby v2.4+ (EOL ruby version, but still valid code on newer versions)

Ruby v2.2+ (EOL ruby version, but still valid code on newer versions)

4 Comments

4 Comments

Comments

3 Comments

1 Comment

4 Comments

1 Comment

2 Comments

1 Comment

Comments

Comments

3 Comments

Comments

Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

15 Answers 15

Ruby v2.7+ (latest)

Ruby v2.4+ (EOL ruby version, but still valid code on newer versions)

Ruby v2.2+ (EOL ruby version, but still valid code on newer versions)

4 Comments

4 Comments

Comments

3 Comments

1 Comment

4 Comments

1 Comment

2 Comments

1 Comment

Comments

Comments

3 Comments

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related