If, as in the example, the hashes are ordered by the value of :type, you might find that Enumerable#chunk (then converted to a hash) is faster than group_by:
Hash[array_of_hashes.chunk { |h| h[:type] }.to_a]
I suggest you compare them. Edit: I tested. As shown below, group_by appears to be faster.
# Build an array of n*m hashes, ordered by :type: m consecutive hashes
# for each of the n values "test0", "test1", ...
def mk_arr(n,m)
  n.times.with_object([]) { |i,a| m.times { a << { type: "test#{i}", value: 1 } } }
end
mk_arr(3,2)
#=> [{:type=>"test0", :value=>1}, {:type=>"test0", :value=>1},
# {:type=>"test1", :value=>1}, {:type=>"test1", :value=>1},
# {:type=>"test2", :value=>1}, {:type=>"test2", :value=>1}]
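Since mk_arr builds the hashes already ordered by :type, chunk groups exactly the elements group_by would, so the two expressions return the same hash. A quick sanity check on the small array above:

arr = mk_arr(3,2)
arr.group_by { |h| h[:type] } == Hash[arr.chunk { |h| h[:type] }.to_a]
  #=> true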
require 'benchmark'

# Time group_by vs chunk-then-Hash on an array of n*m hashes
def bench_em(n,m)
  arr = mk_arr(n,m)
  puts "n = #{n}, m = #{m}"
  print "group_by: "
  puts Benchmark.measure { arr.group_by { |h| h[:type] } }
  print "chunk   : "
  puts Benchmark.measure { Hash[arr.chunk { |h| h[:type] }.to_a] }
  puts
end
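The results below were produced by running the benchmark for each of the four cases:

bench_em(100000,2)
bench_em(100000,4)
bench_em(100000,8)
bench_em(1000000,4)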
                user     system      total        real
n = 100000, m = 2
group_by:   0.090000   0.000000   0.090000 (  0.095301)
chunk   :   0.160000   0.000000   0.160000 (  0.166608)

n = 100000, m = 4
group_by:   0.250000   0.010000   0.260000 (  0.252951)
chunk   :   0.260000   0.000000   0.260000 (  0.261747)

n = 100000, m = 8
group_by:   0.370000   0.010000   0.380000 (  0.386423)
chunk   :   0.450000   0.000000   0.450000 (  0.447134)

n = 1000000, m = 4
group_by:   2.740000   0.040000   2.780000 (  2.775030)
chunk   :   3.690000   0.070000   3.760000 (  3.790305)