0

Given an array of hashes like the one below, I need to group the hashes by the value of one key:

array_of_hashes = [{type: "test1", value: 1}, {type: "test1", value: 1}, {type: "test2", value: 1}, {type: "test2", value: 1}]

I would like to achieve something like this:

array_of_hashes = [{"test1" => [{type: "test1", value: 1}, {type: "test1", value: 1}]}, {"test2" => [{type: "test2", value: 1}, {type: "test2", value: 1}]}]

What is the most efficient way to transform this array? It could contain around 2,000,000 elements.

2 Answers

2

You can use group_by:

array_of_hashes.group_by { |h| h[:type] }
# => {"test1"=>[{:type=>"test1", :value=>1}, {:type=>"test1", :value=>1}], "test2"=>[{:type=>"test2", :value=>1}, {:type=>"test2", :value=>1}]} 
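
Note that group_by returns a single hash, whereas the output sketched in the question looks like an array of one-key hashes. If that shape is really what is needed, one possible follow-up step is to map over the grouped hash. This is only a sketch; the names grouped and as_array are made up for illustration:

```ruby
array_of_hashes = [
  { type: "test1", value: 1 }, { type: "test1", value: 1 },
  { type: "test2", value: 1 }, { type: "test2", value: 1 }
]

# One hash keyed by :type; each value is the array of matching hashes.
grouped = array_of_hashes.group_by { |h| h[:type] }

# Optional: wrap each key/group pair in its own single-key hash,
# which gives an array of hashes instead of one hash.
as_array = grouped.map { |type, group| { type => group } }
```

In most cases the single hash is easier to work with, since a group can be looked up directly with grouped["test1"].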

0

If, as in the example, the hashes are ordered by the value of :type, you might find that Enumerable#chunk (then converted to a hash) is faster than group_by:

Hash[array_of_hashes.chunk { |h| h[:type] }.to_a]
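
The ordering condition matters: chunk only groups consecutive elements, so if the same :type shows up in two separate runs, converting to a hash keeps only the last run for that key. A small sketch with made-up data (unsorted, chunked and grouped are illustrative names) shows the difference:

```ruby
# The "a" entries are not adjacent here.
unsorted = [
  { type: "a", value: 1 },
  { type: "b", value: 2 },
  { type: "a", value: 3 }
]

# chunk produces three runs ("a", "b", "a"); Hash[] then overwrites
# the first "a" run with the second one.
chunked = Hash[unsorted.chunk { |h| h[:type] }.to_a]

# group_by collects every "a" element regardless of position.
grouped = unsorted.group_by { |h| h[:type] }

puts chunked["a"].size  # 1
puts grouped["a"].size  # 2
```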

I suggest you compare them. Edit: I tested. As shown below, group_by appears to be faster.

def mk_arr(n,m)
  n.times.with_object([]) { |i,a| m.times { a << { type: "test#{i}", value: 1 } } }
end

mk_arr(3,2)
  #=> [{:type=>"test0", :value=>1}, {:type=>"test0", :value=>1},
  #    {:type=>"test1", :value=>1}, {:type=>"test1", :value=>1},
  #    {:type=>"test2", :value=>1}, {:type=>"test2", :value=>1}] 

require 'benchmark'  
def bench_em(n,m)
  arr = mk_arr(n,m)
  puts "n = #{n}, m = #{m}"
  print "group_by: "
  puts Benchmark.measure { arr.group_by { |h| h[:type] } }     
  print "chunk   : "
  puts Benchmark.measure { Hash[arr.chunk { |h| h[:type] }.to_a] }
  puts
end

              user     system      total        real
n = 100000, m = 2
group_by:   0.090000   0.000000   0.090000 (  0.095301)
chunk   :   0.160000   0.000000   0.160000 (  0.166608)

n = 100000, m = 4
group_by:   0.250000   0.010000   0.260000 (  0.252951)
chunk   :   0.260000   0.000000   0.260000 (  0.261747)

n = 100000, m = 8
group_by:   0.370000   0.010000   0.380000 (  0.386423)
chunk   :   0.450000   0.000000   0.450000 (  0.447134)

n = 1000000, m = 4
group_by:   2.740000   0.040000   2.780000 (  2.775030)
chunk   :   3.690000   0.070000   3.760000 (  3.790305)
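
The timings are only comparable if both expressions build the same hash. For sorted input such as mk_arr produces, that can be checked with a short sketch like the following (mk_arr is repeated so the snippet runs on its own; the sizes are arbitrary):

```ruby
def mk_arr(n, m)
  n.times.with_object([]) { |i, a| m.times { a << { type: "test#{i}", value: 1 } } }
end

arr = mk_arr(1_000, 4)

by_group = arr.group_by { |h| h[:type] }
by_chunk = Hash[arr.chunk { |h| h[:type] }.to_a]

# Both approaches should agree when equal :type values are adjacent.
same = (by_group == by_chunk)
puts same
```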
