
I have an array of hashes like

@data_records = [
                 {"user": "user1", "key1": "v1k1", ... , "keyN": "v1kN"},
                 {"user": "user2", "key1": "v2k1", ... , "keyN": "v2kN"},
                 {"user": "user3", "key1": "v3k1", ... , "keyN": "v3kN"},
                 {"user": "user1", "key1": "v4k1", ... , "keyN": "v4kN"},
                 {"user": "user1", "key1": "v5k1", ... , "keyN": "v5kN"},
                 {"user": "user4", "key1": "v6k1", ... , "keyN": "v6kN"},
                ]

As you can see, there may be many 'records' for the same user. In the example above, user1 has three.

From this array of hashes, I need to generate an array with a single entry for each user. That is, I need

[ "user1", "user2", "user3", "user4" ]

but not

[ "user1", "user2", "user3", "user1", "user1", "user4" ].

I wrote the following piece of code, which does the job:

def users_array
  arr = Array.new
  @data_records.each { |item| arr.push(item["user"]) if not arr.include?(item["user"])}
  arr
end

But it bothers me that I must use the auxiliary variable arr for this to work. I'm sure there is a shorter way to do this with the Array#map method. Since Array#map returns an array, it could be something like

def users_array
  @data_records.map { |item| item["user"] if ... }
end

The problem is that I don't know how to refer, inside the block, to the array that Array#map is building. I believe it could be something like

def users_array
  @data_records.map { |item| item["user"] if not this.include?(item["user"]) }
end

but it doesn't work, of course.

Can someone tell me if there is a way to do this?

EDIT

Yes, I could use Array#uniq to do this. So let me rephrase the question: is there a way to refer to the implicit array created by map inside map's block?

4
  • 4
    why not just @data_records.map { |item| item["user"] }.uniq? Commented Aug 29, 2016 at 23:39
  • 1
    yep @data_records.map { |e| e[:user] }.uniq! Commented Aug 29, 2016 at 23:42
  • Okay, thanks to both of you. But I must then rephrase my question: is there a way to refer to the implicit array being created by map inside its block? Commented Aug 29, 2016 at 23:47
  • 2
    not with map, no; but you could do so with reduce or inject: @data_records.inject([]) { |memo, item| memo << item["user"] unless memo.include? item["user"]; memo } Commented Aug 30, 2016 at 0:13

3 Answers

3

For me, the best way to do this is to use each_with_object with a Set instead of an array to collect the user names.

require 'set'

def users_array
  @data_records.each_with_object(Set.new) do |item, set|
    set << item[:user]
  end
end
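
A small usage sketch, assuming sample data shaped like the question's records with symbol keys. The `records` variable and the standalone `users` method are hypothetical stand-ins for `@data_records` and `users_array`. It shows that the result is a Set, and that `to_a` gives a plain array in first-seen order if the caller really needs one.

```ruby
require 'set'

# Hypothetical sample data, shaped like the records in the question.
records = [
  { user: "user1", key1: "v1k1" },
  { user: "user2", key1: "v2k1" },
  { user: "user1", key1: "v4k1" },
]

# Same technique as above, written as a standalone method for illustration.
def users(records)
  records.each_with_object(Set.new) { |item, set| set << item[:user] }
end

user_set  = users(records)  # a Set; duplicates are dropped on insertion
user_list = user_set.to_a   # a plain Array, in first-seen order
```

Ruby's Set keeps insertion order, so the conversion does not reorder the user names.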

3 Comments

The hash pairs are defined like "user": "user1", so "user" will actually become a symbol :user rather than a string 'user'.
@mwp You are absolutely right. I'll correct my answer.
It was a bug in the OP, so you can hardly be blamed. :)
2

Aetherus is the closest to answering your rephrased question and he should get all the credit for pointing out #each_with_object to get at the "implicit array." But here's something a little closer to what you're asking:

@data_records.each_with_object([]) do |item, this|
  this << item[:user] unless this.include?(item[:user])
end

I think using a Set:

Set.new(@data_records.map { |item| item[:user] })

or #uniq:

@data_records.map { |item| item[:user] }.uniq

will probably be faster and scale to a large number of items better, but I haven't benchmarked it.
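
For anyone who wants to check this, here is a rough benchmark sketch using the standard Benchmark module. The data set (10,000 records over 1,000 distinct users) and the lambda names are made up for illustration, and timings will vary by machine and Ruby version. The reasoning behind the guess is that `include?` scans the array linearly for every record, while Set and #uniq rely on hashing.

```ruby
require 'benchmark'
require 'set'

# Hypothetical data: 10,000 records spread over 1,000 distinct users.
records = Array.new(10_000) { |i| { user: "user#{i % 1_000}" } }

# Linear scan of the accumulator for every record.
include_based = lambda do
  records.each_with_object([]) do |item, acc|
    acc << item[:user] unless acc.include?(item[:user])
  end
end

# Hash-based de-duplication.
set_based  = -> { Set.new(records.map { |item| item[:user] }).to_a }
uniq_based = -> { records.map { |item| item[:user] }.uniq }

Benchmark.bm(10) do |x|
  x.report("include?") { include_based.call }
  x.report("Set")      { set_based.call }
  x.report("uniq")     { uniq_based.call }
end
```

All three produce the same list of users in first-seen order; only the cost of getting there differs.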

7 Comments

Using an array is part of my requirements, since this method passes its result to another method that expects an array. But, as you pointed out, he deserves the credit for #each_with_object. So I upvoted both answers, but I will accept yours, since it is closer to what I need.
@EddeAlmeida Most of the time when you say a method is waiting for an array, it is actually waiting for an Enumerable, unless that method accesses the elements by index. Both arrays and sets are Enumerables.
Maybe @data_records.map { |item| item[:user] }.tap(&:uniq!) is better because it saves some memory :) BTW, this is my favorite usage of tap.
@Aetherus I love that! I frequently want to use the modify-in-place operators but sometimes can't because they might return nil. This is a great technique that I am definitely going to steal. Thanks!
@Aetherus, why .tap(&:uniq!) rather than .tap(&:uniq)? Why not just .uniq? Also, uniq! returns nil if no change was made.
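
To make the return values discussed in these comments concrete, here is a minimal sketch. The sample arrays and variable names are hypothetical. It shows why `uniq!` alone is risky at the end of a chain, and why `tap(&:uniq)` without the bang does nothing useful.

```ruby
unique = ["user1", "user2"]            # no duplicates
dupes  = ["user1", "user2", "user1"]   # one duplicate

# uniq! returns nil when nothing was removed, the array otherwise.
bang_on_unique = unique.dup.uniq!
bang_on_dupes  = dupes.dup.uniq!

# tap always returns its receiver, so the in-place result is never lost.
tap_bang = unique.dup.tap(&:uniq!)

# Without the bang, uniq builds a new array that tap throws away,
# so the receiver comes back with its duplicate still in place.
tap_non_bang = dupes.dup.tap(&:uniq)
```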
1

Edit: I fear I may have misunderstood the question.

I will leave my original answer (below), should it be of interest to anyone.

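def combine(data, key)
  data.each_with_object({}) do |record, merged|
    # Wrap every value except the grouping key in a one-element array.
    wrapped = record.each_with_object({}) { |(k, v), acc| acc[k] = (k == key ? v : [v]) }
    # Merge records that share the same key value, concatenating their arrays.
    merged.update(wrapped[key] => wrapped) do |_, old, new|
      old.merge(new) { |k, old_v, new_v| k == key ? old_v : old_v + new_v }
    end
  end.values
end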

data_records = [
  {user: "user1", key1: "v1k1", keyN: "v1kN"},
  {user: "user2", key1: "v2k1", keyN: "v2kN"},
  {user: "user3", key1: "v3k1", keyN: "v3kN"},
  {user: "user1", key1: "v4k1", keyN: "v4kN"},
  {user: "user1", key1: "v5k1", keyN: "v5kN"},
  {user: "user4", key1: "v6k1", keyN: "v6kN"},
]

combine(data_records, :user)
  #=> [{:user=>"user1", :key1=>["v1k1", "v4k1", "v5k1"],
  #     :keyN=>["v1kN", "v4kN", "v5kN"]},
  #    {:user=>"user2", :key1=>["v2k1"], :keyN=>["v2kN"]},
  #    {:user=>"user3", :key1=>["v3k1"], :keyN=>["v3kN"]},
  #    {:user=>"user4", :key1=>["v6k1"], :keyN=>["v6kN"]}] 
