2

I'm looking for a more efficient way to filter an array of arrays using another array in Ruby. Let me demonstrate, starting with this:

core = [[1, "apple", "James Bond"],
        [5, "orange", "Thor"],
        [10, "banana", "Wolverine"],
        [15, "orange", "Batman"],
        [20, "apple", "Mickey Mouse"],
        [25, "orange", "Lee Adama"],
        [30, "banana", "Luke Skywalker"]]

filter = ["apple", "banana"]

result = core.magical_function(filter)

# result == [[5, "orange", "Thor"],
#            [15, "orange", "Batman"],
#            [25, "orange", "Lee Adama"]]

The only thing I can think of is looping through the filter elements, but this slows down my code a lot when this toy example gets more complicated.

4
  • The most efficient way to filter an array of arrays is to not start with an array of arrays to begin with. A hash with your second column as keys would be the fastest way to process it. Can you construct such a hash instead? Commented Jul 25, 2017 at 16:38
  • What was your inefficient way? Commented Jul 25, 2017 at 16:47
  • @MarkThomas It would probably be more annoying to do it that way. This is a simplified example to demonstrate what I need. Converting to a hash would be really tough. Commented Jul 25, 2017 at 16:51
  • @sagarpandya82 I looped through each element of the filter criteria and rejected the array element if core[x][1] == i Commented Jul 25, 2017 at 16:51

3 Answers

7

Use Array#reject and Array#include?:

core.reject { |_,fruit,_| filter.include?(fruit) }
  # => [[5, "orange", "Thor"], [15, "orange", "Batman"], [25, "orange", "Lee Adama"]]

If filter is large, first convert it to a set for faster lookups:

require 'set'
filter_set = filter.to_set
core.reject { |_,fruit,_| filter_set.include?(fruit) }

See Set and its instance methods. When required, set adds the instance method to_set to the module Enumerable, which is included by Array. Ruby implements sets with hashes under the hood, which is why membership checks are fast.
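To get a rough feel for the difference, here is a minimal benchmark sketch. The data is made up for illustration (the names rows, terms and term_set and the sizes are assumptions, not from the question), and timings will vary by machine, so treat it as a way to measure rather than as a result.

```ruby
require 'set'
require 'benchmark'

# Hypothetical data: 10,000 rows and 1,000 filter terms.
rows = Array.new(10_000) { |i| [i, "fruit#{i % 2_000}", "name#{i}"] }
terms = Array.new(1_000) { |i| "fruit#{i}" }
term_set = terms.to_set

by_array = nil
by_set = nil

Benchmark.bm(6) do |bm|
  # Array#include? scans the filter linearly for every row.
  bm.report("array") { by_array = rows.reject { |_, fruit, _| terms.include?(fruit) } }
  # Set#include? is a hash lookup for every row.
  bm.report("set")   { by_set = rows.reject { |_, fruit, _| term_set.include?(fruit) } }
end

# Both approaches should keep exactly the same rows.
puts by_array == by_set
```

For a two-element filter like the one in the question, the plain array version is likely just as good; the set only starts to matter as filter grows.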


1 Comment

i always forget about set :(
3

You can change the filter array to a hash for faster lookups. Then the whole thing requires just a single pass through the core array:

filters = ["apple", "banana"].each_with_object({}) do |term, obj|
  obj[term] = true
end
# filters == {"apple" => true, "banana" => true}

filtered = core.reject do |array|
  filters[array[1]]
end
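For reference, the same idea can be run end to end against the data from the question. This is only a sketch: the name lookup, the use of Array#to_h with a block (which needs Ruby 2.6 or later) and the call to Hash#key? are choices made here, not part of the answer above.

```ruby
core = [[1, "apple", "James Bond"],
        [5, "orange", "Thor"],
        [10, "banana", "Wolverine"],
        [15, "orange", "Batman"],
        [20, "apple", "Mickey Mouse"],
        [25, "orange", "Lee Adama"],
        [30, "banana", "Luke Skywalker"]]

filter = ["apple", "banana"]

# Build {"apple" => true, "banana" => true} in one step (Ruby 2.6+).
lookup = filter.to_h { |term| [term, true] }

# Drop every row whose second column is a key of the lookup hash.
filtered = core.reject { |row| lookup.key?(row[1]) }

p filtered
# => [[5, "orange", "Thor"], [15, "orange", "Batman"], [25, "orange", "Lee Adama"]]
```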


0

Why don't you use the intersection operator & ?

core.select { |k| (k & filter).empty? }
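One thing to keep in mind is that & compares every column of a row with filter, not only the fruit column. A small sketch of how Array#& behaves on rows like these (the variable row is a made-up example, not from the question):

```ruby
core = [[1, "apple", "James Bond"],
        [5, "orange", "Thor"],
        [10, "banana", "Wolverine"]]

filter = ["apple", "banana"]

# Array#& returns the elements common to both arrays.
common = core.map { |row| row & filter }
p common
# => [["apple"], [], ["banana"]]

# Hypothetical row: the fruit is "orange", but the name column matches a filter term.
row = [40, "orange", "apple"]
p row & filter
# => ["apple"]
```

So the intersection is empty only for rows in which no column at all appears in filter, which may or may not be what you want.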
