1

I want to compare two arrays, ignoring the order of the elements and duplicates.

I could do a containment check in both directions, but are there better/faster solutions?

select * from my_table where my_table.a1 @> my_table.a2 and
                             my_table.a2 @> my_table.a1
1
  • 1
    If it is one dimensional array of integers with no nulls inside then you can use intarray to sort its contents, remove duplicates and then compare. I don't know if this will be faster tho. Alternative way is to unnest and work with set, this could be faster or slower, depending on your data, conditions and indexes. Commented Apr 13, 2017 at 7:47

2 Answers 2

3

From my internal testing, it seems that these two variants are the fastest (even faster than the @> + <@ check), and they both can handle NULLs too:

where (select array_agg(distinct e order by e) from unnest(arr1) e)
    = (select array_agg(distinct e order by e) from unnest(arr2) e)

where array(select distinct e from unnest(arr1) e order by e)
    = array(select distinct e from unnest(arr2) e order by e)

The latter is usually a little faster, but sometimes not. This can depend on a lot of things, f.ex. the cardinality of the arrays, etc.

Sign up to request clarification or add additional context in comments.

Comments

1

using unnest and distinct example:

t=# create or replace function so62(a1 anyarray,a2 anyarray) returns boolean as
$$
declare
 _r boolean := false;
 _a text;
begin
  with p as (select distinct unnest(a1) order by 1) select array_agg(unnest) into _a from p;
  with p as (select distinct unnest(a2) order by 1) select array_agg(unnest)::text = _a into _r from p;
  return _r;
end;
$$ language plpgsql;
CREATE FUNCTION

(function updated following @poz notice that it would require two except for comparison)

test:

t=# with c as (
  select '{1,null,2,2}'::int[] a1,'{2,1,null}'::int[] a2
) 
select a1 @> a2, a2 @> a1,so62(a1,a2) 
from c;
 ?column? | ?column? | so62
----------+----------+------
 f        | f        | t
(1 row)

in this example @> and <@ do not work at all. Also read Compare arrays for equality, ignoring order of elements

3 Comments

You'll also need to do the except both ways. You're technically right, if the array can contain NULL values, this is one of the few possible way to do it, but if it does not contain any NULLs, this can also be a huge performance killer.
yes, it is - but I assumed nulls from the question, my logic: if not nulls not mentioned, then there can be nulls
@pozs thanks for your notice! to avoid except twice casting to text

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.