
I've been playing around with PostgreSQL's intarray module and I've noticed some seemingly odd behavior with intarrays that have duplicate elements in them. The intarray module defines two subtraction operators with pretty much the same description:

integer[] - integer → integer[]
Removes entries matching the right argument from the array.

integer[] - integer[] → integer[]
Removes elements of the right array from the left array.

To me, the wording above seems to imply that A[] - B[] is equivalent to FOR int i IN B[]: A[] -= i

Removing a single integer works as you'd expect, preserving duplicates:

SELECT '{3,1,1,2,2,2}'::int[] - 1;  -- {3,2,2,2}   (As expected)

Whereas removing an array removes all duplicates regardless:

SELECT '{3,1,1,2,2,2}'::int[] - '{1}'::int[];  -- {2,3}   (Where's my extra twos???)

As of PG16, the intarray documentation doesn't seem to mention anything about the automatic removal of ALL duplicated entries when computing the difference between two integer arrays (suggesting that it uses set operations under the hood - which is fine, but shouldn't it say that?). So...

Is this the intended behavior when you subtract one integer array from another in PostgreSQL?

*For what I was doing, there was useful information stored in the number of duplicated elements in each array. While I was able to find a workaround for my application, it was still a surprise and took me a few minutes to figure out what was happening...

If it helps, I am running PostgreSQL 16.2 on Debian 12 (Specifically the official PostgreSQL Docker build of: PostgreSQL 16.2 (Debian 16.2-1.pgdg120+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit)


EDIT: It seems this (set operation) behavior is correct, and the PostgreSQL team have proposed an update to the documentation.

  • Good point. You might want to fill in the form at the bottom of the documentation page to suggest the correction. As to whether this was intended or not, maybe there are comments in the source code that could indicate it. My expectation would also be that it should not sort and deduplicate as a silent bonus effect of either subtraction, especially since additions do not have that effect. Commented Mar 17, 2024 at 3:57
  • It is there in 9.3-16.0: demo Commented Mar 17, 2024 at 9:17
  • I've sent the Postgres team a message; if I get a response, I'll add it to the post. Commented Mar 18, 2024 at 8:22
  • Actually, the set in the name of the int[] - int[] operator intset_subtract hints that this behaviour is in fact intended. The + is called intarray_push_array, which explains why it doesn't turn the array into a set. Commented Mar 18, 2024 at 18:45
  • @Zegarek My fault. I do have an anyarray - anyarray operator, defined a long time ago and long forgotten. Sorry for the confusion. Commented Mar 18, 2024 at 20:26

2 Answers


Yes, that is the intended behavior. At least, it has been like that since the introduction of the intarray extension. But I agree that some additional documentation wouldn't harm.
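
The set-like behavior can be reproduced by hand. The following is a minimal sketch, assuming the intarray extension can be installed in the current database; sort() and uniq() are intarray functions, while array_remove() is a core PostgreSQL function that needs no extension. The expected results in the comments follow from the outputs reported in the question.

```sql
-- Assumption: you have the privileges to install intarray in this database.
CREATE EXTENSION IF NOT EXISTS intarray;

-- int[] - int[] treats the left operand as a set: sorted, duplicates dropped.
SELECT '{3,1,1,2,2,2}'::int[] - '{1}'::int[];      -- {2,3}

-- Deduplicating explicitly first, then removing one integer, gives the same result.
SELECT uniq(sort('{3,1,1,2,2,2}'::int[])) - 1;     -- {2,3}

-- Core array_remove() keeps the original order and the remaining duplicates.
SELECT array_remove('{3,1,1,2,2,2}'::int[], 1);    -- {3,2,2,2}
```

If duplicate counts carry meaning, as they did for the asker, array_remove() or the int[] - integer operator are the safer choices.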


If you often need a 'plain behaviour' array subtraction operator, you can define a custom one:

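create or replace function array_diff(anyarray, anyarray)
returns anyarray language sql immutable as
$$
  -- Keep every element of $1 that does not occur in $2,
  -- preserving the original order and any duplicates.
  select array_agg(item order by ord)
  from unnest($1) with ordinality as t(item, ord)
  where item <> all($2);
$$;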

create operator #- 
(
 function = array_diff,
 leftarg = anyarray,
 rightarg = anyarray
);
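
A short usage sketch, assuming the array_diff function and the #- operator above have already been created in the current database. It also shows one edge case worth knowing: array_agg over zero rows returns NULL, so removing every element yields NULL rather than an empty array.

```sql
-- Order and duplicates of the left operand are preserved.
SELECT '{3,1,1,2,2,2}'::int[] #- '{1}'::int[];      -- {3,2,2,2}
SELECT '{3,1,1,2,2,2}'::int[] #- '{1,3}'::int[];    -- {2,2,2}

-- Removing everything gives NULL, not '{}'.
SELECT ('{1,1}'::int[] #- '{1}'::int[]) IS NULL;    -- true
```

If an empty array is preferred in that case, wrapping the aggregate in coalesce() inside array_diff is one option.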

4 Comments

This explains the confusion in the comments on the question, but it doesn't appear to answer the actual question
@Bergi Yes, this was my intent - to try to explain the confusion.
I think the last comment was sufficient for that. You could also delete all the other comments. Either way, this should not be posted as an answer.
I think it’s salvageable if edited to clearly propose defining one’s own custom operator that doesn’t do the surprise sort+uniq, assuming OP isn’t already doing exactly that as the workaround they mentioned (I was able to find a workaround for my application) and that they are interested in solving X while asking about Y.
