I am using Netezza (based on PostgreSQL) and need to select all columns of a table for rows that are distinct on one column. A related question with an answer can be found here, but it doesn't cover the case of selecting all columns. Following that answer throws an error:

select distinct on (some_field) table1.* from table1 order by some_field;

Snippet from error with real data:

"(" (at char 77) expecting '')''

  • What error are you getting? The code might not do what you want but it shouldn't generate an error. Commented Jan 27, 2017 at 12:53
  • Can you add some sample table data and the expected result? (As well formatted text.) Commented Jan 27, 2017 at 12:53
  • How and where are you running that statement? Commented Jan 27, 2017 at 14:36
  • In fact I should have been clearer on this. I'm running this against Netezza, which is based on PostgreSQL. Commented Jan 28, 2017 at 3:42
  • I removed the postgresql tag as Netezza is sufficiently different to Postgres. Commented Jan 28, 2017 at 14:33

2 Answers

I don't think your code should throw an error in Postgres. However, it won't do what you expect without an order by:

select distinct on (some_field) table1.*
from table1
order by some_field;

4 Comments

That's misleading. ORDER BY is in no way necessary here.
@ErwinBrandstetter . . . Is that really true? I know it might seem to work without the ORDER BY, but when I read the documentation (postgresql.org/docs/9.6/static/sql-select.html#SQL-DISTINCT), I have always interpreted it that the ORDER BY is needed not only for ordering within a group but to define the groups. In particular, "The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s). "
Yes, that's really true. Leading DISTINCT ON and ORDER BY expressions must match - if both clauses are added. Groups for the distinct operation are defined by DISTINCT ON items exclusively. ORDER BY only affects the sort order of the result - and which rows are picked per group if more expressions are added to influence the order within groups.
@ErwinBrandstetter . . . Thank you for the clarification. The documentation (which is generally very good for Postgres) could be clearer on this point -- by pointing out that ORDER BY is optional.
The syntax of your query is correct for Postgres (as you stated at first). See:

You later clarified you actually work with Netezza, which is only loosely related to Postgres. Wikipedia states:

Netezza is based on PostgreSQL 7.2, but does not maintain compatibility.

Netezza does not seem to support DISTINCT ON (), only DISTINCT.
It supports row_number(), though. So this works:

SELECT *
FROM  (
   SELECT *, row_number() OVER (PARTITION BY some_field) AS rn
   FROM   table1
   ) sub
WHERE  rn = 1;

If, from each set with identical some_field, any row is good, you are done here. Else, implement your priority with ORDER BY in the OVER clause.
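As a rough sketch of that last point: assuming a hypothetical column updated_at (it is not part of the question; substitute whatever column expresses your priority), adding ORDER BY inside the OVER clause makes row_number() assign 1 to the preferred row in each group, here the most recent one:

```sql
-- updated_at is a hypothetical column used only for illustration.
-- Within each group of identical some_field values, rows are numbered
-- from the newest updated_at to the oldest, so rn = 1 is the newest row.
SELECT *
FROM  (
   SELECT *, row_number() OVER (PARTITION BY some_field
                                ORDER BY updated_at DESC) AS rn
   FROM   table1
   ) sub
WHERE  rn = 1;
```

If two rows in a group tie on updated_at, which one gets rn = 1 is still arbitrary; add further ORDER BY expressions to break ties. Note that the result also carries the extra rn column; list the columns explicitly in the outer SELECT if you don't want it.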
