I have data that looks like the table below.

   date        value  symbol  time        id
0  2014-01-01      0  A       2014-01-01  65
1  2014-01-02      1  B       2014-01-02  66
2  2014-01-03      2  A       2014-01-03  65
3  2014-01-04      3  B       2014-01-04  66
4  2014-01-05      5  A       2014-01-05  65
5  2014-01-05      4  A       2014-01-05  65
6  2014-01-06      6  B       2014-01-06  66

I'm trying to write a query that sorts on symbol and value, and then applies DISTINCT ON to date and id. As far as I understand, this isn't possible, because the DISTINCT ON expressions have to match the leading ORDER BY expressions.

What I want is the table above without row 4: the rows should be sorted by symbol and value ascending, and then, for each distinct (date, id) pair, only the first row in that sorted order should be kept. Is there a way to achieve this?

1 Answer

DISTINCT ON keeps one row for each distinct combination of the listed expressions: the first row of each group, as determined by the remaining fields in the ORDER BY clause:

SELECT DISTINCT ON ("date", "id") 
       "pk", "date", "value", "symbol", "time", "id"
FROM mytable
ORDER BY "date", "id", "value", "symbol"

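For every ("date", "id") slice, the query above picks the first row according to the order of "value", "symbol" within that slice. For 2014-01-05 and id 65 that is the row with value 4, so the row with "pk" = 4 is excluded. ("pk" here stands for the unlabeled index column in the question's table.)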
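To check this against the sample data, here is a minimal sketch for PostgreSQL. The temporary table definition is an assumption: the column types are guessed from the printed values, and "pk" is taken to be the unlabeled index column from the question.

```sql
-- Assumed schema: types are inferred from the sample output, not confirmed by the asker.
CREATE TEMP TABLE mytable (
    "pk"     integer PRIMARY KEY,
    "date"   date,
    "value"  integer,
    "symbol" text,
    "time"   timestamp,
    "id"     integer
);

INSERT INTO mytable VALUES
    (0, '2014-01-01', 0, 'A', '2014-01-01', 65),
    (1, '2014-01-02', 1, 'B', '2014-01-02', 66),
    (2, '2014-01-03', 2, 'A', '2014-01-03', 65),
    (3, '2014-01-04', 3, 'B', '2014-01-04', 66),
    (4, '2014-01-05', 5, 'A', '2014-01-05', 65),
    (5, '2014-01-05', 4, 'A', '2014-01-05', 65),
    (6, '2014-01-06', 6, 'B', '2014-01-06', 66);

-- One row per ("date", "id"); within each slice the lowest "value" wins.
SELECT DISTINCT ON ("date", "id")
       "pk", "date", "value", "symbol", "time", "id"
FROM mytable
ORDER BY "date", "id", "value", "symbol";
-- Returns the rows with "pk" 0, 1, 2, 3, 5, 6 (row 4 is dropped).
```

Rows 4 and 5 are the only ones that share a ("date", "id") pair, so they are the only place where the trailing ORDER BY columns decide anything.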

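If the final result should also come back sorted by symbol and value, as the question asks, one common pattern is to wrap the DISTINCT ON query in a subquery and sort again on the outside. This is a sketch that assumes the same mytable as in the query above; the alias deduped is just an illustrative name.

```sql
-- Inner query: deduplicate per ("date", "id"), keeping the lowest "value".
-- Outer query: re-sort the surviving rows for presentation.
SELECT *
FROM (
    SELECT DISTINCT ON ("date", "id")
           "pk", "date", "value", "symbol", "time", "id"
    FROM mytable
    ORDER BY "date", "id", "value", "symbol"
) AS deduped
ORDER BY "symbol", "value";
```

The outer ORDER BY is free of the DISTINCT ON restriction because the duplicates have already been removed by the time it runs.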
