PostgreSQL Window Function ordering

Question

I've got this table (PostgreSQL 9.3):

x | y  
- | -  
1 | 1  
2 | 2  
2 | 3  
1 | 4

Now I am trying to get three partitions out of it: every time the value of x changes (when ordering by y), a new dense_rank value should be assigned. I tried the following:

SELECT x, y, dense_rank() OVER (PARTITION BY x ORDER BY y)
FROM tab

But with the partition over x the ordering is not working the way I expect. The result is

x  y  dense_rank  
-  -  ----------  
1  1  1     
1  4  2     
2  2  1     
2  3  2

instead of the expected:

x  y  dense_rank  
-  -  ----------  
1  1  1     
2  2  2     
2  3  2     
1  4  3

Now I am not sure why the window is not ordered by y.

In the second step I need this rank for a grouping (GROUP BY dense_rank, x). So in the end I need the following result:

x  y      dense_rank  
-  -      ----------  
1  1      1     
2  {2,3}  2         
1  4      3

Maybe this could be achieved in an easier way?

When you partition by x and order by y within a window function, Postgres has to sort the rows by x, y. That is why your output is ordered the way it is (x=1, y=4 comes before x=2, y=2).
– Kamil Gosciminski, Commented Feb 9, 2016 at 16:50
You may be interested in this answer to "Solving “Gaps and Islands” with row_number() and dense_rank()?"
– Evan Carroll, Commented Mar 14, 2017 at 19:00

Community · Accepted Answer · 2020-06-20 09:12:55Z


"partition over x the ordering is not working the way I expect"

It is working perfectly fine. When you partition by x, the first row with x = 1 and the last row with x = 1 fall into the same partition, regardless of their y values.

Window Functions:

The PARTITION BY list within OVER specifies dividing the rows into groups, or partitions, that share the same values of the PARTITION BY expression(s). For each row, the window function is computed across the rows that fall into the same partition as the current row.

To get the result you want, you could use the following (a classic example of the gaps-and-islands problem):

SELECT *, ROW_NUMBER() OVER (ORDER BY y) -
          ROW_NUMBER() OVER (PARTITION BY x ORDER BY y) + 1 AS group_id
FROM tab
ORDER BY group_id


Output:

╔═══╦═══╦══════════╗
║ x ║ y ║ group_id ║
╠═══╬═══╬══════════╣
║ 1 ║ 1 ║        1 ║
║ 2 ║ 2 ║        2 ║
║ 2 ║ 3 ║        2 ║
║ 1 ║ 4 ║        3 ║
╚═══╩═══╩══════════╝
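To see why the subtraction works, it can help to look at the two row numbers side by side. The following is a minimal sketch, assuming the sample data from the question is supplied inline through a VALUES list named tab; the column aliases rn_all and rn_in_x are illustrative names, not part of the original query.

```sql
-- Sample data from the question, supplied inline (assumption: no real table needed).
WITH tab(x, y) AS (
  VALUES (1, 1), (2, 2), (2, 3), (1, 4)
)
SELECT x, y,
       -- position of the row in the overall ordering by y
       ROW_NUMBER() OVER (ORDER BY y) AS rn_all,
       -- position of the row among rows with the same x
       ROW_NUMBER() OVER (PARTITION BY x ORDER BY y) AS rn_in_x,
       -- the difference stays constant while consecutive rows share the same x
       ROW_NUMBER() OVER (ORDER BY y)
         - ROW_NUMBER() OVER (PARTITION BY x ORDER BY y) + 1 AS group_id
FROM tab
ORDER BY y;
```

Within a run of consecutive rows that share the same x, both counters increase by one per row, so their difference does not change. When a row with a different x appears in between, only rn_all advances, which shifts the difference for later rows of the first x.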

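Warning:
This solution is not general. The computed group_id identifies an island only in combination with x, so two adjacent islands with different x values can end up with the same group_id.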
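A small sketch of the limitation, assuming the question's data is extended by one hypothetical row (x = 2, y = 5), which matches the kind of extension described in the comments below:

```sql
-- Question data plus one hypothetical extra row (2, 5), supplied inline.
WITH tab(x, y) AS (
  VALUES (1, 1), (2, 2), (2, 3), (1, 4), (2, 5)
)
SELECT x, y,
       ROW_NUMBER() OVER (ORDER BY y)
         - ROW_NUMBER() OVER (PARTITION BY x ORDER BY y) + 1 AS group_id
FROM tab
ORDER BY y;
-- group_id in order of y: 1, 2, 2, 3, 3
```

The rows y = 4 (x = 1) and y = 5 (x = 2) both receive group_id 3 even though x changes between them. Grouping by both x and group_id still separates them, but group_id on its own does not.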

EDIT:

A more general solution is to use LAG to fetch the previous row's x and a windowed SUM that counts how many times x has changed so far:

WITH cte AS
(
  SELECT t1.x, t1.y, LAG(x) OVER(ORDER BY y) AS x_prev
  FROM tab t1
)
SELECT x,y, SUM( CASE WHEN x = COALESCE(x_prev,x) THEN 0 ELSE 1 END) 
            OVER(ORDER BY y) + 1 AS group_id
FROM cte
ORDER BY group_id;
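For the second step in the question (one row per island with the y values collected), the LAG-based group_id can be fed into an aggregate. This is a sketch only: it assumes the question's data is supplied inline as tab, and the CTE name grouped and the alias ys are illustrative. array_agg with an ORDER BY inside the call should be available in PostgreSQL 9.3.

```sql
WITH tab(x, y) AS (
  VALUES (1, 1), (2, 2), (2, 3), (1, 4)
),
cte AS (
  -- previous row's x in the ordering by y (NULL for the first row)
  SELECT x, y, LAG(x) OVER (ORDER BY y) AS x_prev
  FROM tab
),
grouped AS (
  -- running count of changes in x, starting at 1
  SELECT x, y,
         SUM(CASE WHEN x = COALESCE(x_prev, x) THEN 0 ELSE 1 END)
           OVER (ORDER BY y) + 1 AS group_id
  FROM cte
)
SELECT x, array_agg(y ORDER BY y) AS ys, group_id
FROM grouped
GROUP BY group_id, x
ORDER BY group_id;
```

On the sample data this gives three rows, with ys equal to {1}, {2,3} and {4}. Single-row islands come back as one-element arrays rather than scalars, which differs slightly from the layout sketched in the question.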


edited Jun 20, 2020 at 9:12


answered Feb 9, 2016 at 16:39

Lukasz Szozda


3 Comments

S-Man Over a year ago

Wow. OK, it is not the way I expected. Is there no way to use rank() instead of two row_number() calls? I understood the rank function to count the different partitions. But I found my error: I thought the ordering happens first and the partitioning afterwards, but of course the ordering only applies within each partition. That makes sense now. So thank you for the explanation. Is there a way to make the partitions after the ordering? That is, order by y first and then partition by x...

S-Man Over a year ago

Hi, I thought about your solution for a while and was wondering why it works. It does for my special case, because I will group the result on x and group_id. But in my opinion it is not a general solution to the partitioning problem. If I expand my example, the group_id does not change when x changes. The 5th row contains x = 2, so it differs from the 4th. In my case it has to get a new group_id, but it gets the same one as the 4th row (ordered by y). Is there a general solution?

Evan Carroll Over a year ago

For another explanation, you may be interested in this answer to "Solving “Gaps and Islands” with row_number() and dense_rank()?"
