Postgres version: 10
Table example:
CREATE TABLE log (
    group_id  INTEGER,
    log_begin TIMESTAMP,
    log_end   TIMESTAMP
);
My goal: for each row, within its own group, I want the log_begin of the first log that begins after the current row's log ends, or NULL if no such log exists. Example: the log in row 1 ends at 2022-07-15 15:30:00 and the next log begins at 2022-07-15 16:00:00, so 2022-07-15 16:00:00 is the answer. The log in row 4 ends at 2022-07-15 15:20:00 and the next log that begins after that starts at 2022-07-15 15:30:00, so that is the answer.
Example data:
| group_id | log_begin | log_end |
|---|---|---|
| 1 | 2022-07-15 15:00:00 | 2022-07-15 15:30:00 |
| 1 | 2022-07-15 16:00:00 | 2022-07-15 16:30:00 |
| 1 | 2022-07-15 17:00:00 | 2022-07-15 17:30:00 |
| 2 | 2022-07-15 15:00:00 | 2022-07-15 15:20:00 |
| 2 | 2022-07-15 15:15:00 | 2022-07-15 15:40:00 |
| 2 | 2022-07-15 15:30:00 | 2022-07-15 16:30:00 |
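To reproduce the example, the rows above can be loaded with a plain INSERT. This is a minimal sketch that assumes only the log table defined earlier:

```sql
-- Sample rows copied from the example data table above.
INSERT INTO log (group_id, log_begin, log_end) VALUES
    (1, '2022-07-15 15:00:00', '2022-07-15 15:30:00'),
    (1, '2022-07-15 16:00:00', '2022-07-15 16:30:00'),
    (1, '2022-07-15 17:00:00', '2022-07-15 17:30:00'),
    (2, '2022-07-15 15:00:00', '2022-07-15 15:20:00'),
    (2, '2022-07-15 15:15:00', '2022-07-15 15:40:00'),
    (2, '2022-07-15 15:30:00', '2022-07-15 16:30:00');
```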
- My first solution was to use a sub-query that searches for the next value for every row. The result is correct, but the table is very big, so the query is very slow. Something like this:
SELECT *,
       ( SELECT _L.log_begin
         FROM log _L
         WHERE _L.log_begin > L.log_end
           AND _L.group_id = L.group_id
         ORDER BY _L.log_begin ASC
         LIMIT 1 ) AS next_log_begin
FROM log L
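One thing that might help this sub-query is a composite index, sketched below. It is an assumption that no such index exists yet, and the index name is made up. With it, each lookup can become a single index probe on group_id plus a range condition on log_begin, already in the order the LIMIT 1 needs.

```sql
-- Hypothetical index (the name is illustrative only):
-- equality column first, then the column used for the range and the ORDER BY.
CREATE INDEX log_group_id_log_begin_idx
    ON log (group_id, log_begin);

-- Refresh planner statistics after creating the index.
ANALYZE log;
```

Whether the planner actually picks the index can be checked with EXPLAIN on the query above.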
- My second solution was to use a window function like LEAD, as below:
SELECT *, LEAD( log_begin, 1 ) OVER ( PARTITION BY group_id ORDER BY log_begin ) AS next_log_begin
FROM log
but the result isn't correct:
| group_id | log_begin | log_end | next_log_begin |
|---|---|---|---|
| 1 | 2022-07-15 15:00:00 | 2022-07-15 15:30:00 | 2022-07-15 16:00:00 |
| 1 | 2022-07-15 16:00:00 | 2022-07-15 16:30:00 | 2022-07-15 17:00:00 |
| 1 | 2022-07-15 17:00:00 | 2022-07-15 17:30:00 | NULL |
| 2 | 2022-07-15 15:00:00 | 2022-07-15 15:20:00 | 2022-07-15 15:15:00 |
| 2 | 2022-07-15 15:15:00 | 2022-07-15 15:40:00 | 2022-07-15 15:30:00 |
| 2 | 2022-07-15 15:30:00 | 2022-07-15 16:30:00 | NULL |
This is wrong because row 4 should get 2022-07-15 15:30:00 instead, and row 5 should be NULL.
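The rows where LEAD goes wrong are the ones whose following log (ordered by log_begin) starts before the current log has ended. A sketch of a query that lists them, where the alias lead_begin and the derived-table name t are only for illustration:

```sql
-- List rows where the LEAD value is not a valid "next log",
-- i.e. the following log begins at or before the current log's end.
SELECT *
FROM (
    SELECT *,
           LEAD(log_begin) OVER (
               PARTITION BY group_id
               ORDER BY log_begin
           ) AS lead_begin
    FROM log
) t
WHERE lead_begin <= log_end;
```

On the example data this returns rows 4 and 5, the two overlapping logs in group 2.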
Correct output:
| group_id | log_begin | log_end | next_log_begin |
|---|---|---|---|
| 1 | 2022-07-15 15:00:00 | 2022-07-15 15:30:00 | 2022-07-15 16:00:00 |
| 1 | 2022-07-15 16:00:00 | 2022-07-15 16:30:00 | 2022-07-15 17:00:00 |
| 1 | 2022-07-15 17:00:00 | 2022-07-15 17:30:00 | NULL |
| 2 | 2022-07-15 15:00:00 | 2022-07-15 15:20:00 | 2022-07-15 15:30:00 |
| 2 | 2022-07-15 15:15:00 | 2022-07-15 15:40:00 | NULL |
| 2 | 2022-07-15 15:30:00 | 2022-07-15 16:30:00 | NULL |
Is there any way to do that using Postgres 10? Window functions are preferable, but not required.
I also looked at the RANGE BETWEEN INTERVAL '1 SECOND' FOLLOWING AND UNBOUNDED FOLLOWING frame of Postgres 11+, but it seems that it doesn't work either. I'd like to avoid this self-join; if it's necessary, I think a "pre-processing" strategy that saves this data at insertion time would be a better solution, but that would be A LOT of work here.