Postgres version: 10

Table example:

CREATE TABLE log (
    group_id INTEGER,
    log_begin TIMESTAMP,
    log_end TIMESTAMP
);

My goal: for each row, I want to find, within the same group, the log_begin of the first log that begins after the current row's log_end, or NULL if no such log exists. Example: the log in row 1 ends at 2022-07-15 15:30:00 and the next log in its group begins at 2022-07-15 16:00:00, so 2022-07-15 16:00:00 is the answer. The log in row 4 ends at 2022-07-15 15:20:00, and the first log in its group that begins after that starts at 2022-07-15 15:30:00, so that is the answer.

Example data:

| group_id | log_begin | log_end |
| --- | --- | --- |
| 1 | 2022-07-15 15:00:00 | 2022-07-15 15:30:00 |
| 1 | 2022-07-15 16:00:00 | 2022-07-15 16:30:00 |
| 1 | 2022-07-15 17:00:00 | 2022-07-15 17:30:00 |
| 2 | 2022-07-15 15:00:00 | 2022-07-15 15:20:00 |
| 2 | 2022-07-15 15:15:00 | 2022-07-15 15:40:00 |
| 2 | 2022-07-15 15:30:00 | 2022-07-15 16:30:00 |

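
To reproduce this, the sample rows can be loaded with a statement along these lines. This is only a sketch: it assumes the log table from the CREATE TABLE statement above already exists and is empty.

```sql
-- Load the six sample rows into the log table defined above.
-- Rows 1-3 belong to group 1 (no overlaps); rows 4-6 belong to group 2 (overlapping logs).
INSERT INTO log (group_id, log_begin, log_end) VALUES
    (1, '2022-07-15 15:00:00', '2022-07-15 15:30:00'),
    (1, '2022-07-15 16:00:00', '2022-07-15 16:30:00'),
    (1, '2022-07-15 17:00:00', '2022-07-15 17:30:00'),
    (2, '2022-07-15 15:00:00', '2022-07-15 15:20:00'),
    (2, '2022-07-15 15:15:00', '2022-07-15 15:40:00'),
    (2, '2022-07-15 15:30:00', '2022-07-15 16:30:00');
```
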
  • My first solution was to use a subquery that searches for the next value for every row. The result is correct, but the table is very big, so the query is very slow. Something like this:
SELECT *, ( SELECT _L.log_begin FROM log _L 
    WHERE _L.log_begin > L.log_end 
        AND _L.group_id = L.group_id 
    ORDER BY _L.log_begin ASC LIMIT 1 ) AS next_log_begin
FROM log L
  • My second solution was to use a window function such as LEAD, as below:
SELECT *, LEAD( log_begin, 1 ) OVER ( PARTITION BY group_id ORDER BY log_begin ) AS next_log_begin
FROM log

but the result isn't correct:

| group_id | log_begin | log_end | next_log_begin |
| --- | --- | --- | --- |
| 1 | 2022-07-15 15:00:00 | 2022-07-15 15:30:00 | 2022-07-15 16:00:00 |
| 1 | 2022-07-15 16:00:00 | 2022-07-15 16:30:00 | 2022-07-15 17:00:00 |
| 1 | 2022-07-15 17:00:00 | 2022-07-15 17:30:00 | NULL |
| 2 | 2022-07-15 15:00:00 | 2022-07-15 15:20:00 | 2022-07-15 15:15:00 |
| 2 | 2022-07-15 15:15:00 | 2022-07-15 15:40:00 | 2022-07-15 15:30:00 |
| 2 | 2022-07-15 15:30:00 | 2022-07-15 16:30:00 | NULL |

Row 4 should get 2022-07-15 15:30:00 instead, and row 5 should be NULL, because the logs in group 2 overlap.

Correct output:

| group_id | log_begin | log_end | next_log_begin |
| --- | --- | --- | --- |
| 1 | 2022-07-15 15:00:00 | 2022-07-15 15:30:00 | 2022-07-15 16:00:00 |
| 1 | 2022-07-15 16:00:00 | 2022-07-15 16:30:00 | 2022-07-15 17:00:00 |
| 1 | 2022-07-15 17:00:00 | 2022-07-15 17:30:00 | NULL |
| 2 | 2022-07-15 15:00:00 | 2022-07-15 15:20:00 | 2022-07-15 15:30:00 |
| 2 | 2022-07-15 15:15:00 | 2022-07-15 15:40:00 | NULL |
| 2 | 2022-07-15 15:30:00 | 2022-07-15 16:30:00 | NULL |

Is there any way to do this in Postgres 10? Window functions are preferable but not required.

  • It's a bit unclear what your expected result is. Could you write it out like you did your actual results? Commented Jul 15, 2022 at 20:31
  • I do not see how you can do this with a window function because in group 2, the first interval overlaps the second and the second overlaps the third. Please see dbfiddle.uk/… and change out the lines in the insert statement to see what I mean. The self-join you did is probably necessary to handle this condition. Commented Jul 15, 2022 at 22:07
  • @Schwern sorry, I've edited the question and added an example of a query that gets the right result Commented Jul 18, 2022 at 12:23
  • @MikeOrganek I thought there might be a solution using something like RANGE BETWEEN INTERVAL '1 SECOND' FOLLOWING AND UNBOUNDED FOLLOWING from Postgres 11+, but it seems that doesn't work either. I'd like to avoid this self-join; if it's necessary, I think a "pre-processing" strategy that saves this data at insertion time is a better solution, but that would be A LOT of work here Commented Jul 18, 2022 at 12:43

1 Answer

The data and the results you expect to see don't appear to line up with the logic you've outlined, but I think I get what you are saying.

If I understand you correctly, you want to look at the "next log begin" for every record, sorted by group then log start. If this is the case, you want to omit the "partition by" because it will yield a null any time the group id changes. It executes the lead within groups of whatever value(s) you specify in partition by, in this case group_id. So, for starters:

select
  group_id, log_begin, log_end,
  lead (log_begin) over (order by group_id, log_begin) as x
from log

Which looks for the next record, independent of changes to the group.

There is no way I'm aware of to evaluate the result of a window function within the expression that invokes it, so to do this you essentially would need to wrap it in a CTE and then evaluate it:

with cte as (
  select
    group_id, log_begin, log_end,
    lead (log_begin) over (order by group_id, log_begin) as x
  from log
)
select
  group_id, log_begin, log_end,
  x
from cte

And now you can compare x to any other field. I think the new field you want would look like this:

case
  when log_end < x then x
end as next_log_begin

But again, it does not match your desired results. So either I misunderstood, your sample data might be off, or your assumptions might be off. All are equally possible.

Full query example:

with cte as (
  select
    group_id, log_begin, log_end,
    lead (log_begin) over (order by group_id, log_begin) as x
  from log
)
select
  group_id, log_begin, log_end,
  x,
  case
    when log_end < x then x
  end as next_log_begin
from cte

-- EDIT 7/18/2022 --

I think I see now, based on your revised question. I can't promise this will be efficient, but if you use a scalar subquery, I think it will do what you want. Try this and let me know.

select
  group_id, log_begin, log_end,
  (select min (log_begin)
  from log l2
  where l1.group_id = l2.group_id
  and l2.log_begin > l1.log_end) as next_log_begin
from log l1
order by group_id, log_begin
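
If a window-only form is preferred, one possible alternative is sketched below. It is not the approach above but a swapped-in technique: every log_begin and log_end becomes an "event", and each end event looks forward to the earliest later begin event in the same group. The names events, scanned, ts, and kind are made up for illustration, and the sketch assumes log_begin and log_end are never NULL. It uses only a ROWS frame, which Postgres 10 supports.

```sql
-- Sketch: find the first log_begin strictly after each row's log_end, per group.
WITH events AS (
    -- kind 0 = begin event, kind 1 = end event.
    -- Begins sort before ends at the same timestamp, so a log that starts
    -- exactly at log_end is not counted as starting "after" it.
    SELECT group_id, log_begin AS ts, 0 AS kind, log_begin, log_end FROM log
    UNION ALL
    SELECT group_id, log_end AS ts, 1 AS kind, log_begin, log_end FROM log
),
scanned AS (
    SELECT
        group_id, kind, log_begin, log_end,
        -- Only begin events contribute a value; MIN skips the NULLs
        -- produced for end events and returns NULL for an empty frame.
        MIN(CASE WHEN kind = 0 THEN ts END) OVER (
            PARTITION BY group_id
            ORDER BY ts, kind
            ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING
        ) AS next_log_begin
    FROM events
)
SELECT group_id, log_begin, log_end, next_log_begin
FROM scanned
WHERE kind = 1  -- keep one output row per original log: its end event
ORDER BY group_id, log_begin;
```

On the sample data this should give 2022-07-15 15:30:00 for row 4 and NULL for row 5, matching the desired output. Whether it beats the scalar subquery on a large table is worth checking with EXPLAIN ANALYZE. If the subquery is kept instead, a composite index such as one on (group_id, log_begin) is the usual way to make the per-row lookup cheap.
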

4 Comments

Unfortunately it fails when there is an overlap and the next row's log begins before the current log ends. Example: line 4 ends at 2022-07-15 15:20:00 and line 5 begins at 2022-07-15 15:15:00, so the right answer is the log of line 6, which begins at 2022-07-15 15:30:00
Do you think you can update your question with the exact specific output you desire?
done, the 3rd table is the correct output
I believe I understand... second attempt and comments posted
