I have the table

CREATE TABLE t1
(
  id           serial           NOT NULL,
  in_quantity  bigint           NULL,
  price        money            NOT NULL,
  out_quantity bigint           NULL,
  stamp        timestamp        NOT NULL
);

with data like this (all rows share the same date; only the time differs):

INSERT INTO t1 (in_quantity, price, out_quantity, stamp)
VALUES
( 100, 10.00, NULL, '2014-10-20 00:00:00'), -- id =  1
( 200, 11.00, NULL, '2014-10-20 00:01:00'), -- id =  2
( 300, 12.00, NULL, '2014-10-20 00:02:00'), -- id =  3
(NULL, 13.00,  400, '2014-10-20 00:03:00'), -- id =  4
(NULL, 14.00,  500, '2014-10-20 00:04:00'), -- id =  5

( 600, 15.00, NULL, '2014-10-20 00:15:00'), -- id =  6
( 700, 16.00, NULL, '2014-10-20 00:16:00'), -- id =  7
( 800, 17.00, NULL, '2014-10-20 00:17:00'), -- id =  8
(NULL, 18.00,  900, '2014-10-20 00:18:00'), -- id =  9
(NULL, 19.00, 1000, '2014-10-20 00:19:00'), -- id = 10

(2300, 23.00, NULL, '2014-10-20 00:23:00'), -- id = 11
(2400, 24.00, NULL, '2014-10-20 00:24:00'); -- id = 12

I need to fetch the rows with the maximum in_quantity and out_quantity for each time range in a given set of ranges. For example, with this set:

( "2014-10-20 00:00:00" : "2014-10-20 00:05:00" ]
( "2014-10-20 00:05:00" : "2014-10-20 00:10:00" ]
( "2014-10-20 00:10:00" : "2014-10-20 00:15:00" ]
( "2014-10-20 00:15:00" : "2014-10-20 00:20:00" ]
( "2014-10-20 00:20:00" : "2014-10-20 00:25:00" ]

and my desired result with this example would be

interval begin        | interval end          | max_in_q | max_in_q_id | max_out_q | max_out_q_id 
======================+=======================+==========+=============+===========+=============
"2014-10-20 00:00:00" | "2014-10-20 00:05:00" | 300      | 3           | 400       | 4
"2014-10-20 00:05:00" | "2014-10-20 00:10:00" | NULL     | NULL        | NULL      | NULL        
"2014-10-20 00:10:00" | "2014-10-20 00:15:00" | NULL     | NULL        | NULL      | NULL        
"2014-10-20 00:15:00" | "2014-10-20 00:20:00" | 800      | 8           | 1000      | 10
"2014-10-20 00:20:00" | "2014-10-20 00:25:00" | 2400     | 12          | NULL      | NULL

I can generate a set of ranges like that with a query such as this:

SELECT
   i::timestamp AS dleft,
   i::timestamp + '1 hour' AS dright 
FROM
   generate_series('2014-10-20 00:00:00'::timestamp, '2014-10-20 23:00:00'::timestamp, '1 hour') AS i

But I can't figure out how to run the aggregate functions over each of these small ranges, or how to join the results.

2 Answers

First, note that you also want the id of the row behind each aggregated value, which is not a simple query in any RDBMS.

In PostgreSQL, this kind of problem is usually solved with DISTINCT ON:

SELECT DISTINCT ON (s)
  s ts_start, s + '5 minutes' ts_end, in_quantity max_in_q, id max_in_id
FROM
  generate_series('2014-10-20 00:00:00'::timestamp, '2014-10-20 00:20:00'::timestamp, '5 minutes') s
LEFT JOIN
  t1 ON stamp <@ tsrange(s, s + '5 minutes', '(]')
ORDER BY
  s, in_quantity DESC NULLS LAST;

But this only lets you select one max/min value per range, together with the whole row it belongs to.

If you really need both max columns, you need self-joins and subqueries, which won't be as fast:
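
As an aside on the join condition in the query above: the third argument of tsrange controls which bounds are inclusive. A minimal check, assuming nothing beyond the built-in tsrange constructor and the <@ operator (the column aliases are made up for illustration):

```sql
-- '(]' means: lower bound excluded, upper bound included.
SELECT
  '2014-10-20 00:00:00'::timestamp <@ r AS lower_bound_included,  -- false
  '2014-10-20 00:05:00'::timestamp <@ r AS upper_bound_included   -- true
FROM (
  SELECT tsrange('2014-10-20 00:00:00', '2014-10-20 00:05:00', '(]') AS r
) AS x;
```

So a row stamped exactly on a boundary falls into the range that ends there, not the one that starts there.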

SELECT
  lower(r) ts_start, upper(r) ts_end, max_in_q, max_in.id max_in_id, max_out_q, max_out.id max_out_id
FROM (
  SELECT
    r, max(in_quantity) max_in_q, max(out_quantity) max_out_q
  FROM
    generate_series('2014-10-20 00:00:00'::timestamp, '2014-10-20 00:20:00'::timestamp, '5 minutes') s,
    tsrange(s, s + '5 minutes', '(]') r
  LEFT JOIN
    t1 ON stamp <@ r
  GROUP BY
    r
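) m
LEFT JOIN
  -- match only rows inside the same range, not equal quantities elsewhere in the table
  t1 max_in ON max_in.in_quantity = max_in_q AND max_in.stamp <@ r
LEFT JOIN
  t1 max_out ON max_out.out_quantity = max_out_q AND max_out.stamp <@ r
ORDER BY
  -- sort in the outer query; an ORDER BY inside the subquery is not guaranteed to survive the joins
  r;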

Note: with this second version, you need to handle duplicates yourself, because in_quantity and out_quantity aren't unique: if several rows share the maximum, each of them produces its own output row.
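
One possible way around the duplicate problem is to run DISTINCT ON once per column and join the two results on the range start. This is only a sketch: the CTE names (buckets, max_in, max_out) are made up for illustration, and it assumes that ties on quantity should go to the lowest id.

```sql
WITH buckets AS (
  -- one row per 5-minute range
  SELECT s AS ts_start, s + interval '5 minutes' AS ts_end
  FROM generate_series('2014-10-20 00:00:00'::timestamp,
                       '2014-10-20 00:20:00'::timestamp,
                       interval '5 minutes') AS s
),
max_in AS (
  -- exactly one row per range: largest in_quantity, lowest id on ties (assumption)
  SELECT DISTINCT ON (b.ts_start)
         b.ts_start, t.in_quantity AS max_in_q, t.id AS max_in_q_id
  FROM buckets b
  JOIN t1 t ON t.stamp <@ tsrange(b.ts_start, b.ts_end, '(]')
  WHERE t.in_quantity IS NOT NULL
  ORDER BY b.ts_start, t.in_quantity DESC, t.id
),
max_out AS (
  -- same idea for out_quantity
  SELECT DISTINCT ON (b.ts_start)
         b.ts_start, t.out_quantity AS max_out_q, t.id AS max_out_q_id
  FROM buckets b
  JOIN t1 t ON t.stamp <@ tsrange(b.ts_start, b.ts_end, '(]')
  WHERE t.out_quantity IS NOT NULL
  ORDER BY b.ts_start, t.out_quantity DESC, t.id
)
SELECT b.ts_start, b.ts_end,
       i.max_in_q, i.max_in_q_id,
       o.max_out_q, o.max_out_q_id
FROM buckets b
LEFT JOIN max_in  i ON i.ts_start = b.ts_start  -- LEFT JOIN keeps empty ranges as NULLs
LEFT JOIN max_out o ON o.ts_start = b.ts_start
ORDER BY b.ts_start;
```

Because each CTE returns at most one row per range, the final joins cannot multiply rows.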

I think this is quite straightforward with the help of range types:

WITH data(in_quantity,price,out_quantity,stamp) AS (VALUES
( 100::int8, 10.00, NULL::int8, '2014-10-20 00:00:00'::timestamp), -- id =  1
( 200, 11.00, NULL, '2014-10-20 00:01:00'), -- id =  2
( 300, 12.00, NULL, '2014-10-20 00:02:00'), -- id =  3
(NULL, 13.00,  400, '2014-10-20 00:03:00'), -- id =  4
(NULL, 14.00,  500, '2014-10-20 00:04:00'), -- id =  5

( 600, 15.00, NULL, '2014-10-20 00:15:00'), -- id =  6
( 700, 16.00, NULL, '2014-10-20 00:16:00'), -- id =  7
( 800, 17.00, NULL, '2014-10-20 00:17:00'), -- id =  8
(NULL, 18.00,  900, '2014-10-20 00:18:00'), -- id =  9
(NULL, 19.00, 1000, '2014-10-20 00:19:00'), -- id = 10

(2300, 23.00, NULL, '2014-10-20 00:23:00'), -- id = 11
(2400, 24.00, NULL, '2014-10-20 00:24:00')
)
SELECT
   tsrange(i,i+INTERVAL '1h','[)') r,
   max(in_quantity)                max_in_q,
   max(out_quantity)               max_out_q
  FROM generate_series('2014-10-20 00:00:00'::timestamp,
                       '2014-10-20 23:00:00'::timestamp, '1 hour') AS i
  LEFT JOIN data d ON tsrange(i,i+INTERVAL '1h','[)') @> d.stamp
 GROUP BY r
 ORDER BY r;

I used LEFT JOIN here because I assumed you want to see all ranges, including the empty ones; adapt it to your needs.
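
The query above returns the maxima but not the ids of the rows they come from. A hedged sketch of one way to add them, run against the t1 table from the question rather than the inline data: an ordered array_agg whose first element is taken. It assumes PostgreSQL 9.4 or later (for the FILTER clause) and that ties go to the lowest id; the column aliases follow the question's desired output.

```sql
SELECT
   tsrange(i, i + interval '1 hour', '[)') AS r,
   max(t.in_quantity)  AS max_in_q,
   -- ids sorted by quantity, largest first; [1] picks the id of the maximum
   (array_agg(t.id ORDER BY t.in_quantity DESC, t.id)
      FILTER (WHERE t.in_quantity IS NOT NULL))[1]  AS max_in_q_id,
   max(t.out_quantity) AS max_out_q,
   (array_agg(t.id ORDER BY t.out_quantity DESC, t.id)
      FILTER (WHERE t.out_quantity IS NOT NULL))[1] AS max_out_q_id
  FROM generate_series('2014-10-20 00:00:00'::timestamp,
                       '2014-10-20 23:00:00'::timestamp,
                       interval '1 hour') AS i
  LEFT JOIN t1 t ON tsrange(i, i + interval '1 hour', '[)') @> t.stamp
 GROUP BY i
 ORDER BY i;
```

The FILTER clause keeps rows with a NULL quantity out of the array, so a range with no matching rows yields NULL for both the maximum and its id.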
