I have time series data for the stock exchange which is only open between 9am and 4pm. I wish to disregard any rows that fall outside these bounds.
To find out whether I need to make API calls, I compare the output of generate_series for a given time period with the time series data stored in my database.
I do this with a LEFT JOIN from the generate_series results to my av_intraday table.
This is the query:
SELECT *
FROM generate_series(date_trunc('day', localtimestamp - interval '1 day')
                   , localtimestamp
                   , interval '15 min') g(start_time)
LEFT JOIN av_intraday av ON av.date = g.start_time
                        AND extract(hour from g.start_time) >= 9
                        AND extract(hour from g.start_time) <= 16
ORDER BY g.start_time;
This is the resulting data (rows outside 9 to 16 are still returned, just with empty av_intraday columns):
av_intraday
 start_time          | symbol | date                | open      | high      | low       | close     | volume
---------------------+--------+---------------------+-----------+-----------+-----------+-----------+---------
...
 2019-07-30 08:45:00 |        |                     |           |           |           |           |
 2019-07-30 09:00:00 |        |                     |           |           |           |           |
 2019-07-30 09:15:00 |        |                     |           |           |           |           |
 2019-07-30 09:30:00 |        |                     |           |           |           |           |
 2019-07-30 09:45:00 | fb     | 2019-07-30 09:45:00 | 194.95    | 195.83    | 194.54    | 195.58    | 2004674
 2019-07-30 09:45:00 | amzn   | 2019-07-30 09:45:00 | 1891.115  | 1905.3    | 1883.48   | 1904.28   | 564821
 2019-07-30 09:45:00 | goog   | 2019-07-30 09:45:00 | 1225.41   | 1234.87   | 1223.4301 | 1232.0699 | 203333
 2019-07-30 09:45:00 | tsla   | 2019-07-30 09:45:00 | 234       | 235.54    | 233.72    | 235.07    | 207546
 2019-07-30 09:45:00 | aapl   | 2019-07-30 09:45:00 | 208.88    | 208.93    | 208.335   | 208.6148  | 602413
...
 2019-07-30 16:00:00 | amzn   | 2019-07-30 16:00:00 | 1897.42   | 1899.9    | 1896.61   | 1898.33   | 152442
 2019-07-30 16:00:00 | goog   | 2019-07-30 16:00:00 | 1225.09   | 1226.34   | 1223.3    | 1223.4    | 169110
 2019-07-30 16:00:00 | tsla   | 2019-07-30 16:00:00 | 241.94    | 242.3     | 241.93    | 242.2     | 151572
 2019-07-30 16:00:00 | aapl   | 2019-07-30 16:00:00 | 208.94    | 208.94    | 208.3436  | 208.685   | 1096338
 2019-07-30 16:15:00 |        |                     |           |           |           |           |
 2019-07-30 16:30:00 |        |                     |           |           |           |           |
 2019-07-30 16:45:00 |        |                     |           |           |           |           |
 2019-07-30 17:00:00 |        |                     |           |           |           |           |
 2019-07-30 17:15:00 |        |                     |           |           |           |           |
 2019-07-30 17:30:00 |        |                     |           |           |           |           |
...
The data should look like this:
av_intraday
 start_time          | symbol | date                | open      | high      | low       | close     | volume
---------------------+--------+---------------------+-----------+-----------+-----------+-----------+---------
 2019-07-30 09:00:00 |        |                     |           |           |           |           |
 2019-07-30 09:15:00 |        |                     |           |           |           |           |
 2019-07-30 09:30:00 |        |                     |           |           |           |           |
 2019-07-30 09:45:00 | fb     | 2019-07-30 09:45:00 | 194.95    | 195.83    | 194.54    | 195.58    | 2004674
 2019-07-30 09:45:00 | amzn   | 2019-07-30 09:45:00 | 1891.115  | 1905.3    | 1883.48   | 1904.28   | 564821
 2019-07-30 09:45:00 | goog   | 2019-07-30 09:45:00 | 1225.41   | 1234.87   | 1223.4301 | 1232.0699 | 203333
 2019-07-30 09:45:00 | tsla   | 2019-07-30 09:45:00 | 234       | 235.54    | 233.72    | 235.07    | 207546
 2019-07-30 09:45:00 | aapl   | 2019-07-30 09:45:00 | 208.88    | 208.93    | 208.335   | 208.6148  | 602413
...
 2019-07-30 16:00:00 | amzn   | 2019-07-30 16:00:00 | 1897.42   | 1899.9    | 1896.61   | 1898.33   | 152442
 2019-07-30 16:00:00 | goog   | 2019-07-30 16:00:00 | 1225.09   | 1226.34   | 1223.3    | 1223.4    | 169110
 2019-07-30 16:00:00 | tsla   | 2019-07-30 16:00:00 | 241.94    | 242.3     | 241.93    | 242.2     | 151572
 2019-07-30 16:00:00 | aapl   | 2019-07-30 16:00:00 | 208.94    | 208.94    | 208.3436  | 208.685   | 1096338
 2019-07-30 16:15:00 |        |                     |           |           |           |           |
 2019-07-30 16:30:00 |        |                     |           |           |           |           |
 2019-07-30 16:45:00 |        |                     |           |           |           |           |
...
I got the desired result with the following query:
SELECT *
FROM (
    SELECT g.start_time AS g_start_time, av.symbol, av.open, av.high, av.low, av.close, av.volume
    FROM generate_series(date_trunc('day', localtimestamp - interval '1 day')
                       , localtimestamp
                       , interval '15 min') g(start_time)
    LEFT JOIN av_intraday av ON av.date = g.start_time
    ORDER BY g.start_time
) t
WHERE extract(hour from g_start_time) >= 9
  AND extract(hour from g_start_time) <= 16;
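The subquery is probably not required. Below is a flatter sketch of the same idea, assuming the hour filter can simply move into a WHERE clause on the generated series. I have also put the ORDER BY on the outer query, since as far as I know the ordering of a subquery is not guaranteed to carry through to the final result.

```sql
-- Sketch only: same tables and columns as above, with the hour filter
-- applied in WHERE instead of inside a wrapping subquery.
SELECT g.start_time, av.symbol, av.open, av.high, av.low, av.close, av.volume
FROM generate_series(date_trunc('day', localtimestamp - interval '1 day')
                   , localtimestamp
                   , interval '15 min') g(start_time)
LEFT JOIN av_intraday av ON av.date = g.start_time
WHERE extract(hour from g.start_time) >= 9    -- drops generated slots before 09:00
  AND extract(hour from g.start_time) <= 16   -- keeps slots up to 16:45
ORDER BY g.start_time;
```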
However, I'm not sure why the original query, with the hour conditions in the join, does not filter out those rows. Can someone explain? Thank you.
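For reference, here is a minimal sketch that seems to isolate the difference. The fixed timestamps and the one-row VALUES list standing in for av_intraday are made up for illustration and are not my real data. Only the placement of the hour condition changes between the two statements.

```sql
-- Hour condition in the ON clause.
-- Returns all four slots (08:30, 08:45, 09:00, 09:15);
-- symbol is 'fb' at 09:00 and NULL everywhere else.
SELECT g.start_time, av.symbol
FROM generate_series(timestamp '2019-07-30 08:30'
                   , timestamp '2019-07-30 09:15'
                   , interval '15 min') g(start_time)
LEFT JOIN (VALUES ('fb', timestamp '2019-07-30 09:00')) AS av(symbol, date)
       ON av.date = g.start_time
      AND extract(hour from g.start_time) >= 9
ORDER BY g.start_time;

-- Same join, hour condition moved to WHERE.
-- Returns only the two slots from 09:00 onward (09:00 with 'fb', 09:15 with NULL).
SELECT g.start_time, av.symbol
FROM generate_series(timestamp '2019-07-30 08:30'
                   , timestamp '2019-07-30 09:15'
                   , interval '15 min') g(start_time)
LEFT JOIN (VALUES ('fb', timestamp '2019-07-30 09:00')) AS av(symbol, date)
       ON av.date = g.start_time
WHERE extract(hour from g.start_time) >= 9
ORDER BY g.start_time;
```

So the condition placed in ON appears to affect only which av rows are attached to each slot, while the condition in WHERE removes slots from the output.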