The problem I need to solve:
To calculate the number of hours per day that are credited for (public) holidays or sick days, the average daily working hours of the previous three months are used (with a starting value of 8 hours per day).
The tricky part is that the calculated value of the previous month needs to be factored in. If there was a public holiday last month that had been assigned a calculated value of 8.5 hours, those calculated hours influence the average working hours per day for that month, which in turn is used to assign working hours to the current month's holidays.
So far I have only come up with the following, which doesn't yet factor in the row-by-row calculation:
WITH
    const (h_target, h_extra) AS (VALUES (8.0, 20)),
    monthly_sums (c_month, d_work, d_off, h_work) AS (VALUES
        ('2018-12', 16, 5, 150.25),
        ('2019-01', 20, 3, 171.25),
        ('2019-02', 15, 5, 120.5)
    ),
    calc AS (
        SELECT
            ms.*,
            (ms.d_work + ms.d_off) AS d_total,
            (ms.h_work + ms.d_off * const.h_target) AS h_total,
            (avg((ms.h_work + ms.d_off * const.h_target) / (ms.d_work + ms.d_off))
                OVER (ORDER BY ms.c_month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW))::numeric(10,2)
                AS h_off
        FROM monthly_sums AS ms
        CROSS JOIN const
    )
SELECT
    calc.c_month,
    calc.d_work,
    calc.d_off,
    calc.d_total,
    calc.h_work,
    calc.h_off,
    (d_off * lag(h_off, 1, const.h_target) OVER (ORDER BY c_month)) AS h_off_sum,
    (h_work + d_off * lag(h_off, 1, const.h_target) OVER (ORDER BY c_month)) AS h_sum
FROM calc CROSS JOIN const;
...giving the following result:
 c_month | d_work | d_off | d_total | h_work | h_off | h_off_sum | h_sum
---------+--------+-------+---------+--------+-------+-----------+--------
 2018-12 |     16 |     5 |      21 | 150.25 |  9.06 |      40.0 | 190.25
 2019-01 |     20 |     3 |      23 | 171.25 |  8.77 |     27.18 | 198.43
 2019-02 |     15 |     5 |      20 |  120.5 |  8.52 |     43.85 | 164.35
(3 rows)
This is correct for the first row and, in the second row, for the columns that rely only on previous-row values via lag (h_off_sum and h_sum). The average hours per day (h_off), however, is wrong from the second row onward, because I couldn't figure out how to feed the current row's h_sum back into the calculation of the new h_off.
The desired result should be as follows:
 c_month | d_work | d_off | d_total | h_work | h_off | h_off_sum | h_sum
---------+--------+-------+---------+--------+-------+-----------+--------
 2018-12 |     16 |     5 |      21 | 150.25 |  9.06 |      40.0 | 190.25
 2019-01 |     20 |     3 |      23 | 171.25 |  8.84 |     27.18 | 198.43
 2019-02 |     15 |     5 |      20 |  120.5 |  8.64 |      44.2 |  164.7
(3 rows)
...meaning that each month's h_off is used for the next month's h_off_sum and the resulting h_sum, and the h_sum values of the available months (at most three) in turn feed into the calculation of the current month's h_off (essentially avg(h_sum / d_total) over up to three months).
So the actual calculation is:
 c_month |                    calculation                     | h_off
---------+----------------------------------------------------+-------
         |                                                    |  8.00 << initial
                .---------------------- uses ---------------------^
 2018-12 | ((190.25 / 21)) / 1                                |  9.06
                                .------------ uses ---------------^
 2019-01 | ((190.25 / 21) + (198.43 / 23)) / 2                |  8.84
                                                .--- uses --------^
 2019-02 | ((190.25 / 21) + (198.43 / 23) + (164.7 / 20)) / 3 |  8.64
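To pin down the intended row-by-row logic independently of SQL, here is a minimal procedural sketch in Python. It is only a reference for the desired numbers, not a PostgreSQL solution. The function name `holiday_hours` and its parameters are made up for illustration, and it assumes that h_off is rounded to two decimals (half up) before it is carried into the next month, which is what the figures above suggest.

```python
from decimal import Decimal, ROUND_HALF_UP

# Sample data from the question: (c_month, d_work, d_off, h_work)
MONTHS = [
    ("2018-12", 16, 5, Decimal("150.25")),
    ("2019-01", 20, 3, Decimal("171.25")),
    ("2019-02", 15, 5, Decimal("120.5")),
]


def holiday_hours(months, h_target=Decimal("8.0"), window=3):
    """Hypothetical helper: walk the months in order, feeding each h_off forward."""
    h_off_prev = h_target   # starting value used for the first month
    ratios = []             # h_sum / d_total for every month processed so far
    rows = []
    for c_month, d_work, d_off, h_work in months:
        d_total = d_work + d_off
        h_off_sum = d_off * h_off_prev      # hours credited for the days off
        h_sum = h_work + h_off_sum
        ratios.append(h_sum / d_total)
        recent = ratios[-window:]           # current month plus up to two before it
        # Assumption: round half up to two decimals, like the ::numeric(10,2) cast.
        h_off = (sum(recent) / len(recent)).quantize(
            Decimal("0.01"), rounding=ROUND_HALF_UP
        )
        rows.append((c_month, d_total, h_off, h_off_sum, h_sum))
        h_off_prev = h_off                  # next month's days off use this value
    return rows


for row in holiday_hours(MONTHS):
    print(*row)
```

With the sample data this yields h_off values of 9.06, 8.84 and 8.64, matching the desired result. The point of the sketch is the dependency: each month's h_off needs the previous month's already-computed h_off, which a plain window function over the input rows cannot see.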
P.S.: I am using PostgreSQL 11, so I have the latest features at hand, if that makes any difference.