
Is it possible to do something similar to a running sum for json?

I have this table:

    day     │  id   │       data
────────────┼───────┼───────────────────
 2016-06-20 │     1 │ {"key0": "value0"}
 2016-06-21 │     1 │ {"key1": "value1"}
 2016-06-22 │     1 │ {"key2": "value2"}

And I would like it to be this table:

    day     │  id   │                        data
────────────┼───────┼────────────────────────────────────────────────────
 2016-06-20 │     1 │ {"key0": "value0"}
 2016-06-21 │     1 │ {"key0": "value0", "key1": "value1"}
 2016-06-22 │     1 │ {"key0": "value0", "key1": "value1", "key2": "value2"}

I tried using a window function, because the default window behavior of aggregate functions is similar to this, but I don't know how to do it properly for json.

Is anyone able to help?

2 Answers


There is no built-in aggregate that concatenates JSONB objects (jsonb_agg() returns an array, not a single JSON value), but it's really easy to create one:

create aggregate jsonb_append(jsonb)
(
    sfunc = jsonb_concat,
    stype = jsonb
);
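
With that in place (and assuming the table is called topo, as in the query below), a plain grouped aggregation already merges everything per id. As a minimal sketch:

```sql
-- Merge every row's object into one jsonb value per id.
-- With no initcond, the first non-null input seeds the aggregate state,
-- which is the usual behavior for a strict transition function.
select id, jsonb_append(data) as data
from topo
group by id;
-- => 1 | {"key0": "value0", "key1": "value1", "key2": "value2"}
```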

This aggregate can also be used as a window function which does a "running aggregate", so you can do:

select day, id, jsonb_append(data) over (order by day)
from topo
order by day;
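
If the table ever holds more than one id, you would presumably want one running merge per id; a partition clause (assuming per-id accumulation is what's wanted here) handles that:

```sql
select day, id,
       jsonb_append(data) over (partition by id order by day) as data
from topo
order by id, day;
```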

Note that jsonb does not preserve the order of the keys, so the key order inside the aggregated jsonb value might not match the order in which the rows were aggregated.
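
You can see this normalization by casting any object literal to jsonb; jsonb stores keys sorted by length and then bytewise, regardless of input order:

```sql
select '{"bb": 1, "a": 2}'::jsonb;
-- => {"a": 2, "bb": 1}
```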

If the same key exists in multiple rows, the value of the "last" row (according to the order by) will be retained.
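
This "last value wins" rule is the standard behavior of the || (jsonb_concat) operator, which you can check directly:

```sql
select '{"k": "old"}'::jsonb || '{"k": "new"}'::jsonb;
-- => {"k": "new"}
```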



As long as you have a way of defining which other rows should be included in each row's "combined" data, it's straightforward to do with a LATERAL join.

Assuming a table t (day date, id bigint, data jsonb) already populated with the sample data:
testdb=# select * from t;
    day     | id |        data        
------------+----+--------------------
 2016-06-20 |  1 | {"key0": "value0"}
 2016-06-21 |  1 | {"key1": "value1"}
 2016-06-22 |  1 | {"key2": "value2"}
(3 rows)

testdb=# SELECT t0.day,
                t0.id,
                j
         FROM t t0,
              LATERAL (SELECT jsonb_object(keys, vals) AS j
                       FROM (SELECT array_agg(kvset.key) AS keys,
                                    array_agg(kvset.value) AS vals
                             FROM (SELECT key, value
                                   FROM t t1
                                   CROSS JOIN jsonb_each_text(t1.data) AS r
                                   WHERE t1.day <= t0.day) AS kvset
                            ) AS kvpairs
                      ) AS _;
    day     | id |                           j                            
------------+----+--------------------------------------------------------
 2016-06-20 |  1 | {"key0": "value0"}
 2016-06-21 |  1 | {"key0": "value0", "key1": "value1"}
 2016-06-22 |  1 | {"key0": "value0", "key1": "value1", "key2": "value2"}

In this case, I use t1.day <= t0.day so that every row on or before the given date is scanned to build the combined object for that date's row.
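
If the table ever holds more than one id, the same predicate can be extended (a hypothetical variant, not part of the query above) so that each id accumulates separately:

```sql
-- restrict the lateral scan to the current row's id as well
WHERE t1.id = t0.id AND t1.day <= t0.day
```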

It's worth noting that I don't do anything to handle conflicting keys in a smart way; this probably does not do The Right Thing when it encounters the same key in multiple rows.

