
Is it possible to do something similar to a running sum for json?

I have this table:

    day     │  id   │       data
────────────┼───────┼───────────────────
 2016-06-20 │     1 │ {"key0": "value0"}
 2016-06-21 │     1 │ {"key1": "value1"}
 2016-06-22 │     1 │ {"key2": "value2"}

And I would like it to be this table:

    day     │  id   │                        data
────────────┼───────┼────────────────────────────────────────────────────
 2016-06-20 │     1 │ {"key0": "value0"}
 2016-06-21 │     1 │ {"key0": "value0", "key1": "value1"}
 2016-06-22 │     1 │ {"key0": "value0", "key1": "value1", "key2": "value2"}

I tried using a window function, because the default window behavior of aggregate functions is similar to this, but I don't know how to do it properly for json.

Is anyone able to help?

2 Answers


There is no built-in aggregate that concatenates JSONB objects (jsonb_agg() returns an array, not a single JSON value), but it's really easy to create one:

create aggregate jsonb_append(jsonb)
(
    sfunc = jsonb_concat,
    stype = jsonb
);
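
With that in place (and assuming the table is called topo, as in the query below), a plain grouped aggregation already merges everything per id. As a minimal sketch:

```sql
-- Merge every row's object into one jsonb value per id.
-- With no initcond, the first non-null input seeds the aggregate state,
-- which is the usual behavior for a strict transition function.
select id, jsonb_append(data) as data
from topo
group by id;
-- => 1 | {"key0": "value0", "key1": "value1", "key2": "value2"}
```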

This aggregate can also be used as a window function which does a "running aggregate", so you can do:

select day, id, jsonb_append(data) over (order by day)
from topo
order by day;
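
If the table ever holds more than one id, you would presumably want one running merge per id; a partition clause (assuming per-id accumulation is what's wanted here) handles that:

```sql
select day, id,
       jsonb_append(data) over (partition by id order by day) as data
from topo
order by id, day;
```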

Note that jsonb does not preserve the order of the keys, so the key order inside the aggregated jsonb value might not match the order in which the rows were aggregated.
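
You can see this normalization by casting any object literal to jsonb; jsonb stores keys sorted by length and then bytewise, regardless of input order:

```sql
select '{"bb": 1, "a": 2}'::jsonb;
-- => {"a": 2, "bb": 1}
```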

If the same key exists in multiple rows, the value of the "last" row (according to the order by) will be retained.
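
This "last value wins" rule is the standard behavior of the || (jsonb_concat) operator, which you can check directly:

```sql
select '{"k": "old"}'::jsonb || '{"k": "new"}'::jsonb;
-- => {"k": "new"}
```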



As long as you have a way of defining which other rows should be included in each row's "combined" data, it's straightforward to do with a LATERAL join.

Assuming a table t (day date, id bigint, data jsonb) already populated with the sample data:
testdb=# select * from t;
    day     | id |        data        
------------+----+--------------------
 2016-06-20 |  1 | {"key0": "value0"}
 2016-06-21 |  1 | {"key1": "value1"}
 2016-06-22 |  1 | {"key2": "value2"}
(3 rows)

testdb=# SELECT t0.day,
                t0.id,
                j
         FROM t t0,
              LATERAL (SELECT jsonb_object(keys, vals) AS j
                       FROM (SELECT array_agg(kvset.key) AS keys,
                                    array_agg(kvset.value) AS vals
                             FROM (SELECT key, value
                                   FROM t t1
                                   CROSS JOIN jsonb_each_text(t1.data) AS r
                                   WHERE t1.day <= t0.day) AS kvset
                            ) AS kvpairs
                      ) AS _;
    day     | id |                           j                            
------------+----+--------------------------------------------------------
 2016-06-20 |  1 | {"key0": "value0"}
 2016-06-21 |  1 | {"key0": "value0", "key1": "value1"}
 2016-06-22 |  1 | {"key0": "value0", "key1": "value1", "key2": "value2"}

In this case, I use t1.day <= t0.day so that every row on or before the given date is scanned to build the combined object for that date's row.
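
If the table ever holds more than one id, the same predicate can be extended (a hypothetical variant, not part of the query above) so that each id accumulates separately:

```sql
-- restrict the lateral scan to the current row's id as well
WHERE t1.id = t0.id AND t1.day <= t0.day
```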

It's worth noting that I don't do anything to handle conflicting keys in a smart way; this probably does not do The Right Thing when it encounters the same key in multiple rows.

