I have a series of rows in a PostgreSQL table which look like this:

-[ RECORD 1 ]---------------------------------------------------------------------
student     | e04c0ae4709340cb8e03c52f444e723f
group       | 1
subgroup    | 1
variable    | VAR1
status      | { "track_A" : "Done", "track_B" : "Done", "track_C" : "To Do" }
-[ RECORD 2 ]---------------------------------------------------------------------
student     | e04c0ae4709340cb8e03c52f444e723f
group       | 1
subgroup    | 1
variable    | VAR2
status      | { "track_A" : "To Do", "track_B" : "Done", "track_C" : "To Do" }
-[ RECORD 3 ]---------------------------------------------------------------------
student     | 849d1e6a0c2b4530a2b550829df94556
group       | 0
subgroup    | 1
variable    | VAR3
status      | { "track_A" : "Done", "track_B" : "To Do", "track_C" : "To Do" }

I would like to group them by student, group and subgroup, and get a count of each status per track. Something like:

-[ RECORD 1 ]---------------------------------------------------------------------
student     | e04c0ae4709340cb8e03c52f444e723f
group       | 1
subgroup    | 1
totals      | { "track_A" : {"done": 1, "to_do": 1}, "track_B" : {"done": 2, "to_do": 0}, "track_C" : {"done": 0, "to_do": 2} }

The issue is that the number of tracks can vary. I do know their names, but they are not static, so I cannot do a simple aggregation. Any suggestions how I could write this in PostgreSQL (9.5)? I do not want to iterate over all the tracks and aggregate, as the operation will take some time.

1 Answer

You could use json_each_text to "unnest" the key/value pairs of each status object and json_object_agg to combine the counts into JSON again.

Data:

DROP TABLE IF EXISTS tab;
CREATE TABLE tab(student VARCHAR(36), "group" INT, subgroup INT,
                 variable VARCHAR(20), status JSON);

INSERT INTO tab(student, "group", subgroup, variable, status)
VALUES
 ('e04c0ae4709340cb8e03c52f444e723f',1,1,'VAR1'
 ,'{ "track_A" : "Done", "track_B" : "Done", "track_C" : "To Do" }')
,('e04c0ae4709340cb8e03c52f444e723f',1,1,'VAR2'
 ,'{ "track_A" : "To Do", "track_B" : "Done", "track_C" : "To Do" }')
,('849d1e6a0c2b4530a2b550829df94556',0,1,'VAR3'
 ,'{ "track_A" : "Done", "track_B" : "To Do", "track_C" : "To Do" }');

Query:

WITH cte AS
(
   -- Unnest each status object into (track, status) rows and count per track
   SELECT student, "group", subgroup, k
     ,COUNT(CASE WHEN v='Done'  THEN 1 END) AS Done
     ,COUNT(CASE WHEN v='To Do' THEN 1 END) AS To_do
   FROM tab
   ,LATERAL json_each_text(status) s(k,v)
   GROUP BY student, "group", subgroup, k
), cte2 AS
(
  -- Build one {"Done": n, "To Do": m} object per track
  SELECT student, "group", subgroup, k, json_object_agg(s.status, s.cnt) AS j
  FROM cte
  ,LATERAL (VALUES('Done', Done),('To Do', To_Do)) AS s(status, cnt)
  GROUP BY student, "group", subgroup, k
)
-- Combine the per-track objects into a single totals object per group
SELECT student, "group", subgroup
      ,json_object_agg(k, j) AS totals
FROM cte2
GROUP BY student, "group", subgroup;

Output:

[Image: query result]
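
If the lower-case keys from the question ("done", "to_do") are wanted, a variant along these lines might work. It is only a sketch: it assumes the same tab table as in the data above and relies on the aggregate FILTER clause and json_build_object, both of which should be available from PostgreSQL 9.4 onwards. The aliases track, state, counts and per_track are names introduced here for illustration.

```sql
-- Sketch: same unnest-then-aggregate idea, collapsed into one subquery.
-- Any status other than 'Done' or 'To Do' is simply not counted.
SELECT student, "group", subgroup,
       json_object_agg(track, counts) AS totals
FROM (
    SELECT t.student, t."group", t.subgroup, s.track,
           -- One {"done": n, "to_do": m} object per track
           json_build_object(
               'done',  COUNT(*) FILTER (WHERE s.state = 'Done'),
               'to_do', COUNT(*) FILTER (WHERE s.state = 'To Do')
           ) AS counts
    FROM tab AS t
    CROSS JOIN LATERAL json_each_text(t.status) AS s(track, state)
    GROUP BY t.student, t."group", t.subgroup, s.track
) AS per_track
GROUP BY student, "group", subgroup;
```

For the sample rows, student e04c0ae4709340cb8e03c52f444e723f should come out with track_A at 1 done and 1 to_do, track_B at 2 done and 0 to_do, and track_C at 0 done and 2 to_do. The order of keys inside totals is not guaranteed, since json_object_agg aggregates rows in whatever order it receives them.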
