I want to flatten data nested in a JSONB column. As an example, say we have a table users with columns user_id (int) and siblings (JSONB).

With rows like:

id | JSONB
---------------------
1  | {"brother": {"first_name":"Sam", "last_name":"Smith"}, "sister": {"first_name":"Sally", "last_name":"Smith"}}
2  | {"sister": {"first_name":"Jill"}}

I'm looking for a query that will return a response like:

id | sibling   | first_name | last_name
-------------------------------------
1  | "brother" | "Sam"      | "Smith"
1  | "sister"  | "Sally"    | "Smith"
2  | "sister"  | "Jill"     | null
  • postgres version? Commented Feb 22, 2017 at 22:29
  • @VaoTsun version 9.4.9 Commented Feb 22, 2017 at 22:37
  • json keys first_name and last_name are constant?.. Commented Feb 22, 2017 at 22:46
  • @VaoTsun no, the keys could change Commented Feb 22, 2017 at 22:48
  • @JacobMurphy If the keys can change, meaning first_name and such, this is a bad candidate for turning into fixed columns. Can you define a set of keys you want to turn into columns and leave the rest as JSON? Commented Feb 22, 2017 at 23:42

1 Answer

I developed this for use in psql. To test the code, I created a small view t1:

CREATE VIEW t1 AS (
       SELECT 1 AS id, '{"brother": {"first_name":"Sam", "last_name":"Smith"}, "sister": {"first_name":"Sally", "last_name":"Smith"}}'::jsonb AS jsonb
 UNION SELECT 2, '{"sister": {"first_name":"Jill", "last_name":"Johnson"}}'
 UNION SELECT 3, '{"sister": {"first_name":"Jill", "x_name":"Johnson"}}'
);

The first task is to find the list of possible keys:

WITH fields AS (
     SELECT DISTINCT jff.key
       FROM t1,
            jsonb_each(jsonb) AS jf,
            jsonb_each(jf.value) AS jff
)
SELECT * FROM fields;

The result is:

    key     
------------
 first_name
 last_name
 x_name
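
Note that jsonb_each raises an error when it is handed anything other than a JSON object, so the query above fails if some sibling value is a scalar or an array. Here is a defensive sketch that swaps in an empty object for such values. It assumes they should simply be skipped, which is my assumption and not something the question specifies:

```sql
-- Sketch: same key discovery as above, but non-object values are replaced
-- by '{}' so that jsonb_each yields no rows for them instead of raising an error.
WITH fields AS (
     SELECT DISTINCT jff.key
       FROM t1,
            jsonb_each(CASE WHEN jsonb_typeof(jsonb) = 'object'
                            THEN jsonb ELSE '{}'::jsonb END) AS jf,
            jsonb_each(CASE WHEN jsonb_typeof(jf.value) = 'object'
                            THEN jf.value ELSE '{}'::jsonb END) AS jff
)
SELECT * FROM fields ORDER BY key;
```

This still looks only at the first two levels of nesting; deeper objects are not expanded.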

The next step is to generate the query:

SELECT 'SELECT id, jf.key as sibling, ' || (
    WITH fields AS (
         SELECT DISTINCT jff.key
           FROM t1,
                jsonb_each(jsonb) AS jf,
                jsonb_each(jf.value) AS jff
    )
    SELECT string_agg('jf.value->>''' || key || ''' as "' || key || '"', ',' ORDER BY key)
      FROM fields
)
|| ' FROM t1, jsonb_each(jsonb) AS jf ORDER BY 1, 2, 3;' AS cmd;

It returns:

                                                                                  cmd                                                                                   
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 SELECT id, jf.key as sibling, jf.value->>'first_name' as "first_name",jf.value->>'last_name' as "last_name",jf.value->>'x_name' as "x_name" FROM t1, jsonb_each(jsonb) AS jf ORDER BY 1, 2, 3;
(1 row)

To store the result in a psql variable, I use \gset:

\gset

After that you can run the query:

:cmd

 id | sibling | first_name | last_name | x_name  
----+---------+------------+-----------+---------
  1 | brother | Sam        | Smith     | 
  1 | sister  | Sally      | Smith     | 
  2 | sister  | Jill       | Johnson   | 
  3 | sister  | Jill       |           | Johnson
(4 rows)
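
If the set of keys is known in advance, the generated statement can also be written by hand. Here is a minimal sketch against the table from the question (users with user_id and siblings), assuming only first_name and last_name are wanted as columns:

```sql
-- Sketch: one output row per sibling entry; a key that is missing
-- from a sibling object comes back as NULL.
SELECT u.user_id               AS id,
       s.key                   AS sibling,
       s.value->>'first_name'  AS first_name,
       s.value->>'last_name'   AS last_name
  FROM users AS u,
       jsonb_each(u.siblings) AS s
 ORDER BY 1, 2;
```

Any other keys stay in the JSONB column and are simply not projected.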

To run it from an external language, you can create a Postgres function that returns the SQL command:

CREATE OR REPLACE FUNCTION build_query(IN tname text, OUT cmd text)  AS $sql$
BEGIN
    -- tname is spliced in (via quote_ident) both where the keys are collected
    -- and in the generated query, so the function is not tied to t1.
    EXECUTE $cmd$
            SELECT 'SELECT id, jf.key as sibling, ' || (
                    WITH fields AS (
                        SELECT DISTINCT jff.key
                          FROM $cmd$ || quote_ident(tname) || $cmd$,
                               jsonb_each(jsonb) AS jf,
                               jsonb_each(jf.value) AS jff
                    )
                    SELECT string_agg('jf.value->>''' || key || ''' as "' || key || '"', ',' ORDER BY key)
                      FROM fields
                )
        || ' FROM $cmd$ || quote_ident(tname) || $cmd$ , jsonb_each(jsonb) AS jf ORDER BY 1, 2, 3;'$cmd$ INTO cmd;
    RETURN;
END;
$sql$ LANGUAGE plpgsql;

SELECT * FROM build_query('t1');
                                                                                               cmd                                                                                               
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 SELECT id, jf.key as sibling, jf.value->>'first_name' as "first_name",jf.value->>'last_name' as "last_name",jf.value->>'x_name' as "x_name" FROM t1 , jsonb_each(jsonb) AS jf ORDER BY 1, 2, 3;
(1 row)
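
From psql, the function can be combined with the \gset step shown earlier. A short sketch, assuming build_query and t1 exist as defined above:

```sql
-- psql session sketch: \gset ends the statement (no semicolon) and stores
-- the single result column in the psql variable "cmd".
SELECT cmd FROM build_query('t1') \gset
-- Interpolating the variable sends the generated statement to the server.
:cmd
```

A client in another language would do the same in two round trips: fetch the text returned by build_query, then execute that text as a second statement.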

1 Comment

Nice, but this is not recursive: it only selects level 0+1 JSON nodes, and it hangs if a row is missing a level 0+1 JSON node:
