I have a SQLite table where one column contains a JSON array containing 0 or more values. Something like this:

id|values
0 |[1,2,3]
1 |[]
2 |[2,3,4]
3 |[2]

What I want to do is "unfold" this into a list of all distinct values contained within the arrays of that column.

To start, I am using the JSON1 extension's json_each function to extract a table of values from a row:

SELECT
  value
FROM
  json_each(
      (
        SELECT
          "values"
        FROM
          my_table
        WHERE
          id == 2
      )
  )

Where I can vary the id (2, above) to select any row in the table.

Now, I am trying to wrap this in a recursive CTE so that I can apply it to each row across the entire table and union the results. As a first step I replicated (roughly) the results from above as follows:

WITH RECURSIVE result AS (
  SELECT null
  UNION ALL
  SELECT
    value
  FROM
      json_each(
          (
            SELECT
              "values"
            FROM
              my_table
            WHERE
              id == 2
          )
      )  
)
SELECT * FROM result;

As the next step, I had originally planned to make id a variable and increment it (in a similar manner to the first example in the documentation), but I haven't been able to get that to work.

I have gone through the other examples in the documentation, but they are somewhat more complex and I haven't been able to distill those down to see how they might apply to this problem.

Can someone provide a simple example of how to solve this (or a similar problem) with a recursive CTE?

Of course, my goal is to solve the problem with or without CTEs, so I'm also happy to hear if there is a better way.

2 Answers

You do not need a recursive CTE for this.

To call json_each for multiple source rows, use a join:

SELECT t1.id, t2.value
FROM my_table AS t1
JOIN json_each((SELECT "values" FROM my_table WHERE id = t1.id)) AS t2;
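To get from that join to the list of distinct values the question asks for, adding DISTINCT should be enough. Here is a minimal sketch using Python's built-in sqlite3 module, assuming the underlying SQLite build has the JSON1 functions available. The sample rows are the ones from the question. Passing the column straight to json_each (instead of the correlated subquery) is my simplification, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# "values" is an SQL keyword, so it is quoted wherever it is used as a column name.
conn.execute('CREATE TABLE my_table (id INTEGER PRIMARY KEY, "values" TEXT)')
conn.executemany(
    'INSERT INTO my_table (id, "values") VALUES (?, ?)',
    [(0, "[1,2,3]"), (1, "[]"), (2, "[2,3,4]"), (3, "[2]")],
)

# json_each is evaluated once per row of my_table; DISTINCT collapses the
# repeated elements across rows. The empty array in row 1 contributes nothing.
distinct_values = [
    row[0]
    for row in conn.execute(
        """
        SELECT DISTINCT je.value
        FROM my_table AS t
        JOIN json_each(t."values") AS je
        ORDER BY je.value
        """
    )
]

print(distinct_values)  # prints [1, 2, 3, 4]
```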

1 Comment

Thanks, your solution helped me with a case where I was targeting values of a nested array. Ended up with two joins, the second one another json_each with t2.value as X argument: join json_each(t2.value, '$.names') as t3.

I had a similar problem. In my case the input was a report in JSON format (using format -f json), which is nested multiple times like this:

{
  "formatVersion": 0, 
  "pmdVersion": "7.0.0-rc3", 
  "timestamp": "2023-08-19T01:11:39.497+02:00",
  "files": [
     { "filename": "D:\\A.cls", "violations": [ {v1}, {v2}, {v3} ]},
     { "filename": "D:\\B.cls", "violations": [ {v4}, {v5}, {v6} ]}
  ],
  "suppressedViolations": [],
  "processingErrors": [],
  "configurationErrors": []
}

Each file [f1, f2] inside the array files: [] can have n violations [v1, v2, v3].

The SQL I used to extract the JSON objects and some of their properties looks like this:

Using a Common Table Expression

WITH CTE1 AS(
 SELECT PMD_Reports.id, PMD_Reports.report_date, jsonEachFiles.value as File
     FROM PMD_Reports
          ,json_each(PMD_Reports.format_json, '$.files') AS jsonEachFiles
)
SELECT 
    CTE1.id, 
    CTE1.report_date, 
    substr(json_extract(CTE1.File, '$.filename'), 28) as filename,
    json_extract(jsonEachViolation.value, '$.ruleset') as ruleset,
    json_extract(jsonEachViolation.value, '$.rule') as rule,  
    json_extract(jsonEachViolation.value, '$.priority') as priority,
    json_extract(jsonEachViolation.value, '$.description') as description       
    FROM CTE1
         ,json_each(CTE1.File, '$.violations') AS jsonEachViolation

Using a subquery

SELECT ruleset, rule, priority, Count(Distinct filename) as files_affected FROM
(
  SELECT 
     T1.id,
     substr(json_extract(T1.Files, '$.filename'), 28) as filename, 
     json_extract(T2.value, '$.ruleset') as ruleset,
     json_extract(T2.value, '$.rule') as rule,  
     json_extract(T2.value, '$.priority') as priority,
     json_extract(T2.value, '$.description') as description
    FROM (
          SELECT PMD_Reports.id, jsonEachFiles.Value as Files
          FROM PMD_Reports
          ,json_each(PMD_Reports.format_json, '$.files') AS jsonEachFiles           
    ) AS T1, json_each(T1.Files, '$.violations') as T2
) as T3
GROUP BY T3.ruleset, T3.rule, T3.priority

Broken down

  • json_each(PMD_Reports.format_json, '$.files') unwraps the files-array and results in one row per file.
  • json_each(T1.Files, '$.violations') unwraps each violation for every file
  • json_extract(T2.value, '$.priority') as priority reads the property priority of a violation (T2.value contains the violation object)
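
The three steps above can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module and assumes the SQLite build has the JSON1 functions available. The table and column names follow the queries above, but the report content (file names, rules, priorities) is made-up sample data, much smaller than a real report.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PMD_Reports ("
    "id INTEGER PRIMARY KEY, report_date TEXT, format_json TEXT)"
)

# Hypothetical report: two files, three violations in total.
report = {
    "files": [
        {"filename": "A.cls", "violations": [
            {"rule": "R1", "priority": 3},
            {"rule": "R2", "priority": 1},
        ]},
        {"filename": "B.cls", "violations": [
            {"rule": "R1", "priority": 3},
        ]},
    ]
}
conn.execute(
    "INSERT INTO PMD_Reports (id, report_date, format_json) VALUES (?, ?, ?)",
    (1, "2023-08-19", json.dumps(report)),
)

# The first json_each yields one row per file; the second takes that file
# object (f.value) and yields one row per violation inside it.
rows = conn.execute(
    """
    SELECT
        json_extract(f.value, '$.filename') AS filename,
        json_extract(v.value, '$.rule')     AS rule,
        json_extract(v.value, '$.priority') AS priority
    FROM PMD_Reports AS r,
         json_each(r.format_json, '$.files') AS f,
         json_each(f.value, '$.violations')  AS v
    ORDER BY filename, rule
    """
).fetchall()

for row in rows:
    print(row)
```

Each output row pairs a file name with one of its violations, so a file with n violations appears n times.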
