0

So I have a dataset and it has a column which has data like this:

ID     values
0001   [
       {
        prices: {
          ia: '20K+',
          ln: '3K-10K'
        },
        formats: [
          'n',
          'ia'
        ],
        id: 'c8f4f498-1cfeaf1-455a-a5ac-310191wefw959583',
        image_url: 'file.jpg',
        slug: 'test1'
      },
       {
        prices: {
          ia: '20K+',
          ln: '3K-10K'
        },
        formats: [
          'n',
          'ia'
        ],
        id: 'c8f4f4wfwe98-1ca1-455a-a5ac-3101919wfewf59583',
        image_url: 'file.jpg',
        slug: 'test3'
      }
    ]
0002   [
       {
        prices: {
          ia: '20K+',
          ln: '3K-10K'
        },
        formats: [
          'n',
          'ia'
        ],
        id: 'c8f4feeee498-1ca1-455a-a5ac-3101919fwewf59583',
        image_url: 'file.jpg',
        slug: 'test2'
      }

All I care about in this variable is the slug. But different ID's have different number of slugs. How could I get a new df from this data to show me every slug as well as the count.

So the result I want:

  ID              slugs  slug_count
0001  ['test1','test3']           2
0002          ['test2']           1

1 Answer 1

3

Flatten and group:

with data as (
select $1 id, parse_json($2) j
from values(1, '[{"slug":"a"}]'), (2, '[{"slug":"a1"},{"slug":"a2"}]')
)

select id, array_agg(x.value:slug) slugs, count(*) slug_count
from data, table(flatten(j)) x
group by id

enter image description here

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.