I've executed the query below in BigQuery:

SELECT ( --inner query
         SELECT STRING_AGG(c) FROM t1.array_column c
       ) 
FROM (
        select 1 as f1, ['1','2','3'] as array_column
        union all
        select 2 as f1, ['5','6','7'] as array_column
) t1;

I expected something like

Row|f0_
1  | 1,2,3,5,6,7

because there is no GROUP BY in the inner query, so I expected STRING_AGG to be evaluated over all the rows.

SELECT STRING_AGG(c) FROM t1.array_column c

Instead I'm getting something like this:

Row|f0_
1  |1,2,3
2  |5,6,7

I'm having trouble understanding why I get this result.

  • Your "inner query" is evaluated once for every row of the outermost query. That query reads from your other subquery, which has two rows. Commented Jan 1, 2019 at 22:16

2 Answers


This is your query:

SELECT (SELECT STRING_AGG(c) FROM t1.array_column c
       ) 
FROM (select 1 as f1, ['1', '2', '3'] as array_column
      union all
      select 2 as f1, ['5', '6', '7'] as array_column
     ) t1;

First, I'm surprised it works. I thought you needed unnest():

SELECT (SELECT STRING_AGG(c) FROM UNNEST(t1.array_column) c
       ) 

What is happening? Well, this would be more obvious if you selected f1. Then you would get:

1     1,2,3
2     5,6,7

This should make it clearer. For each row in t1 (and there are two rows), your code is:

  • unnesting the array into rows with a column called c.
  • reaggregating those rows into a string (with no name)
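
As a rough sketch of the point above (not part of the original answer), selecting f1 next to the subquery shows that the subquery runs once per row of t1. The aliases joined and off, and both ORDER BY clauses, are additions for illustration only; they are there to make the output order deterministic.

```sql
SELECT
  t1.f1,
  -- Correlated subquery: evaluated once for each row of t1,
  -- so STRING_AGG only ever sees the elements of that row's array.
  (SELECT STRING_AGG(c ORDER BY off)
   FROM UNNEST(t1.array_column) AS c WITH OFFSET AS off) AS joined
FROM (
  SELECT 1 AS f1, ['1', '2', '3'] AS array_column
  UNION ALL
  SELECT 2 AS f1, ['5', '6', '7'] AS array_column
) AS t1
ORDER BY t1.f1;
```

Because the aggregate sits inside the scalar subquery, the absence of GROUP BY only means "aggregate everything this subquery sees", which is a single array per outer row.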

If you want to combine the elements in the arrays, use array_concat_agg():

SELECT array_concat_agg(array_column)
FROM (select 1 as f1, ['1','2','3'] as array_column
      union all
      select 2 as f1, ['5','6','7'] as array_column
     ) t1;

If you want this represented as a string instead of an array, use array_to_string():

SELECT array_to_string(array_concat_agg(array_column), ',')
FROM (select 1 as f1, ['1','2','3'] as array_column
      union all
      select 2 as f1, ['5','6','7'] as array_column
     ) t1;

3 Comments

The "...For each row in t1..." helped me a lot: the inner query is executed for each row of the outer query. Many thanks. PS: I was also surprised it worked without the unnest.
The unnest was optional because the field path resolved to an array. From the documentation: "In implicit unnesting, array_path must resolve to an ARRAY and the UNNEST keyword is optional."
@MassyB . . . I haven't used implicit unnesting in BigQuery. Right now, I'm not inclined to use it, although perhaps I'll get used to it over time.

Below is for BigQuery Standard SQL

#standardSQL
SELECT STRING_AGG((SELECT STRING_AGG(c) FROM t1.array_column c)) 
FROM (
  SELECT 1 AS f1, ['1','2','3'] AS array_column UNION ALL
  SELECT 2 AS f1, ['5','6','7'] AS array_column
) t1

and produces

Row f0_  
1   1,2,3,5,6,7    

Note 1: you were almost there. You were only missing an outer STRING_AGG, which does the final aggregation of the per-row strings built from each row's array.

Note 2: because array_column is of ARRAY type, it is treated as an inner table referenced as t1.array_column. As such, FROM t1.array_column c is equivalent to FROM UNNEST(array_column) c. A very cool hidden feature :o)
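
A possible alternative, sketched here as an assumption rather than as part of the original answer, is to flatten the arrays with a correlated join first and then aggregate once over all the resulting rows. The aliases all_values and off and the ORDER BY inside STRING_AGG are illustrative additions, used to pin down the order of the concatenated elements.

```sql
SELECT
  -- One aggregate over every element of every array,
  -- ordered by source row and then by position in the array.
  STRING_AGG(c ORDER BY t1.f1, off) AS all_values
FROM (
  SELECT 1 AS f1, ['1', '2', '3'] AS array_column
  UNION ALL
  SELECT 2 AS f1, ['5', '6', '7'] AS array_column
) AS t1
-- Correlated join: each row of t1 is expanded into one row per array element.
CROSS JOIN UNNEST(t1.array_column) AS c WITH OFFSET AS off;
```

Here the aggregate belongs to the outer query, so with no GROUP BY it covers all flattened rows and should return a single row.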
