I have the following data:
WITH data AS (
  SELECT 18 AS value, 1 AS id, "A" AS other_value
  UNION ALL SELECT 20 AS value, 1 AS id, "B"
  UNION ALL SELECT 22 AS value, 2 AS id, "C"
  UNION ALL SELECT 30 AS value, 3 AS id, "A"
  UNION ALL SELECT 37 AS value, 2 AS id, "B"
  UNION ALL SELECT 31 AS value, 2 AS id, "C"
  UNION ALL SELECT 42 AS value, 1 AS id, "D"
)
I am using the following query:
SELECT
  FIRST_VALUE(id) OVER w1 AS id,
  ARRAY_AGG(value) OVER w1 AS data,
  FIRST_VALUE(other_value) OVER w1 AS first_other_data,
  LAST_VALUE(other_value) OVER w1 AS last_other_data
FROM data
WINDOW w1 AS (
  PARTITION BY id
  ORDER BY value
  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
And I get:
id  data  first_other_data  last_other_data
1   18    A                 D
    20
    42
1   18    A                 D
    20
    42
1   18    A                 D
    20
    42
2   22    C                 B
    31
    37
2   22    C                 B
    31
    37
2   22    C                 B
    31
    37
3   30    A                 A
But I am getting duplicate rows that I don't want. I was thinking of using the DISTINCT keyword, but BigQuery does not like it. My expected result is:
id  data  first_other_data  last_other_data
1   18    A                 D
    20
    42
2   22    C                 B
    31
    37
3   30    A                 A
I have found similar questions, but none that covers exactly this case. Thanks.
EDIT: In my attempt to simplify the scenario for this SO question, I took out some essential components. I have updated the question with a more accurate version of my problem.
