I have a data structure like the one below:

Products

| _id |  name  | available_in_region_id |
-----------------------------------------
| d22 |  shoe  | c32, a43, x53          |
| t64 |  hat   | c32, f42               |

Regions

| _id |  name       |
---------------------
| c32 |  london     |
| a43 |  manchester |
| x53 |  bristol    |
| f42 |  liverpool  |

I want to look up each id in the "available_in_region_id" list and replace it with the corresponding region name, producing a table like the one below:

| _id |  name  | available_in_region_name    |
----------------------------------------------
| d22 |  shoe  | london, manchester, bristol |
| t64 |  hat   | london, liverpool           |

What is the best way to do this using standard SQL?

Thanks,

A

1 Answer

The query below is for BigQuery Standard SQL:

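#standardSQL
SELECT p._id, p.name,
  -- Rebuild the list with region names, keeping the original order of the ids
  STRING_AGG(r.name, ', ' ORDER BY OFFSET) AS available_in_region_name
FROM `project.dataset.Products` p,
-- Split the comma-separated ids into one row per id; OFFSET is its position in the list
UNNEST(SPLIT(available_in_region_id, ', ')) rid WITH OFFSET
LEFT JOIN `project.dataset.Regions` r
ON rid = r._id
GROUP BY p._id, p.name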

You can test and play with the query above using the sample data from your question, as in the example below:

#standardSQL
WITH `project.dataset.Products` AS (
  SELECT 'd22' _id, 'shoe' name, 'c32, a43, x53' available_in_region_id UNION ALL
  SELECT 't64', 'hat', 'c32, f42'
), `project.dataset.Regions` AS (
  SELECT 'c32' _id, 'london' name UNION ALL
  SELECT 'a43', 'manchester' UNION ALL
  SELECT 'x53', 'bristol' UNION ALL
  SELECT 'f42', 'liverpool' 
)
SELECT p._id, p.name, 
  STRING_AGG(r.name, ', ' ORDER BY OFFSET) AS available_in_region_name
FROM `project.dataset.Products` p,
UNNEST(SPLIT(available_in_region_id, ', ')) rid WITH OFFSET
LEFT JOIN `project.dataset.Regions` r
ON rid = r._id
GROUP BY p._id, p.name

with the output:

Row _id name    available_in_region_name     
1   d22 shoe    london, manchester, bristol  
2   t64 hat     london, liverpool   

7 Comments

Thanks for the answer. The question may have been a bit misleading: the IDs are not numerical, they are MongoDB 12-byte ObjectIds, more like "3d42nd3nd32dd2278dn2". I'm assuming this won't work using "rid WITH OFFSET".
See the updated answer, which matches the changes in your question.
Thanks. I get an error however: No matching signature for function SPLIT for argument types: ARRAY<STRUCT<value STRING>>. The field in question is type RECORD mode REPEATED
Sorry, I am not from an SQL background, so it's all a bit of a black box to me.
Sure, I will. To vote up the answer, click the up arrow to the left of it. Please do.
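
On the SPLIT error reported in the comments: if available_in_region_id is a REPEATED RECORD, it is already an array, so SPLIT should not be needed and UNNEST can be applied to the field directly. Below is a minimal sketch, assuming the field has the shape ARRAY<STRUCT<value STRING>> shown in the error message. The sample rows and the pos alias are made up for illustration and are not taken from the real table.

```sql
#standardSQL
WITH `project.dataset.Products` AS (
  -- Assumed shape: a repeated record with a single STRING field named value
  SELECT 'd22' AS _id, 'shoe' AS name,
    [STRUCT('c32' AS value), STRUCT('a43' AS value), STRUCT('x53' AS value)] AS available_in_region_id
  UNION ALL
  SELECT 't64', 'hat', [STRUCT('c32' AS value), STRUCT('f42' AS value)]
), `project.dataset.Regions` AS (
  SELECT 'c32' AS _id, 'london' AS name UNION ALL
  SELECT 'a43', 'manchester' UNION ALL
  SELECT 'x53', 'bristol' UNION ALL
  SELECT 'f42', 'liverpool'
)
SELECT p._id, p.name,
  -- pos is the position of each element in the array, used to keep the original order
  STRING_AGG(r.name, ', ' ORDER BY pos) AS available_in_region_name
FROM `project.dataset.Products` p,
-- The field is already an array, so UNNEST it directly instead of calling SPLIT
UNNEST(p.available_in_region_id) rid WITH OFFSET AS pos
LEFT JOIN `project.dataset.Regions` r
-- Each element is a struct, so compare its value field with the region id
ON rid.value = r._id
GROUP BY p._id, p.name
```

If the nested field in the real table has a name other than value, use that name in the ON clause instead.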