I want to take the JSON file below (people.json), which sits in a Data Lake queried from Synapse,
{
"people":
[
{ "age": 25,
"id": 2,
"info": {
"name": "John",
"surname": "Smith" }},
{ "dob": "2005-11-04T12:00:00",
"id": 5,
"info": {
"name": "Jane",
"skills": ["SQL","C#","Azure"],
"surname": "Smith"}}
]
}
and query it into the dataset below. In particular, the last column, skills, should hold all of a person's skills as a JSON array stored as text.
The reason for saving it like this is that it will make life easier for me when processing the result in a Logic App.
| id | name | surname | age | skills |
|---|---|---|---|---|
| 2 | John | Smith | 25 | null |
| 5 | Jane | Smith | null | ["SQL","C#","Azure"] |
So far I have a SQL query that pulls out the id, name, surname and age columns,
but I cannot figure out how to query the skills array into a text column.
SELECT
rows.filename() as FileName,
rows.filepath() as FilePath,
ID,
[NAME],
surname,
age
FROM
OPENROWSET(
BULK 'https://<storageaccount>.dfs.core.windows.net/debug/people.json',
FORMAT = 'CSV',
FIELDQUOTE = '0x0b',
FIELDTERMINATOR ='0x0b',
ROWTERMINATOR = '0x0b'
)
WITH (
jsonContent varchar(MAX)
) AS [rows]
cross apply openjson(jsonContent, '$.people')
WITH (
Id INT '$.id',
[name] NVARCHAR(100) '$.info.name',
surname NVARCHAR(100) '$.info.surname',
age INT '$.age'
)
Would adding something like this to the second WITH clause work?
, skills nvarchar(max) '$.info.skills' as json
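For reference, here is a minimal sketch of that idea, not a confirmed solution. It assumes OPENJSON's AS JSON option is available in the pool being used, and it swaps the OPENROWSET source for an inline variable (@jsonContent, introduced only for this illustration) so it can be tried without the storage account. AS JSON requires the column to be declared NVARCHAR(MAX).

```sql
-- Inline copy of people.json; in the real query this value would come
-- from the jsonContent column returned by OPENROWSET.
DECLARE @jsonContent NVARCHAR(MAX) = N'{
  "people": [
    { "age": 25, "id": 2,
      "info": { "name": "John", "surname": "Smith" } },
    { "dob": "2005-11-04T12:00:00", "id": 5,
      "info": { "name": "Jane", "skills": ["SQL","C#","Azure"], "surname": "Smith" } }
  ]
}';

SELECT id, [name], surname, age, skills
FROM OPENJSON(@jsonContent, '$.people')
WITH (
    id      INT           '$.id',
    [name]  NVARCHAR(100) '$.info.name',
    surname NVARCHAR(100) '$.info.surname',
    age     INT           '$.age',
    -- AS JSON returns the array as raw JSON text instead of a scalar value;
    -- the column type must be NVARCHAR(MAX) for this option.
    skills  NVARCHAR(MAX) '$.info.skills' AS JSON
);
```

In the default lax path mode, a row without an info.skills property (John here) should come back with NULL in the skills column, while Jane's row should carry the array as text.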