I am joining two tables that each have hundreds of similarly named columns. I would like to rename every column in each table to include the table name. To keep the query simple, I do not want to list each column name explicitly. The query below accomplishes this, but it is extremely slow on large datasets. I assume the slowdown comes from replace_regex() running over every row of the dataset. Is there another way to achieve the same result with better performance on large datasets?

let T1 = datatable (Key:string , Col2:string , Col3:string )
[
  "1", "b", "c",
  "2", "e", "f",
  "3", "h", "i"
] 
| project PackedRecord = todynamic(replace_regex(tostring(pack_all()), '"([a-zA-Z0-9_]*)":"', @'"T1_\1":"'))
| evaluate bag_unpack(PackedRecord);
let T2 = datatable (Key:string , Col2:string , Col3:string )
[
  "1", "B", "C",
  "2", "E", "F",
  "4", "H", "I"
] 
| project PackedRecord = todynamic(replace_regex(tostring(pack_all()), '"([a-zA-Z0-9_]*)":"', @'"T2_\1":"'))
| evaluate bag_unpack(PackedRecord);
let JoinTable = T1 | join kind=inner T2 on $left.T1_Key == $right.T2_Key;
JoinTable

Previous Question for Reference

Rename all column names by adding a string in KQL/Kusto/Data Explorer

1 Answer

You can achieve the same result without replace_regex() by relying on the OutputColumnPrefix argument of the bag_unpack plugin. I modified a couple of lines from your original KQL snippet.

According to the Kusto documentation, the OutputColumnPrefix argument specifies a common prefix to add to all columns produced by the plugin.

let T1 = datatable (Key:string , Col2:string , Col3:string )
[
  "1", "b", "c",
  "2", "e", "f",
  "3", "h", "i"
]
| project Key, PackedRecord = pack_all() // keep the original Key next to the packed row
| evaluate bag_unpack(PackedRecord, OutputColumnPrefix = "T1_")
| project-away T1_Key; // drop the prefixed duplicate of Key
let T2 = datatable (Key:string , Col2:string , Col3:string )
[
  "1", "B", "C",
  "2", "E", "F",
  "4", "H", "I"
]
| project Key, PackedRecord = pack_all() // keep the original Key next to the packed row
| evaluate bag_unpack(PackedRecord, OutputColumnPrefix = "T2_")
| project-away T2_Key; // drop the prefixed duplicate of Key
T1
| join kind=inner T2 on $left.Key == $right.Key
| project-away Key1 // get rid of the second key after the join
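
As a possible variation on the query above, the extra project-away steps might be avoided by telling bag_unpack to skip the Key property while unpacking. The sketch below is untested and assumes that the optional columnsConflict and ignoredProperties arguments listed in the bag_unpack documentation are available in your environment; they are passed positionally here, after the prefix.

```kusto
let T1 = datatable (Key:string, Col2:string, Col3:string)
[
  "1", "b", "c",
  "2", "e", "f",
  "3", "h", "i"
]
| project Key, PackedRecord = pack_all()
// Assumption: the 4th argument (ignoredProperties) skips "Key" during unpacking,
// so no T1_Key column is produced. "error" is the documented default for columnsConflict.
| evaluate bag_unpack(PackedRecord, "T1_", "error", dynamic(["Key"]));
let T2 = datatable (Key:string, Col2:string, Col3:string)
[
  "1", "B", "C",
  "2", "E", "F",
  "4", "H", "I"
]
| project Key, PackedRecord = pack_all()
// Same assumption as above, with the T2_ prefix.
| evaluate bag_unpack(PackedRecord, "T2_", "error", dynamic(["Key"]));
T1
| join kind=inner T2 on Key
| project-away Key1 // the join still emits the right-hand Key as Key1
```

Whether this is faster than dropping the prefixed key afterwards is not something the original answer measured, so it is worth comparing both forms on a representative dataset.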

1 Comment

degant, this is awesome! At a minimum, this cleans up the query considerably. It also appears to improve performance. Thank you!
