0

I've got a csv extract of data from a system in this kind of format (but with 16 question, answer and score groups - not just 4):

ClientID MemberID DateCompleted Overall Score Category1 Summary Category2 Summary Category3 Summary Question1 Answer1 Question2 Answer2 Score 2 Question3 Answer3 Score 3 Question4 Answer4 Score 4
Client1 [email protected] 01/01/2023 12:00:00 100 100 100 100 How Old Are You? 25 What's your favourite colour? Blue 5 How long have you owned your house? 4 Years 4 Can you Sail? Yes 1
Client1 [email protected] 01/02/2023 12:00:00 36 61 20 27 How Old Are You? 30 What's your favourite colour? Green 4 How long have you owned your house? 10 Years 6 Can you Sail? No 0
Client2 [email protected] 01/01/2023 12:00:00 63 83 60 47 How Old Are You? 50 What's your favourite colour? Yellow 3 How long have you owned your house? Over 20 Years 10 Can you Sail? Yes 1
Client2 [email protected] 01/02/2023 13:00:00 62 80 60 47 How Old Are You? 50 What's your favourite colour? Yellow 3 How long have you owned your house? Over 20 Years 10 Can you Sail? Yes 1

It's questionnaire data and I haven't got any say over the format. One quirk is that MemberID is the unique identifier for a person, and a person can complete the questionnaire multiple times (so MemberID and DateCompleted make a row unique). That also means that a unique member could have a different age recorded on different rows if they've done the questionnaire years apart. Question 1 will always be 'how old are you', Question 2 will always be 'What's your favourite colour' etc.

So. I've been asked to get it into power bi and report on it, including Age ranges (e.g. less than 30 / 30-49 / over 50) I've got a solution using power query that's got it into a reportable model (outlined below), but it feels a bit clunky and I'm sure there's a neater / more efficient way of doing it.

I want to transform it into this model: Power Bi Model

with the tables looking like this:

Questions:

QuestionID Question
1 How old are you?
2 What's your favourite colour?
3 How long have you owned your house?
4 Can you Sail?

Member:

ClientID MemberID Age AgeRange1 AgeRangeOrder1 AgeRange2 AgeRangeOrder2
Client1 [email protected] 25 Under 30 1 20-30 1
Client1 [email protected] 30 30-50 2 30-40 2
Client2 [email protected] 50 over 50 3 50-60 4
Client2 [email protected] 50 over 50 3 50-60 4

AnswerHistory:

MemberID MemberDatekey DateCompleted QuestionID Answer Score
[email protected] [email protected]/01/23 12:00:00 01/01/2023 12:00 1 25 0
[email protected] [email protected]/01/23 12:00:00 01/01/2023 12:00 2 Blue 5
[email protected] [email protected]/01/23 12:00:00 01/01/2023 12:00 3 4 Years 4
[email protected] [email protected]/01/23 12:00:00 01/01/2023 12:00 4 Yes 1
[email protected] [email protected]/02/23 12:00:00 01/02/2023 12:00 1 30 0
[email protected] [email protected]/02/23 12:00:00 01/02/2023 12:00 2 Green 4
[email protected] [email protected]/02/23 12:00:00 01/02/2023 12:00 3 10 Years 6
[email protected] [email protected]/02/23 12:00:00 01/02/2023 12:00 4 No 0
[email protected] [email protected]/01/23 12:00:00 01/01/2023 12:00 1 50 0
[email protected] [email protected]/01/23 12:00:00 01/01/2023 12:00 2 Yellow 3
[email protected] [email protected]/01/23 12:00:00 01/01/2023 12:00 3 Over 20 Years 10
[email protected] [email protected]/01/23 12:00:00 1/01/2023 12:00 4 No 1
[email protected] [email protected]/02/23 13:00:00 01/02/2023 13:00 1 50 0
[email protected] [email protected]/02/23 13:00:00 01/02/2023 13:00 2 Yellow 3
[email protected] [email protected]/02/23 13:00:00 01/02/2023 13:00 3 Over 20 Years 10
[email protected] [email protected]/02/23 13:00:00 01/02/2023 13:00 4 Yes 1

MemberSummaries:

memberid memberdatekey datecompleted overall score Category1 Summary Category2 Summary Category3 Summary
[email protected] [email protected]/01/23 12:00:00 01/01/2023 100 100 100 100
[email protected] [email protected]/02/23 12:00:00 01/02/2023 36 61 20 27
[email protected] [email protected]/01/23 12:00:00 01/01/2023 63 83 60 47
[email protected] [email protected]/02/23 13:00:00 01/02/2023 62 80 60 47

And the steps I took to get there in power query are:

  1. Create a new 'questions' table with a 'QuestionID' and a Question. Hardcoded this in from the questions but would like to make this dynamic.

  2. Duplicate the Main query, remove all the columns except memberid, date and Question1 / Answer 1

  3. Insert a 'QuestionID column' and fill it with "1"

  4. Repeat steps 2 & 3 for the other 15 questions so I end up with 16 queries (Q1 - Q16) with memberid, date, questionid, answer & score.

  5. Append Q1 - Q16 into a new query - "AnswerHistory"

  6. Merge memberid & date to get a unique key on AnswerHistory

  7. Duplicate the main query, remove all the columns except member identifiers, date and the the summary scores - new Query -"MemberSummaries"

  8. Merge memberID & date to get a unique key on MemberSummaries

  9. Duplicate the main query, remove everything except the member identifiers, client and age.

  10. Group to remove duplicates - new Query - "Member"

  11. make some new columns to split by various age groups **

    **I just realised I probably need the age stuff on MemberSummaries because it could change over time & if it does then I won't get unique members.

So this looks like a lot of steps

List of queries

Having loads of queries to split up the questions & re-merge them into something sensible (to me) seems like overkill.

Is there a way to pivot the original data into the "AnswerHistory" format in one query rather than in 16 separate ones?

Cheers

2
  • It would be simpler if you edit your question to show what the table should look like in Power Query at the end of your transformations, giving your starting data. You should focus on that first. As it is now, you are asking multiple questions and also asking for a judgement call, rather than posing a programming problem. Commented Jun 9, 2023 at 11:57
  • Fair point. I'll edit now. ta Commented Jun 9, 2023 at 11:59

1 Answer 1

0

Edited to include the custom function
You can generate the "AnswerHistory" table from your original table with just a single query.

Algorithm should be explained by the notes in the query

let

//read in the initial data table
    Source = Excel.CurrentWorkbook(){[Name="Table8"]}[Content],

//Remove unneeded columns
    #"Cols to Remove" = List.Select(Table.ColumnNames(Source), each _ = "Overall Score" or _ = "ClientID" or Text.EndsWith(_,"Summary")),
    #"Remove Cols" = Table.RemoveColumns(Source,#"Cols to Remove"),

//Add the memberdatekey column
//   This is not needed to develop this particular table
//and set it into the proper position
    #"Add MemberDatekey" = Table.AddColumn(#"Remove Cols", "MemberDatekey", 
        each [MemberID] & DateTime.ToText([DateCompleted],"MM/dd/yyyy" & "#(lf)" & "hh:mm"), type text),
    #"Reordered Columns" = Table.ReorderColumns(#"Add MemberDatekey", {"MemberID", "MemberDatekey"} & List.RemoveFirstN(Table.ColumnNames(#"Remove Cols"),1)),

//Unpivot columns except for the first three
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Reordered Columns", {"MemberID", "MemberDatekey", "DateCompleted"}, "Attribute", "Value"),

//Split column contents so as to create unique values to pivot on
//Note renaming of the split columns in the function from the default
    #"Split Column by Character Transition" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", 
        Splitter.SplitTextByCharacterTransition((c) => not List.Contains({"0".."9"}, c), {"0".."9"}), {"Attribute", "QuestionID"}),

//Using a custom function, Pivot the "Attribute" column
//  custom function needed because the basic Pivot with no aggregation function will return an error.
//  This can also be done using native functions, if the custom function is too slow
    #"Pivot No Aggregation" = fnPivotNoAggregation(#"Split Column by Character Transition","Attribute","Value"),

//Remove the Question text column as it is no longer needed
    #"Removed Columns" = Table.RemoveColumns(#"Pivot No Aggregation",{"Question"}),

//Set the data types for the columns
    #"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"MemberID", type text}, {"MemberDatekey", type text}, {"DateCompleted", type datetime}, {"QuestionID", Int64.Type}, {"Answer", type any}, {"Score ", Int64.Type}}),

//Replace the null score generated for "age" by 0 as per your spec
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type",null,0,Replacer.ReplaceValue,{"Score "})
in
    #"Replaced Value"

Custom Function
Enter as Blank Query and Rename: fnPivotNoAggregation

//credit: Cam Wallace  https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/

//Rename:  fnPivotNoAggregation

(Source as table,
    ColToPivot as text,
    ColForValues as text)=> 

let
     PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
     #"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
 
    TableFromRecordOfLists = (rec as record, fieldnames as list) =>
    
    let
        PartialRecord = Record.SelectFields(rec,fieldnames),
        RecordToList = Record.ToList(PartialRecord),
        Table = Table.FromColumns(RecordToList,fieldnames)
    in
        Table,
 
    #"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
    #"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
    #"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
    #"Expanded Values"

If you desire the clients to come out in a specific order, some sorting steps may need to be added

enter image description here

Sign up to request clarification or add additional context in comments.

4 Comments

Hi, Thanks for that - there's a lot of stuff for me to look at - a couple of lightbulb moments about how power bi works as well which is cool. One question - the custom query - fnPivotNoAggregation - I assumer that's using native functions wrapped together - so if it's too slow you could achieve the same thing by using the native functions one by one?
@yonabout It seems I neglected to include the custom function. I will edit my answer in the next hour to include both that and an alternative method. Sorry.
@yonabout Custom function included now.
thanks so much for this. I'd just been building queries using the built in transformations - hadn't even touched the advanced editor before. Working through your query & the function I'm learning loads about how M works, and how the steps interact with each other - which I hadn't properly appreciated until now. Thanks again

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.