2

Kusto allows me to create summarize statistics sliced on some column based on the top on rows of a table ordered by some rule. For example, if I want to compute the average Score of each Location using the last 100 rows, I can write

T | top 100 by Time | summarize avg(Score) by Location

But I want the top to apply to each bin of the summarize. For example, I want to compute the average of the last 10 Scores available for each Location. I don't see how I can do this. At first I thought I might be able to use top-nested by it only returns the top n Locations; I want to return records for all Locations, and use the top n of each to compute an aggregate. I am willing to do this via something that effectively repeats a query for each Location, but I need the final result to be in one table with columns Location and AvgOfLast10Scores.

1
  • It's hard to understand what you're trying to achieve. Please edit your question to include a short sample input (in datatable format), and expected output. Commented Jan 16, 2021 at 7:18

2 Answers 2

2

I recommend using partition by then summarize:

let T = datatable(Timestamp:datetime, Location:string, Score: int) 
[
    "01/01/2021", "Texas", 1,
    "01/02/2021", "Texas", 2,
    "01/03/2021", "Texas", 4,
    "01/03/2021", "Alabama", 1,
    "01/04/2021", "Alabama", 3,
    "01/05/2021", "Alabama", 4,
];
T
| partition by Location (top 2 by Timestamp)
| summarize avg(Score) by Location

Results:

query results

You might also be able to use top-nested, but I find top-nested confusing. You can read more about partition in the MS documentation

Sign up to request clarification or add additional context in comments.

2 Comments

This is simpler than the other solution but currently works only if the number of distinct locations is 64 or less.
Thanks for this, and thanks to @Avnera for calling out the 64 row limitation.
1

Here is one option (in this example it will take the last two measurements):

datatable(Time:datetime, Location: string, Score:int)[datetime(2021-01-10), "a", 6, 
    datetime(2021-01-10 01:00), "a", 7,
    datetime(2021-01-10 02:00), "a", 8,
    datetime(2021-01-10 03:00), "a", 10,
    datetime(2021-01-10), "b", 10,
    datetime(2021-01-10 01:00), "b", 20,
    datetime(2021-01-10 02:00), "b", 30,
    datetime(2021-01-10 03:00), "b", 40]
| order by Location asc, Time desc 
| extend rowNumber = row_number(0, Location != prev(Location))
| where rowNumber <= 1
| summarize avg(Score) by Location

And the results: enter image description here

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.