4

The small query below outputs a row number in the RowNumber column by partitioning on the LegKey column and ordering by UpdateID DESC, so that the most recently updated row (highest UpdateID) per LegKey is always number 1.

SELECT *
, ROW_NUMBER() OVER(PARTITION BY LegKey ORDER BY UpdateID DESC) AS RowNumber 
FROM Data.Crew

Output:

UpdateID    LegKey  OriginalSourceTableID   UpdateReceived          RowNumber
7359        6641    11                     2016-08-22 16:35:27.487  1
7121        6641    11                     2016-08-15 00:00:47.220  2
8175        6642    11                     2016-08-22 16:35:27.487  1
7122        6642    11                     2016-08-15 00:00:47.220  2
8613        6643    11                     2016-08-22 16:35:27.487  1
7123        6643    11                     2016-08-15 00:00:47.220  2

The problem with this method is slow performance, which I assume is caused by the ORDER BY.

Is there an alternative way to produce a similar result that runs faster? I thought MAX() might work, but I didn't get the same output as before. Maybe I wrote the MAX() statement incorrectly, so if it is a good alternative, could somebody show how they would write it for this example?

Thank you

9
  • What indices do you have on this table? Commented Oct 30, 2016 at 23:01
  • @Cory I have a non-unique, nonclustered index on the LegKey column and a clustered index on the primary key, UpdateID. Just so you know, I don't have the rights to change the indexes on these tables. Commented Oct 30, 2016 at 23:07
  • You may refer to my answer at stackoverflow.com/questions/39933458/… where I explained four different methods for a case similar to yours. The MaxDate approach is shown in Methods 2 and 4. Commented Oct 30, 2016 at 23:08
  • @AhmedSaeed If you don't mind, could you apply Methods 2 and 4 to my example in an answer, so I can be sure I understand how they work? Commented Oct 30, 2016 at 23:14
  • In my experience, ROW_NUMBER is faster than any other method. So I suggest you first test the other methods and see if anything is obviously faster (doubtful). Then, if you want better performance, you need to take a look at the query plan, and yes, you will most likely need to add or change indexes. Is this really your full query, or are you also filtering on other things and/or filtering on the result of ROW_NUMBER? Commented Oct 30, 2016 at 23:32

2 Answers

5

Presumably this is the query you want to optimize:

SELECT c.*
FROM (SELECT c.*,
             ROW_NUMBER() OVER (PARTITION BY LegKey ORDER BY UpdateID DESC) AS RowNumber 
      FROM Data.Crew c
     ) c
WHERE RowNumber = 1;

Try an index on Crew(LegKey, UpdateId).

This index will also be used if you do:

SELECT c.*
FROM Data.Crew c
WHERE c.UpdateId = (SELECT MAX(c2.UpdateId)
                    FROM Data.Crew c2
                    WHERE c2.LegKey = c.LegKey
                   );
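
As a rough sketch, the index suggested above might be created as follows. The index name IX_Crew_LegKey_UpdateID is made up for illustration, and this assumes someone with the right permissions can run it (the asker notes they cannot change indexes themselves).

```sql
-- Hypothetical index name; assumes permission to create indexes on Data.Crew.
-- LegKey leads so that rows for each LegKey are stored together in the index,
-- ordered by UpdateID within each LegKey.
CREATE NONCLUSTERED INDEX IX_Crew_LegKey_UpdateID
    ON Data.Crew (LegKey, UpdateID);
```

Whether the optimizer actually uses it for either query is best confirmed from the execution plan.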

2

You can try one of the following:

declare @Table table(UpdateID int,   LegKey int,  OriginalSourceTableID int,  UpdateReceived datetime)
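
To try the queries against the rows shown in the question, the table variable could be filled along these lines. This is only a sketch: the values are copied from the question's output, and the statement has to run in the same batch as the declaration above.

```sql
-- Sample rows copied from the question's output.
-- ISO 8601 literals (with 'T') are used so the dates parse the same way
-- regardless of the session's language or date format settings.
INSERT INTO @Table (UpdateID, LegKey, OriginalSourceTableID, UpdateReceived)
VALUES (7359, 6641, 11, '2016-08-22T16:35:27.487'),
       (7121, 6641, 11, '2016-08-15T00:00:47.220'),
       (8175, 6642, 11, '2016-08-22T16:35:27.487'),
       (7122, 6642, 11, '2016-08-15T00:00:47.220'),
       (8613, 6643, 11, '2016-08-22T16:35:27.487'),
       (7123, 6643, 11, '2016-08-15T00:00:47.220');
```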

The first option uses the maximum UpdateReceived date in a correlated subquery.

select * from @Table as a where a.UpdateReceived = (Select MAX(UpdateReceived) from @Table as b Where b.LegKey = a.LegKey)

The second option computes the maximum date per LegKey in a CTE with GROUP BY and joins back to the table.

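-- Leading semicolon: WITH must follow a terminated statement in the same batch.
;with MaxDate as (
    select LegKey, max(UpdateReceived) as MaxDate
    from @Table group by LegKey
)
select b.*   -- only the table's columns, not LegKey twice plus MaxDate
from MaxDate as a
inner join @Table as b
     on b.LegKey = a.LegKey
    and b.UpdateReceived = a.MaxDate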
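
Both queries above pick the latest row by UpdateReceived, whereas the question ranks by UpdateID. The two agree on the sample data, but if two rows for the same LegKey ever shared an UpdateReceived value, both would be returned. A sketch of the same GROUP BY idea keyed on UpdateID instead, assuming (as the asker's comment says) that UpdateID is the primary key of Data.Crew:

```sql
-- Latest UpdateID per LegKey, then join back to fetch the full row.
-- Assumes UpdateID is unique (the primary key), so each LegKey yields one row.
WITH Latest AS (
    SELECT LegKey, MAX(UpdateID) AS MaxUpdateID
    FROM Data.Crew
    GROUP BY LegKey
)
SELECT c.*
FROM Latest AS l
INNER JOIN Data.Crew AS c
    ON c.UpdateID = l.MaxUpdateID;
```

This should return the same rows as filtering the ROW_NUMBER() query on RowNumber = 1; whether it is faster depends on the available indexes, so it is worth comparing the execution plans.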
