SQL Primary key sort order

Question

I am using MS SQL server 2005

I have a table with 3 columns where I store user-message mapping like:

msg_for msg_from msg_id 
bob     bob      1 
bob     john     1 
bob     steve    1 
bob     bob      2 
bob     john     2 
bob     bob      3 
bob     john     3 
bob     steve    3

The PK is on 3 columns and msg_id is FK to messages table that stores the messages

The above is the physical storage I see according to the PK on 3 columns

Now my query MUST return the messages for a given user having latest msg at top (order by msg_id DESC)

bob john  3
bob steve 3
bob john  2
bob steve 2
bob john  1
bob steve 1

This mapping table has millions of rows. I see 95% of the cost is to SORT the result.

Is it possible to have the PK or some other way store data physically like this (avoid SORT)?

msg_for msg_from msg_id
bob     bob      3
bob     john     3
bob     steve    3
bob     bob      2
bob     john     2
bob     bob      1
bob     john     1
bob     steve    1

Thanks

Are you saying msg_for and msg_from columns are of varchar type? — CyberDude
– CyberDude, Commented Sep 3, 2010 at 19:52

Martin Smith · Accepted Answer · 2010-09-03 20:49:14Z

4

Yes.

When you set up the Primary Key (or any index) you can define this

ALTER TABLE dbo.[Messages] ADD CONSTRAINT [PK_Messages] PRIMARY KEY CLUSTERED 
(
    msg_for ASC, msg_from ASC, msg_id DESC
)

SQL Server can scan in either direction so it only makes sense if you want to control the sort order combination for multiple columns.

Edit: You say in the comments that the problem query is

select top 10 msg_id 
from message_user 
where msg_for = @user_name 
and msg_from <> @user_name 
order by msg_id DESC

The issue here isn't one of Ascending, Descending.

To give an analogy. Phone books are listed in surname, forename order but if you needed to know the lexicographically last 10 forenames in the directory you would need to scan the whole book. This would be unavoidable regardless of whether or not within each section forenames were listed in ascending or descending order.

Similarly the composite index keys would need to be msg_for, msg_id, msg_from to satisfy this query optimally not msg_for, msg_from, msg_id With this latter order it will still need to scan the whole section of the index satisfying the msg_for = @user_name criteria as it cannot know if there will be a later msg_id still to come belonging to a later msg_from Additionally regardless in which direction msg_id is sorted in their individual sub sections a sequential scan of the msg_for = @user_name part of the index will still require a sort as it they are fragmented by being in subsections according to msg_from.

edited Sep 3, 2010 at 20:49

answered Sep 3, 2010 at 19:57

Martin Smith

457k97 gold badges777 silver badges886 bronze badges

Sign up to request clarification or add additional context in comments.

10 Comments

kheya Over a year ago

Are there any cons of this alteration, if any?

Martin Smith Over a year ago

If you are going to alter the clustered PK on a large table you will need a maintenance window to do this in. You'd have to consider adverse effects on other queries that rely on the existing order. Can you post the actual query with this 'sort' operator?

kheya Over a year ago

This is the query that causes 95% cost on sorting: select top 10 msg_id from message_user where msg_for = @user_name and msg_from <> @user_name order by msg_id DESC Other queries: All of my queries on this table MUST return rows with msg_id in DESC order

Martin Smith Over a year ago

@Projapati - The issue here isn't one of Ascending, Descending. It is that the composite index keys would need to be msg_for, msg_id, msg_from to satisfy this query optimally not msg_for, msg_from, msg_id

CyberDude Over a year ago

Even so, the fact that the key columns are of varchar type has a large impact on performance. Even if the index would have the columns in the right order it's very resource intensive for the DBMS engine to keep it sorted with every table change because it needs to sort strings.

|

OMG Ponies · Accepted Answer · 2010-09-03 19:57:46Z

3

The only way to guarantee order in the result set is by using an ORDER BY.

In SQL Server, a clustered index can help... assuming the optimizer sees the index as being useful.

answered Sep 3, 2010 at 19:57

OMG Ponies

334k85 gold badges536 silver badges508 bronze badges

Comments

CyberDude · Accepted Answer · 2010-09-03 20:05:33Z

1

Well no wonder sorting takes forever. Varchar/string types are usually types that are very heavy when it comes to sorting, whether it's SQL or any programming language for that matter. Whenever possible use integral types for such things.

I suggest you use integral values to identify members. Have a Members table (MemberId INT, MemberName VARCHAR, etc), then a Messages table (MessageId INT, MessageBody VARCHAR, etc) and then have a join table, say Correspondence with (SenderMemberId INT, RecipientMemberId INT, MessageId INT). Sorting on integral values will be way faster this way.

I think you can easily refactor your data to fit on such a new structure.

answered Sep 3, 2010 at 20:05

CyberDude

9,2195 gold badges33 silver badges51 bronze badges

2 Comments

kheya Over a year ago

That will force me to join another table to see who is the user for that given id. I want to avoid join.

CyberDude Over a year ago

Joins are not really something to run away from. Of course having too many of them can have downsides but in the majority of cases they help a lot. Storing names as keys is an awful waste of storage space, not to mention redundancy (what if a member wishes to change his name?) Do a benchmark, see how the performance of the two methods compare.

Frank · Accepted Answer · 2010-09-03 19:47:55Z

0

Depending on your DBMS, you can use a clustered index to achieve that.

answered Sep 3, 2010 at 19:47

Frank

2,64816 silver badges14 bronze badges

3 Comments

kheya Over a year ago

As I said I have a PK on 3 columns. So I have the clusted index already

kheya Over a year ago

But the clusted index stores rows in msg_id in 1,2,3..not 3,2,1

Frank Over a year ago

When creating the index, you can in most DBMS add a DESC after a column to note descending instead of ascending order, and you can state the column order as well.

Collectives™ on Stack Overflow

SQL Primary key sort order

4 Answers 4

10 Comments

Comments

2 Comments

3 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

10 Comments

Comments

2 Comments

3 Comments

Your Answer

Sign up or log in

Post as a guest

Related