
My database has one very large table with over 2 billion rows with 3 columns. Id(uniqueidentity), Type(int, between 0-10. 0 = most used. 10 = least used), Data(Binary data between 1-10MB)

What are some ways I can optimize this database? (primarily select queries)

*Note: I might add a few more columns to this table later (eg: location, date...)

  • What version and edition are you using? Some ideas would be enterprise edition only. Commented Dec 8, 2010 at 23:49
  • Can you provide some kinds of examples on how you query this data? By type? By ID? Commented Dec 9, 2010 at 0:11
  • Select * from DataSource where Id = ... Commented Dec 9, 2010 at 0:19
  • Type is a number between 0-10 (most - least) that represents the likelihood of selecting that row. Commented Dec 9, 2010 at 0:22
  • If you only need to show Id and Type (e.g. in a list), avoid using SELECT * ... - that will always select everything, including your 10 MB of data. Use SELECT Id, Type FROM ... instead; that alone should speed up those kinds of queries (e.g. for a list) by orders of magnitude! Commented Dec 9, 2010 at 6:43
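The projection advice in the last comment can be sketched as follows. The table and index names are assumed for illustration; the question only gives the column names:

```sql
-- Selecting only the narrow columns avoids reading the 1-10 MB Data value,
-- which SQL Server stores off-row in LOB pages for varbinary(max):
SELECT Id, Type
FROM dbo.DataSource
WHERE Id = @Id;

-- A narrow nonclustered index can satisfy this query entirely,
-- without ever touching the pages that hold the Data column:
CREATE NONCLUSTERED INDEX IX_DataSource_Id_Type
    ON dbo.DataSource (Id)
    INCLUDE (Type);
```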

2 Answers


Assuming that the id column is the clustered index key, and assuming that by uniqueidentity you mean uniqueidentifier:

  • do you need the uniqueidentifier type? Why?
  • What other alternatives have you considered?
  • Do you populate the data using sequential GUIDs or not?

GUIDs are a notoriously poor choice for clustered keys. See GUIDs as PRIMARY KEYs and/or the clustering key for a more detailed discussion:

But, a GUID that is not sequential - like one whose values are generated on the client (using .NET) or by the NEWID() function (in SQL Server) - can be a horribly bad choice, primarily because of the fragmentation it creates in the base table, but also because of its size. It is unnecessarily wide (4 times wider than an int-based identity, which can give you 2 billion (really, 4 billion) unique rows). And if you need more than that, you can always go with a bigint (8-byte int) and get 2^63-1 rows.

Also read Disk space is cheap...That's not the point! as a follow up.
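If the key must remain a GUID, a sequential one avoids the random page splits that NEWID() values cause. The sketch below assumes table and constraint names not given in the question; note that NEWSEQUENTIALID() can only be used in a DEFAULT constraint:

```sql
CREATE TABLE dbo.DataSource
(
    Id   uniqueidentifier NOT NULL
         CONSTRAINT DF_DataSource_Id DEFAULT NEWSEQUENTIALID(),
    Type int NOT NULL,
    Data varbinary(max) NOT NULL,
    CONSTRAINT PK_DataSource PRIMARY KEY CLUSTERED (Id)
);

-- Alternatively, an integer surrogate key is a quarter (or half) the width:
-- Id bigint IDENTITY(1,1) gives up to 2^63 - 1 rows.
```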

Other than this, you need to do your homework and post the required details for such a question: exact table and index definitions, and the prevalent data access patterns (by key, by range, filters, sort order, joins, etc.).

Have you done any work to identify problems so far? If not, start with Waits and Queues, a proven methodology to identify performance bottlenecks. Once you measure and find places that need improvement, we can advise how to improve.
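A common starting point for the Waits and Queues methodology is to look at the top waits on the instance. This query uses the real sys.dm_os_wait_stats DMV; the list of benign idle waits to exclude is abbreviated here for illustration:

```sql
SELECT TOP (10)
    wait_type,
    wait_time_ms,
    waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN ('SLEEP_TASK', 'LAZYWRITER_SLEEP',
                        'BROKER_TASK_STOP', 'SQLTRACE_BUFFER_FLUSH')
ORDER BY wait_time_ms DESC;
```

Whichever wait types dominate point to the bottleneck class (I/O, locking, memory, CPU) worth investigating first.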


2 Comments

+1. Like the Tripp link: "Disk space is cheap...That's not the point!"
+1 gotta love Kim Tripp's insights! GUIDs as clustering keys should be prohibited by SQL Server itself.
  • Add index(es). Decide which column(s) make the most appropriate clustered index key.

  • Decide whether storing 10 MB of binary data in each (otherwise small) row is a good use of a database.
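One way to act on the second bullet is to keep the wide binary value out of the hot table entirely (vertical partitioning). The table names below are assumptions for illustration:

```sql
-- Blob moved to its own table, keyed by the same Id:
CREATE TABLE dbo.DataSourceBlob
(
    Id   uniqueidentifier NOT NULL
         CONSTRAINT PK_DataSourceBlob PRIMARY KEY,
    Data varbinary(max) NOT NULL
);

-- Queries that only need Id/Type never touch the blob pages;
-- fetch Data with a join only when it is actually required:
SELECT d.Id, d.Type, b.Data
FROM dbo.DataSource AS d
JOIN dbo.DataSourceBlob AS b ON b.Id = d.Id
WHERE d.Id = @Id;
```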

[Updated in response to Remus's comment]

1 Comment

There are very few scenarios where partitioning can benefit performance, and they almost always revolve around switch-in/switch-out data transfer for ETL or for retention/archiving. In general, partitioning will hurt performance. If you think about partition elimination: anything partitioning can do, an index can do better. Choosing a proper clustered index will run circles around partitioning any time from a performance POV. My 2c.
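The "an index can do better" point can be illustrated with a sketch. Assuming (hypothetically, since the question's clustered key is Id) that queries filter on Type, a clustered index leading on that column gives the same pruning as partition elimination, at finer grain:

```sql
-- Sketch only: this conflicts with clustering on Id and is shown
-- purely to contrast indexing with partitioning.
CREATE CLUSTERED INDEX CIX_DataSource_Type_Id
    ON dbo.DataSource (Type, Id);

-- This becomes a narrow range seek on the Type = 0 rows, rather than
-- a scan of one (still very large) partition:
SELECT Id
FROM dbo.DataSource
WHERE Type = 0;
```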
