1

I'm having trouble trying to optimize the following query for sql server 2005. Does anyone know how could I improve it. Each one of the tables used there have about 40 million rows each. I've tried my best trying to optimize it but I manage to do the exact opposite.

Thanks

SELECT
        cos
      , SIN
    FROM
        ConSisHis2005
    union all
    SELECT
        cos
      , SIN
    FROM
        ConSisHis2006
    union all
    SELECT
        cos
      , SIN
    FROM
        ConSisHis2007
    UNION ALL
    SELECT
        cos
      , SIN
    FROM
        ConSisHis2008

Maybe I should have said something else about the schema, all the tables used here are historical tables, they are not referenced to any other table. And theres already an index for cos and SIN. I was just wondering if there was any other way to optimize the query... as you can imagine 160millon records are hard to get :s

4
  • Optimizing queries is usually impossible without knowing the schema and what you are trying to achieve. Commented Nov 27, 2008 at 14:17
  • do you have/need duplicate entry? maybe you could filter that to get less rows? Commented Nov 27, 2008 at 14:39
  • no, there are no duplicate records in the tables. Even if there would be duplicate tables, getting rid of them would be even more expensive Commented Nov 27, 2008 at 14:42
  • What kind of a report would possibly need 160 million rows in it, with no totals, no groups, no sorting - what use would it possibly be? Commented Nov 28, 2008 at 5:39

7 Answers 7

2

It seems that the query is just combining the separated history tables into a single result set containing all the data. In that case the query is already optimal.

Sign up to request clarification or add additional context in comments.

Comments

2

Another approach would be to tackle the problem of why do you need to have all the 160 million rows? If you are doing some kind of reporting can you create separate reporting tables that already have some of the data aggregated. Or do you actually need a data warehouse to support your reporting needs.

1 Comment

i need those 160 million rows for reporting purposes, but its quite clear i'll have to come up with a different approach. I was trying to avoid having to do this, but i guess i'll have to...getting those 160 million rows fast is hard :p thanks
1

Put a composite index on cos and sin on each of the tables. That's as good as you're going to get without restructuring the table design (in this example, it looks like you should have just 1 table to begin with)

4 Comments

How would the indexes help since he's not filtering by anything.
Since you're only selecting those two columns then SQL server can just get the data directly from composite index instead of having to hit the actual data record. Will only be an improvement if you have other columns in your tables.
Thanks for the answer unfortunately the tables already have a composite index in the columns im fetching.
Then you really need to back up and ask yourself if the architecture you have selected for this table design/report logic is what you really want...
1

Since there is no WHERE clause, I don't believe there's anything you can do to improve the performance from this PoV.

You've correctly used UNION ALL so there's no help there.

The only other thing I can think of is whether there are more columns on the tables? If so, you might be fetching more data from disk than you need, thus slowing the query down.

1 Comment

the tables have about 10 more columns, the table has a composite index on those to columns... it looks it won't get any better than this thanks for the answer.
1

It might be worth experimenting with indexed views. You could put the above statement into a view with the indexes Dave suggested. This would take a little time to build initially but would return your results a little quicker (this is on the assumption that the data set does not change much and therefore you can live with the extra transactional overhead).

1 Comment

im going to try different indexes, but i don't think i'll experience any big improvements. thank you
0

You might consider using a single partitioned table with a year indicator.

I'm still curious - is this code in a view or SP which operates on 160m rows or is it actually going to return 160m rows down the wire. If so, that's an awful lot of data to return that's effectively an extract and it's going to take a while just to come down the wire.

Comments

0

There's no optimization to be done. Since you're selecting all the records from all the tables, by definition you get all the records from all the tables in one result set.

What's the reason for doing this?

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.