6

I came across this very odd situation, and i thought i would throw it up to the crowd to find out the WHY.

I have a query that was joining a table on a linked server:

select a.*, b.phone
from table_a a, 
join remote.table_b b on b.id = a.id
 (lots of data on A, but very few on B)

this query was talking forever (never even found out the actual run time), and that is when I noticed B had no index, so I added it, but that didn't fix the issue. Finally, out of desperation I tried:

select a.*, b.phone
from table_a a, 
join (select id, phone from remote.B) as b on b.id = a.id

This version of the query, in my mind as least, should have the same results, but lo and behold, its responding immediately!

Any ideas why one would hang and the other process quickly? And yes, I did wait to make sure the index had been built before running both.

6
  • Have you looked at the execution plan for both? Do they differ? Did you issue a dbcc dropcleanbuffers between each method? Commented Jan 25, 2012 at 20:22
  • Roughly how many records are in remote.B? Commented Jan 25, 2012 at 20:22
  • The "remote" DB is on another server accessed via a linkserver and there are about 600 rows. Commented Jan 25, 2012 at 20:25
  • Please show both execution plans. Without that these are just guesses. Commented Jan 25, 2012 at 20:29
  • Are there more columns in table_b than just id and phone? Commented Jan 25, 2012 at 20:29

4 Answers 4

4

It's because sometimes(very often) execution plans automatically generated by sql server engine are not as good and obvious as we want to. You can look at execution plan in both situations. I suggest use hint in first query, something like that: INNER MERGE JOIN.

Here is some more information about that:

http://msdn.microsoft.com/en-us/library/ms181714.aspx

Sign up to request clarification or add additional context in comments.

5 Comments

This is a nice new thing I have learned today! I know some other places I can use this kind of merge.
I'm using merge hint to join view (created from 3 tables) and table. View already has executuion plan but even that engine creates new, much much slower. That is another case when you can find it useful.
MERGE isn't a magic "go faster" hint. If you aren't getting the right plan it is better to first understand why.
Of course. I agree with that. I looked at execution plan and I found that he was joining first one big table from view with table not from view and then rest which is completly odd.
I agree, I don't think you can use this everywhere either, but in my situation I knew that this would solve the problem as soon as i read what it would do (there were a lot of other tables I was not showing in examples that could change the execution plan greatly.
3

For linked servers 2nd variant prefetches all the data locally and do the join, since 1st variant may do inner loop join roundtrip to linked server for every row in A

5 Comments

@Limey for 1st variant look at Number of executions in Query plan for remote call - if it more than 1- then it does roundtripd for data
Checked the estimated execution plan; its no inner loop joins going on for that table.
@Liney - What is the Number of executions for remote table item? Or show us the whole query plan as XML
The number of executions for the remote table is 1
@Liney How about XML query plan?
1

Remote table as in not on that server? Is it possible that the join is actually making multiple calls out to the remote table while the subquery is making a single request for a copy of the table data, thus resulting in less time waiting on network?

6 Comments

~600 rows in the remote table. Find something to support the idea that the subquery is resulting in the entire table being cached and this gets my +1
The estimated execution plan shows it only accessing it once in both versions of the query. The bad one is estimated using 40% of the processing, while the 2nd one is 1%
Limey, what is the exact wording in the execution plan? I would be surprised if it actually indicates the number of calls that were made internal to the process.
when using a join, its estimate number of executions is 1
@Limey - What is the join type? Merge? Hash? Nested Loops? If you do a Diff of the XML plans what is different between the two?
|
1

I'm just going to have a guess here. When you access remote.b is it a table on another server?

If it is, the reason the second query is faster is because, you do one query to the other server and get all the fields you need from b, before processing the data. In the first query you are processing data and at the same time you are making several requests to the other server.

Hope this help you.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.