Query optimization - join algorithms

Question

I'm having trouble understanding the different join algorithms (nested-loop join, merge join, index join, hash join, and other variations) and how/when to use them. More specifically, I've been asked to draw a query tree for the most efficient execution of the following query:

SELECT E.Name
FROM Employee E, Department D, Works_On W, Project P
WHERE E.DNO = D.DNO and E.SSN = W.ESSN and P.PNUM = W.PNUM and
    P.Budget > 50 and E.Sex = 'M' and E.Hobby = 'Yodeling' and
    D.DName = 'Rational Mechanics';

I can provide the schema if it's needed; basically, the four tables are

Employee (SSN, Name, DNO - Department number, Salary, Sex),
Department (DNO, DName, Budget, Location, MGRSSN),
Works_On (ESSN, PNum - Project number),
Project (PNum, PName, Budget, Location, Goal).

I've drawn a left-deep join tree, but I don't know which algorithm to use for each join. If I could get an explanation of when to use each algorithm or a pointer to a resource that explains it, that would be very helpful.

Edit: I'm not asking how to specify different joins in SQL, only about the join algorithms in general. Also, I was not told that any of the tables were indexed, but I was told that I can create an index just to do an index join. I was also given statistics to build the query tree heuristically, which I used to decide the structure of the tree.

I don't entirely understand your question. Do you just want to understand the different joining methods, or do you want to learn how to force a given SQL engine to use a better/more efficient joining method, or do you have to write your own query engine? BTW, you should add the available indexes to your schema, because it influences greatly the final result. Or maybe I'm a helicopter.
– biziclop, Commented Dec 10, 2012 at 19:50
msdn is a good resource for this info. technet.microsoft.com/en-us/library/ms191426(v=sql.105).aspx
– Zeph, Commented Dec 10, 2012 at 20:43
You could start your learning of joins by using proper join syntax, instead of having implicit joins, with the conditions in the where clause.
– Gordon Linoff, Commented Dec 10, 2012 at 20:44
Thanks Zeph - that was what I needed. I have low rep, but I'm going to close this when I can.
– Kristine S, Commented Dec 10, 2012 at 23:57

Codo · Accepted Answer · 2012-12-10 20:23:03Z

3

Generally, you want to leave it to your database system to pick the best query execution plan and thus the best join algorithms. The choice depends on the available indices and on statistical data about the tables (such as how many rows they contain and how many distinct values each column contains). Furthermore, the join algorithms also depend on database-system-specific factors, such as which algorithms are implemented and whether the data is stored in clustered or index-organized tables.
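To make the algorithms themselves a bit more concrete, here is a minimal in-memory sketch of three of them in Python. It is only an illustration, not how any particular database implements them: the function names and the sample rows are made up, rows are plain dicts, and everything is assumed to fit in memory.

```python
from collections import defaultdict

def nested_loop_join(left, right, lkey, rkey):
    """Compare every left row with every right row: |left| * |right| comparisons."""
    return [(l, r) for l in left for r in right if l[lkey] == r[rkey]]

def hash_join(left, right, lkey, rkey):
    """Build a hash table on the left input, then probe it once per right row.
    Works for equality conditions only."""
    table = defaultdict(list)
    for l in left:
        table[l[lkey]].append(l)
    return [(l, r) for r in right for l in table.get(r[rkey], [])]

def merge_join(left, right, lkey, rkey):
    """Sort both inputs on the join key, then advance two cursors in step."""
    left = sorted(left, key=lambda row: row[lkey])
    right = sorted(right, key=lambda row: row[rkey])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lv, rv = left[i][lkey], right[j][rkey]
        if lv < rv:
            i += 1
        elif lv > rv:
            j += 1
        else:
            # Pair the current left row with the whole run of equal right keys.
            # j stays at the start of the run so the next left row can reuse it.
            k = j
            while k < len(right) and right[k][rkey] == lv:
                out.append((left[i], right[k]))
                k += 1
            i += 1
    return out

# Hypothetical sample rows, loosely following the Employee/Department schema.
employees = [
    {"SSN": 1, "Name": "Ann", "DNO": 10},
    {"SSN": 2, "Name": "Bob", "DNO": 20},
    {"SSN": 3, "Name": "Cid", "DNO": 10},
    {"SSN": 4, "Name": "Dee", "DNO": 30},
]
departments = [
    {"DNO": 10, "DName": "Rational Mechanics"},
    {"DNO": 20, "DName": "Optics"},
    {"DNO": 40, "DName": "Acoustics"},
]

for join in (nested_loop_join, hash_join, merge_join):
    pairs = join(employees, departments, "DNO", "DNO")
    print(join.__name__, sorted((e["Name"], d["DName"]) for e, d in pairs))
```

All three return the same matches, possibly in a different order. What differs is the work they do to get there: the nested loop needs no preparation but compares every pair, the hash join needs memory for the hash table, and the merge join is cheap when the inputs are already sorted on the join key (for example, when they are read through an index).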

Only if your query is unusually slow or has been identified as an application's bottleneck should you try to influence the execution plan.

Based on the available information, it's not possible to tell the best execution plan.
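As a rough illustration of why the statistics matter, here is a sketch using the usual textbook I/O cost estimates for a block nested-loop join and a Grace hash join. The page counts and buffer sizes are invented, the formulas ignore the cost of writing the result, and the hash join estimate assumes there is enough memory for every partition to fit. A real optimizer's cost model is considerably more detailed.

```python
import math

def block_nested_loop_cost(outer_pages, inner_pages, buffer_pages):
    """Page reads: scan the outer input once, and scan the inner input
    once per chunk of the outer that fits in the buffer."""
    chunk = buffer_pages - 2  # one page reserved for the inner input, one for output
    return outer_pages + math.ceil(outer_pages / chunk) * inner_pages

def grace_hash_join_cost(outer_pages, inner_pages):
    """Page I/Os: read and write both inputs to partition them,
    then read both again to join the partitions."""
    return 3 * (outer_pages + inner_pages)

# Hypothetical sizes: a 1000-page table joined with a 500-page table.
for buffers in (102, 1002):
    print(
        "buffer pages:", buffers,
        "| block nested loop:", block_nested_loop_cost(1000, 500, buffers),
        "| hash join:", grace_hash_join_cost(1000, 500),
    )
```

With these made-up numbers the hash join is cheaper when only about 100 buffer pages are available, but the block nested-loop join wins once the whole outer table fits in memory. Without the table sizes and the memory configuration, there is no way to say which one is better.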


1 Comment

Gordon Linoff Over a year ago

It also depends on the hardware (number of processors and memory configuration) as well as information about the data.

Kristine S · Accepted Answer · 2012-12-11 04:08:33Z

1

Zeph's comment, "msdn is a good resource for this info. technet.microsoft.com/en-us/library/ms191426(v=sql.105).aspx", is exactly what I needed. Thanks, Zeph!
