
I'm having trouble understanding the different join algorithms (nested-loop join, merge join, index join, hash join, and other variations) and how/when to use them. More specifically, I've been asked to draw a query tree for the most efficient execution of the following query:

SELECT E.Name
FROM Employee E, Department D, Works_On W, Project P
WHERE E.DNO = D.DNO and E.SSN = W.ESSN and P.PNUM = W.PNUM and
    P.Budget > 50 and E.Sex = 'M' and E.Hobby = 'Yodeling' and
    D.DName = 'Rational Mechanics';

I can provide the schema if it's needed; basically, the four tables are

Employee (SSN, Name, DNO - Department number, Salary, Sex),
Department (DNO, DName, Budget, Location, MGRSSN),
Works_On (ESSN, PNum - Project number),
Project (PNum, PName, Budget, Location, Goal).

I've drawn a left deep join tree and I don't know which algorithm to use for each join. If I could get an explanation of when to use each algorithm or a pointer to a resource that explains it, that would be very helpful.

Edit: I'm not asking how to specify different joins in SQL, only about the join algorithms in general. Also, I was not told that any of the tables were indexed, but I was told that I can create an index in order to use an index join. I was also given statistics for building the query tree heuristically, and I used them to decide the structure of the tree.

  • I don't entirely understand your question. Do you just want to understand the different join methods, do you want to learn how to force a given SQL engine to use a more efficient join method, or do you have to write your own query engine? BTW, you should add the available indexes to your schema, because they greatly influence the final result. Or maybe I'm a helicopter. Commented Dec 10, 2012 at 19:50
  • MSDN is a good resource for this info: technet.microsoft.com/en-us/library/ms191426(v=sql.105).aspx Commented Dec 10, 2012 at 20:43
  • You could start your learning of joins by using proper join syntax instead of implicit joins with the conditions in the WHERE clause. Commented Dec 10, 2012 at 20:44
  • Thanks Zeph - that was what I needed. I have low rep, but I'm going to close this when I can. Commented Dec 10, 2012 at 23:57

2 Answers


Generally, you want to leave it to your database system to pick the best query execution plan and thus the best join algorithms. The choice depends on the available indices and on statistical data about the tables (such as how many rows they contain and how many distinct values each column contains). Furthermore, it depends on database-system-specific things such as which algorithms are implemented and whether the data is stored in clustered or index-organized tables.
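To make the algorithm names concrete, here is a minimal Python sketch of three of them on in-memory rows. It is only an illustration under simplifying assumptions (everything fits in memory, equality join on a single key); the function names and the sample rows are made up for this example and are not taken from any particular database engine. An index join is essentially the nested-loop variant where the inner scan is replaced by an index lookup, which is similar in spirit to probing the hash table below.

```python
from collections import defaultdict


def nested_loop_join(left, right, key_left, key_right):
    """Compare every left row with every right row: O(len(left) * len(right))."""
    return [(l, r) for l in left for r in right if l[key_left] == r[key_right]]


def hash_join(left, right, key_left, key_right):
    """Build a hash table on one input (ideally the smaller one), then probe it."""
    table = defaultdict(list)
    for l in left:
        table[l[key_left]].append(l)
    return [(l, r) for r in right for l in table.get(r[key_right], [])]


def merge_join(left, right, key_left, key_right):
    """Sort both inputs on the join key, then advance two cursors in step."""
    left = sorted(left, key=lambda row: row[key_left])
    right = sorted(right, key=lambda row: row[key_right])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key_left], right[j][key_right]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key, then pair it
            # with every left row that has the same key.
            j_end = j
            while j_end < len(right) and right[j_end][key_right] == lk:
                j_end += 1
            while i < len(left) and left[i][key_left] == lk:
                out.extend((left[i], r) for r in right[j:j_end])
                i += 1
            j = j_end
    return out


# Hypothetical sample rows, loosely modelled on the question's tables.
employees = [
    {"SSN": 1, "Name": "Ann", "DNO": 10},
    {"SSN": 2, "Name": "Bob", "DNO": 20},
    {"SSN": 3, "Name": "Cid", "DNO": 10},
]
departments = [
    {"DNO": 10, "DName": "Rational Mechanics"},
    {"DNO": 30, "DName": "Accounting"},
]

for join in (nested_loop_join, hash_join, merge_join):
    pairs = join(employees, departments, "DNO", "DNO")
    print(join.__name__, sorted(e["Name"] for e, d in pairs))
```

All three return the same matching pairs, only in a different order and at a different cost: the nested loop needs no preparation but compares every pair, the hash join pays for building a table once, and the merge join pays for sorting unless the inputs already arrive sorted (for example from an index).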

Only if your query is unusually slow or has been identified as an application bottleneck should you try to influence the execution plan.

Based on the available information, it's not possible to tell the best execution plan.
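Once real tables and data are loaded, the engine itself can report the plan it picks. The following is a rough sketch using Python's built-in sqlite3 module and the query from the question. SQLite is only a stand-in here, since the question does not name an engine; the Hobby column is an assumption (the query references it, but the listed schema does not include it), and the sample rows are invented. The plan text differs between engines and versions, so no particular output is claimed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (SSN INTEGER PRIMARY KEY, Name TEXT, DNO INTEGER,
                           Salary REAL, Sex TEXT, Hobby TEXT);  -- Hobby is assumed
    CREATE TABLE Department (DNO INTEGER PRIMARY KEY, DName TEXT, Budget REAL,
                             Location TEXT, MGRSSN INTEGER);
    CREATE TABLE Works_On (ESSN INTEGER, PNum INTEGER);
    CREATE TABLE Project (PNum INTEGER PRIMARY KEY, PName TEXT, Budget REAL,
                          Location TEXT, Goal TEXT);

    -- Invented sample rows, just enough to make the query return something.
    INSERT INTO Employee VALUES (1, 'Hans', 10, 100, 'M', 'Yodeling'),
                                (2, 'Greta', 10, 120, 'F', 'Chess');
    INSERT INTO Department VALUES (10, 'Rational Mechanics', 500, 'Basel', 1);
    INSERT INTO Works_On VALUES (1, 7), (2, 7);
    INSERT INTO Project VALUES (7, 'Alpine Acoustics', 80, 'Bern', 'Echo study');
""")

query = """
    SELECT E.Name
    FROM Employee E, Department D, Works_On W, Project P
    WHERE E.DNO = D.DNO AND E.SSN = W.ESSN AND P.PNUM = W.PNUM AND
          P.Budget > 50 AND E.Sex = 'M' AND E.Hobby = 'Yodeling' AND
          D.DName = 'Rational Mechanics'
"""

rows = conn.execute(query).fetchall()
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

print(rows)
for step in plan:
    # The last column of each plan row holds the human-readable description.
    print(step[-1])
```

Other systems expose the same idea under different commands, so the point is the workflow (load data, ask for the plan, then decide whether anything needs tuning) rather than SQLite's specific output.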


1 Comment

It also depends on the hardware (number of processors and memory configuration) as well as information about the data.

"msdn is a good resource for this info. technet.microsoft.com/en-us/library/ms191426(v=sql.105).aspx"

is exactly what I needed. Thanks, Zeph!
