What is the fastest way to fetch data from two tables if one table is referenced from another one in multiple columns?

Consider a table of company names and a table of contracts. Each contract can have a client, an intermediary, and a contractor, in any combination. Each value may be null, and the same company may appear one, two, or three times in a single contract row.

The table definitions are:

CREATE TABLE company (id integer,name text);

CREATE TABLE contract (id integer, client integer, intermediary integer, contractor integer);

I've created a SQL fiddle with the test data below: https://www.db-fiddle.com/f/irCodeZjeEPWvhmRwMcHqT/0

Test data:

INSERT INTO company (id,name) VALUES (1,'Company 1');
INSERT INTO company (id,name) VALUES (2,'Company 2');
INSERT INTO company (id,name) VALUES (3,'Company 3');
INSERT INTO company (id,name) VALUES (4,'Company 4');
INSERT INTO company (id,name) VALUES (5,'Company 5');
INSERT INTO contract (id,client,intermediary,contractor) VALUES (1,NULL,NULL,NULL);
INSERT INTO contract (id,client,intermediary,contractor) VALUES (2,NULL,2,3);
INSERT INTO contract (id,client,intermediary,contractor) VALUES (3,1,NULL,NULL);
INSERT INTO contract (id,client,intermediary,contractor) VALUES (4,NULL,2,NULL);
INSERT INTO contract (id,client,intermediary,contractor) VALUES (5,1,2,3);
INSERT INTO contract (id,client,intermediary,contractor) VALUES (6,4,NULL,5);
INSERT INTO contract (id,client,intermediary,contractor) VALUES (7,1,NULL,1);
INSERT INTO contract (id,client,intermediary,contractor) VALUES (7,3,3,3);

Now, using PostgreSQL 9.6, a query is needed which returns the contract id with the name of each company involved. Pretty easy with subqueries:

SELECT
id, 
(SELECT name FROM company WHERE id = client) AS "clientName",
(SELECT name FROM company WHERE id = intermediary) AS "intermediaryName",
(SELECT name FROM company WHERE id = contractor) AS "contractorName"
FROM contract;

However, in the real world, with a much more complex query, we are running into performance problems. The question now is: is there a way to improve it? Would a JOIN be faster than subqueries? If yes, how would that even work?

Of course, you could do something like

SELECT * FROM contract LEFT JOIN company ON company.id = ANY(ARRAY[client,contractor,intermediary]);

but in this case, the information which company plays which role in the contract gets lost.
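For illustration, the role information can be kept with a single join to company by first unpivoting the three columns into (role, company_id) rows and then pivoting the names back. This is only a sketch, not a claim that it is faster: it assumes contract.id is unique (as a primary key would enforce), and the aliases r, role, and company_id are made up for this example.

```sql
-- Sketch: unpivot the three role columns, join company once, pivot back.
-- Assumes contract.id is unique; r, role and company_id are illustrative names.
SELECT co.id,
       max(c.name) FILTER (WHERE r.role = 'client')       AS "clientName",
       max(c.name) FILTER (WHERE r.role = 'intermediary') AS "intermediaryName",
       max(c.name) FILTER (WHERE r.role = 'contractor')   AS "contractorName"
FROM contract co
CROSS JOIN LATERAL (VALUES ('client',       co.client),
                           ('intermediary', co.intermediary),
                           ('contractor',   co.contractor)) AS r(role, company_id)
LEFT JOIN company c ON c.id = r.company_id   -- NULL ids simply find no match
GROUP BY co.id;
```

Each contract produces exactly three intermediate rows, one per role, so max() here only picks the single name (or NULL) belonging to that role.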

(Edit: In the real world, there are indexes, foreign key constraints, and so on. I've left all that out here for brevity.)

    My current generic comment re "better"/"best" etc: There's no such thing as "better"/"best" in engineering unless you define it. Also unfortunately all reasonable practical definitions require a ridiculous amount of experience with a ridiculous number of factors that interact with chaotic sensitivity to details. Make straightforward designs. When you demonstrate via measurement that a design and all alternatives you can think of have problems (whatever that means at the time), then ask a very specific question. Which should also define "better"/"best". meta.stackexchange.com/q/204461 Commented Oct 4, 2018 at 22:49

2 Answers

Your method is fine, although you should use table aliases:

SELECT id, 
       (SELECT c.name FROM company c WHERE c.id = co.client) AS "clientName",
       (SELECT c.name FROM company c WHERE c.id = co.intermediary) AS "intermediaryName",
       (SELECT c.name FROM company c WHERE c.id = co.contractor) AS "contractorName"
FROM contract co;

id should be the primary key in company -- or have an index built on it.

You can express this using left join as well:

SELECT co.id,
       cc.name AS "clientName",
       ci.name AS "intermediaryName",
       cco.name AS "contractorName"
FROM contract co LEFT JOIN
     company cc
     ON cc.id = co.client LEFT JOIN
     company ci
     ON ci.id = co.intermediary LEFT JOIN
     company cco
     ON cco.id = co.contractor;

The performance should be pretty similar between the two methods.
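Rather than guessing, the two forms can be measured on the real data. A minimal sketch, assuming the tables from the question; EXPLAIN (ANALYZE, BUFFERS) actually executes each statement and reports the chosen plan with timings, so the numbers depend entirely on your data and indexes:

```sql
-- Variant 1: correlated scalar subqueries
EXPLAIN (ANALYZE, BUFFERS)
SELECT co.id,
       (SELECT c.name FROM company c WHERE c.id = co.client)       AS "clientName",
       (SELECT c.name FROM company c WHERE c.id = co.intermediary) AS "intermediaryName",
       (SELECT c.name FROM company c WHERE c.id = co.contractor)   AS "contractorName"
FROM contract co;

-- Variant 2: three LEFT JOINs against the same table
EXPLAIN (ANALYZE, BUFFERS)
SELECT co.id,
       cc.name  AS "clientName",
       ci.name  AS "intermediaryName",
       cco.name AS "contractorName"
FROM contract co
LEFT JOIN company cc  ON cc.id  = co.client
LEFT JOIN company ci  ON ci.id  = co.intermediary
LEFT JOIN company cco ON cco.id = co.contractor;
```

Run each statement a few times so that caching effects even out before comparing the reported execution times.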

Gordon's solution looks fine to me (especially the second one, with the outer joins).

Did you add foreign keys and indexes on the columns Client, Intermediary and Contractor in table Contract?
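As a rough sketch of what that could look like for the tables in the question (the index names are invented for this example, and the question says the real schema already has equivalents):

```sql
-- The name lookups above go through company.id, so that is the key index.
ALTER TABLE company ADD PRIMARY KEY (id);

-- Foreign keys from each role column to company.
ALTER TABLE contract
    ADD FOREIGN KEY (client)       REFERENCES company (id),
    ADD FOREIGN KEY (intermediary) REFERENCES company (id),
    ADD FOREIGN KEY (contractor)   REFERENCES company (id);

-- PostgreSQL does not create indexes on the referencing columns by itself.
-- Hypothetical index names; these mainly help lookups starting from a company.
CREATE INDEX contract_client_idx       ON contract (client);
CREATE INDEX contract_intermediary_idx ON contract (intermediary);
CREATE INDEX contract_contractor_idx   ON contract (contractor);
```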

1 Comment

Yes, have indexes and foreign keys. Just left that aside here.
