PostgreSQL: JOIN if a table is referenced more than once

Question

What is the fastest way to fetch data from two tables if one table is referenced from another one in multiple columns?

Consider a table with company names and a table with contracts. Each contract can have a client, an intermediary, and a contractor, in any combination. Each value may be null, and the same company may appear once, twice, or three times in a single contract row.

The table definitions are:

CREATE TABLE company (id integer,name text);

CREATE TABLE contract (id integer, client integer, intermediary integer, contractor integer);

I've created a SQL fiddle with the test data below: https://www.db-fiddle.com/f/irCodeZjeEPWvhmRwMcHqT/0

Test data:

INSERT INTO company (id,name) VALUES (1,'Company 1');
INSERT INTO company (id,name) VALUES (2,'Company 2');
INSERT INTO company (id,name) VALUES (3,'Company 3');
INSERT INTO company (id,name) VALUES (4,'Company 4');
INSERT INTO company (id,name) VALUES (5,'Company 5');
INSERT INTO contract (id,client,intermediary,contractor) VALUES (1,NULL,NULL,NULL);
INSERT INTO contract (id,client,intermediary,contractor) VALUES (2,NULL,2,3);
INSERT INTO contract (id,client,intermediary,contractor) VALUES (3,1,NULL,NULL);
INSERT INTO contract (id,client,intermediary,contractor) VALUES (4,NULL,2,NULL);
INSERT INTO contract (id,client,intermediary,contractor) VALUES (5,1,2,3);
INSERT INTO contract (id,client,intermediary,contractor) VALUES (6,4,NULL,5);
INSERT INTO contract (id,client,intermediary,contractor) VALUES (7,1,NULL,1);
INSERT INTO contract (id,client,intermediary,contractor) VALUES (7,3,3,3);

Now, using PostgreSQL 9.6, a query is needed which returns the contract id with the name of each company involved. Pretty easy with subqueries:

SELECT
id, 
(SELECT name FROM company WHERE id = client) AS "clientName",
(SELECT name FROM company WHERE id = intermediary) AS "intermediaryName",
(SELECT name FROM company WHERE id = contractor) AS "contractorName"
FROM contract;

However, in the real world, with a much more complex query, we are running into performance problems here. So the question is: is there a way to improve it? Would a JOIN be faster than subqueries? If yes, how would that even work?

Of course, you could do something like

SELECT * FROM contract LEFT JOIN company ON company.id = ANY(ARRAY[client,contractor,intermediary]);

but in this case, the information which company plays which role in the contract gets lost.
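
One way to keep the role while still joining company only once is to unpivot the three columns first. This is a minimal sketch, not a measured improvement: it assumes PostgreSQL 9.3 or later (for LATERAL) and uses scratch TEMP copies of the two tables with a few of the test rows, so it should be run in a fresh session. The aliases role_name and company_id are made up for this example. Note that it returns one row per contract and role, not one row per contract, so the result shape differs from the subquery version.

```sql
-- Scratch copies of the question's tables; TEMP tables vanish at session end.
CREATE TEMP TABLE company (id integer, name text);
CREATE TEMP TABLE contract (id integer, client integer, intermediary integer, contractor integer);

INSERT INTO company (id, name)
VALUES (1, 'Company 1'), (2, 'Company 2'), (3, 'Company 3');
INSERT INTO contract (id, client, intermediary, contractor)
VALUES (2, NULL, 2, 3), (5, 1, 2, 3);

-- Turn the three role columns into (role_name, company_id) pairs,
-- then join company once. The role label travels with each row.
SELECT co.id, r.role_name, c.name
FROM contract co
CROSS JOIN LATERAL (
    VALUES ('client',       co.client),
           ('intermediary', co.intermediary),
           ('contractor',   co.contractor)
) AS r (role_name, company_id)
LEFT JOIN company c ON c.id = r.company_id
ORDER BY co.id, r.role_name;
```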

(Edit: In the real world, there are indexes, foreign key constraints, and so on. I've left all of that out here for brevity.)

My current generic comment re "better"/"best" etc: There's no such thing as "better"/"best" in engineering unless you define it. Unfortunately, all reasonable practical definitions also require a ridiculous amount of experience with a ridiculous number of factors that interact with chaotic sensitivity to details. Make straightforward designs. When you demonstrate via measurement that a design and all alternatives you can think of have problems (whatever that means at the time), then ask a very specific question, which should also define "better"/"best". meta.stackexchange.com/q/204461
– philipxy, Commented Oct 4, 2018 at 22:49

Gordon Linoff · Accepted Answer · 2018-10-04 13:44:51Z

Your method is fine, although you should use table aliases:

SELECT co.id,
       (SELECT c.name FROM company c WHERE c.id = co.client) AS "clientName",
       (SELECT c.name FROM company c WHERE c.id = co.intermediary) AS "intermediaryName",
       (SELECT c.name FROM company c WHERE c.id = co.contractor) AS "contractorName"
FROM contract co;

id should be the primary key in company -- or have an index built on it.
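
As a rough illustration of that advice, the sketch below declares the key on a scratch TEMP copy of the company table (run it in a fresh session; on the real table only the ALTER TABLE statement would be needed). The index name in the commented-out alternative is made up for this example.

```sql
-- Scratch copy of the question's company table; TEMP so nothing permanent is touched.
CREATE TEMP TABLE company (id integer, name text);

-- Option 1: declare id as the primary key.
-- PostgreSQL creates a unique index to enforce it, which the lookups can use.
ALTER TABLE company ADD PRIMARY KEY (id);

-- Option 2 (alternative, if id cannot be a primary key): a plain index.
-- CREATE INDEX company_id_idx ON company (id);
```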

You can express this using left join as well:

SELECT co.id,
       cc.name  AS "clientName",
       ci.name  AS "intermediaryName",
       cco.name AS "contractorName"
FROM contract co
LEFT JOIN company cc  ON cc.id  = co.client
LEFT JOIN company ci  ON ci.id  = co.intermediary
LEFT JOIN company cco ON cco.id = co.contractor;

The performance should be pretty similar between the two methods.
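
Rather than take that on faith, it can be checked with EXPLAIN (ANALYZE, BUFFERS), which runs the statement and reports the actual plan and timings. The sketch below is only an assumption-laden test bed: it builds scratch TEMP tables with the question's schema and fills them with synthetic rows invented for this example (run it in a fresh session). Timings on real data, with the real, more complex query, may look quite different.

```sql
-- Scratch tables mirroring the question's schema; TEMP tables vanish at session end.
CREATE TEMP TABLE company (id integer PRIMARY KEY, name text);
CREATE TEMP TABLE contract (id integer, client integer, intermediary integer, contractor integer);

-- Synthetic rows, invented purely so the planner has something to work with.
INSERT INTO company (id, name)
SELECT g, 'Company ' || g
FROM generate_series(1, 1000) AS g;

INSERT INTO contract (id, client, intermediary, contractor)
SELECT g, 1 + g % 1000, NULLIF(g % 7, 0), 1 + (g * 31) % 1000
FROM generate_series(1, 100000) AS g;

-- Refresh planner statistics for the new rows.
ANALYZE company;
ANALYZE contract;

-- Variant 1: correlated scalar subqueries.
EXPLAIN (ANALYZE, BUFFERS)
SELECT co.id,
       (SELECT c.name FROM company c WHERE c.id = co.client)       AS "clientName",
       (SELECT c.name FROM company c WHERE c.id = co.intermediary) AS "intermediaryName",
       (SELECT c.name FROM company c WHERE c.id = co.contractor)   AS "contractorName"
FROM contract co;

-- Variant 2: one LEFT JOIN per role.
EXPLAIN (ANALYZE, BUFFERS)
SELECT co.id,
       cc.name  AS "clientName",
       ci.name  AS "intermediaryName",
       cco.name AS "contractorName"
FROM contract co
LEFT JOIN company cc  ON cc.id  = co.client
LEFT JOIN company ci  ON ci.id  = co.intermediary
LEFT JOIN company cco ON cco.id = co.contractor;
```

Comparing the "Execution time" lines of the two plans over several runs gives a first impression; the plan nodes show whether the planner picked per-row index lookups or hash joins.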

johey · Answer · 2018-10-04 13:55:00Z

Gordon's solution looks fine to me (especially the second one, with the outer joins).

Did you add foreign keys and indexes on the columns client, intermediary, and contractor in the contract table?
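
For reference, such constraints and indexes could look roughly like the sketch below. It works on scratch TEMP copies of the tables (run it in a fresh session), and all constraint and index names are made up for this example. PostgreSQL does not create indexes on the referencing columns automatically, so they are added separately; they mainly help lookups from company to contract and foreign key checks, while the queries above look up company by its id.

```sql
-- Scratch copies of the question's tables; TEMP tables vanish at session end.
-- A foreign key needs a primary key or unique constraint on the referenced column.
CREATE TEMP TABLE company (id integer PRIMARY KEY, name text);
CREATE TEMP TABLE contract (id integer, client integer, intermediary integer, contractor integer);

-- One foreign key per role column; NULL values are still allowed.
ALTER TABLE contract
    ADD CONSTRAINT contract_client_fk       FOREIGN KEY (client)       REFERENCES company (id),
    ADD CONSTRAINT contract_intermediary_fk FOREIGN KEY (intermediary) REFERENCES company (id),
    ADD CONSTRAINT contract_contractor_fk   FOREIGN KEY (contractor)   REFERENCES company (id);

-- Indexes on the referencing columns are not created implicitly.
CREATE INDEX contract_client_idx       ON contract (client);
CREATE INDEX contract_intermediary_idx ON contract (intermediary);
CREATE INDEX contract_contractor_idx   ON contract (contractor);
```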

Yes, we have indexes and foreign keys. I just left them out here.
– cis, Commented over a year ago
