How to query a tree result set in postgres including multiple tables

Question

Let's assume I have the following tables:

drop table if exists city; 
drop table if exists country; 
drop table if exists world; 

create table world (
  id integer PRIMARY KEY, 
  name text not null, 
  population bigint not null
)
;
create table country (
  id integer PRIMARY KEY,
  world_id integer REFERENCES world (id),  
  name text not null, 
  population bigint not null
);

create table city (
  id integer PRIMARY KEY, 
  country_id integer REFERENCES country (id), 
  name text not null, 
  population bigint not null
);

insert into world (id, name, population) values (1, 'World', 7125000000);
insert into country (id, world_id, name, population) values (2, 1, 'Austria', 8000000); 
insert into country (id, world_id, name, population) values (3, 1, 'Poland', 38530000); 
insert into city (id, country_id, name, population) values (4, 2, 'Vienna',  1741000);
insert into city (id, country_id, name, population) values (5, 2, 'Salzburg',  145000);
insert into city (id, country_id, name, population) values (6, 3, 'Warsaw',  1710000);
insert into city (id, country_id, name, population) values (7, 3, 'Stetin',  409000);

So basically this is a very simplified model of the world, using three tables that reference each other through foreign keys. To get all the information I need, I could simply execute a query like

select 
    w.name, 
    c.name, 
    ci.name
from 
    world w
    left outer join country c on (w.id = c.world_id)
    left outer join city ci on (c.id = ci.country_id)

which gives me back the necessary data:

 name  |  name   |   name   
-------+---------+----------
 World | Austria | Vienna
 World | Austria | Salzburg
 World | Poland  | Warsaw
 World | Poland  | Stetin

For a small example like this, the data duplication in the result set might be okay. In my real-world case, though, the result set is much bigger (200k rows) and there are many more columns to query. As the whole structure resembles a tree, I am wondering whether it is possible to build a result set that looks more like this:

 id    | parent   |  name    | population 
-------+----------+----------+------------
 1     | null     | World    | 7125000000
 2     | 1        | Austria  |    8000000
 3     | 1        | Poland   |   38530000
 4     | 2        | Vienna   |    1741000
 5     | 2        | Salzburg |     145000
 6     | 3        | Warsaw   |    1710000
 7     | 3        | Stetin   |     409000

All ids are unique across the different tables. I have read that with a single table one can use WITH queries to build such a tree result set, but so far I have not found anything on how to achieve this when multiple tables are involved.

user330315 · Accepted Answer · 2016-04-14 21:04:25Z

You first need to create a "unified" view over all three tables, e.g. like this:

select id, null, name, population 
from world
union all 
select id, world_id, name, population 
from country
union all
select id, country_id, name, population
from city

This can then be used to build a recursive common table expression that walks the tree:

with recursive regions (id, parent_id, name, population) as (
     select id, null, name, population 
     from world
     union all 
     select id, world_id, name, population 
     from country
     union all
     select id, country_id, name, population
     from city
), region_tree as (
   select id, parent_id, name, population 
   from regions
   where parent_id is null
   union all
   select c.id, c.parent_id, c.name, c.population
   from regions c
     join region_tree p on p.id = c.parent_id
)
select * 
from region_tree;

If you need this kind of query a lot, it might be better to store everything in a single table with a column indicating what kind of "region" the row is (world, country, city). That would also make introducing new "region types" easier (counties, states, etc.), e.g.:

create table region
(
  id integer primary key, 
  parent_id integer references region,
  name text, 
  population bigint,
  type text not null check (type in ('world','country','city'))
);
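
As a rough sketch of how the tree query might look against such a single table (assuming the region table above has been created and populated; the selected columns are only illustrative), the union step is no longer needed and the recursive part reads directly from region:

```sql
-- Sketch only: assumes the single "region" table defined above exists
-- and holds the world, country and city rows.
with recursive region_tree as (
   -- anchor: the root row(s), i.e. rows without a parent
   select id, parent_id, name, population, type
   from region
   where parent_id is null
   union all
   -- recursive step: attach every row whose parent is already in the tree
   select c.id, c.parent_id, c.name, c.population, c.type
   from region c
     join region_tree p on p.id = c.parent_id
)
select *
from region_tree;
```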

6 Comments

u6f6o Over a year ago

Are these tables still "connected" via foreign key and is it for example possible to query only a subtree as well (let's say everything that hangs under Poland)?

u6f6o Over a year ago

I also thought about having a single table with a discriminator but regretfully I cannot make such a schema change on our prod system easily.

user330315 Over a year ago

@u6f6o: to query just a subtree, you can change the condition on the non-recursive part of the second CTE, e.g. where name = 'Poland' instead of where parent_id is null
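
A sketch of that change, reusing the CTE from the answer with only the starting condition replaced (the value 'Poland' is taken from the sample data in the question):

```sql
with recursive regions (id, parent_id, name, population) as (
     select id, null, name, population
     from world
     union all
     select id, world_id, name, population
     from country
     union all
     select id, country_id, name, population
     from city
), region_tree as (
   -- start the walk at Poland instead of at the root
   select id, parent_id, name, population
   from regions
   where name = 'Poland'
   union all
   -- then collect everything below the rows found so far
   select c.id, c.parent_id, c.name, c.population
   from regions c
     join region_tree p on p.id = c.parent_id
)
select *
from region_tree;
```

With the sample data this should return the rows for Poland, Warsaw and Stetin.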

u6f6o Over a year ago

Two last questions, if you don't mind: do I have to manually delete the view again to make it go away? And is this approach a cheap or a not-so-cheap operation performance-wise?

u6f6o Over a year ago

I realized that the first part (the union all) is quite fast: it took about 100 ms for a subset of 250k rows. Adding with recursive, though, degraded the performance to 4500 ms.
