Let's assume I have the following tables:

drop table if exists city; 
drop table if exists country; 
drop table if exists world; 

create table world (
  id integer PRIMARY KEY, 
  name text not null, 
  population bigint not null
);
create table country (
  id integer PRIMARY KEY,
  world_id integer REFERENCES world (id),  
  name text not null, 
  population bigint not null
);

create table city (
  id integer PRIMARY KEY, 
  country_id integer REFERENCES country (id), 
  name text not null, 
  population bigint not null
);

insert into world (id, name, population) values (1, 'World', 7125000000);
insert into country (id, world_id, name, population) values (2, 1, 'Austria', 8000000); 
insert into country (id, world_id, name, population) values (3, 1, 'Poland', 38530000); 
insert into city (id, country_id, name, population) values (4, 2, 'Vienna',  1741000);
insert into city (id, country_id, name, population) values (5, 2, 'Salzburg',  145000);
insert into city (id, country_id, name, population) values (6, 3, 'Warsaw',  1710000);
insert into city (id, country_id, name, population) values (7, 3, 'Stetin',  409000);

So basically this is a very simplified view of the world, using three tables that reference each other. To get all the information I need, I could simply execute a query like

select 
    w.name, 
    c.name, 
    ci.name
from 
    world w
    left outer join country c on (w.id = c.world_id)
    left outer join city ci on (c.id = ci.country_id)

which gives me back the necessary data:

 name  |  name   |   name   
-------+---------+----------
 World | Austria | Vienna
 World | Austria | Salzburg
 World | Poland  | Warsaw
 World | Poland  | Stetin

For a small example like this, the data duplication in the result set might be okay. In my real-world case, though, the result set is much bigger (200k rows) and there are many more columns to query. As the whole structure resembles a tree, I am wondering whether it is possible to build a result set that looks more like this:

 id    | parent   |  name    | population 
-------+----------+----------+------------
 1     | null     | World    | 7125000000
 2     | 1        | Austria  |    8000000
 3     | 1        | Poland   |   38530000
 4     | 2        | Vienna   |    1741000
 5     | 2        | Salzburg |     145000
 6     | 3        | Warsaw   |    1710000
 7     | 3        | Stetin   |     409000

All ids are unique across the different tables. I have read that with-queries can be used to build such a tree result set when there is only one table, but so far I have not found anything on how to achieve this when multiple tables are involved.

1 Answer


You first need to create a "unified" view over all three tables, e.g. like this:

select id, null, name, population 
from world
union all 
select id, world_id, name, population 
from country
union all
select id, country_id, name, population
from city

This can then be used to build a recursive common table expression that walks the tree:

with recursive regions (id, parent_id, name, population) as (
     select id, null, name, population 
     from world
     union all 
     select id, world_id, name, population 
     from country
     union all
     select id, country_id, name, population
     from city
), region_tree as (
   select id, parent_id, name, population 
   from regions
   where parent_id is null
   union all
   select c.id, c.parent_id, c.name, c.population
   from regions c
     join region_tree p on p.id = c.parent_id
)
select * 
from region_tree;
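
If the rows should come back in tree order, one possible variation is to carry a depth and a path along while walking the tree. This is only a sketch on top of the query above: the `depth` and `path` columns are names I made up for illustration, and the array syntax assumes PostgreSQL.

```sql
with recursive regions (id, parent_id, name, population) as (
     -- same unified view over the three tables as above
     select id, null, name, population
     from world
     union all
     select id, world_id, name, population
     from country
     union all
     select id, country_id, name, population
     from city
), region_tree as (
   -- root rows: depth 1, path contains only the row's own id
   select id, parent_id, name, population,
          1 as depth,           -- hypothetical helper column
          array[id] as path     -- hypothetical helper column
   from regions
   where parent_id is null
   union all
   -- children: one level deeper, own id appended to the parent's path
   select c.id, c.parent_id, c.name, c.population,
          p.depth + 1,
          p.path || c.id
   from regions c
     join region_tree p on p.id = c.parent_id
)
select id, parent_id, name, population, depth
from region_tree
order by path;
```

Ordering by `path` puts each parent directly before its children, and `depth` can be used for indentation on the client side.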

If you need this kind of query a lot, it might be better to store everything in a single table with a column indicating what kind of "region" the row is (world, country, city). That would also make introducing new "region types" easier (county, state, etc.), e.g.:

create table region
(
  id integer primary key, 
  parent_id integer references region,
  name text, 
  population bigint,
  type text not null check (type in ('world','country','city'))
);
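
As a rough sketch of how the existing rows could be copied into such a table, the same union can feed a single insert. This assumes `population` in `region` is declared as `bigint`, because the sample world population (7125000000) does not fit into a 4-byte `integer`, and it relies on the ids being unique across the three tables as stated in the question.

```sql
-- Copy all three source tables into region in one statement.
-- Assumption: region.population is bigint (the world row overflows integer).
insert into region (id, parent_id, name, population, type)
select id, null, name, population, 'world'
from world
union all
select id, world_id, name, population, 'country'
from country
union all
select id, country_id, name, population, 'city'
from city;

-- With a single table the intermediate "regions" CTE is no longer needed.
with recursive region_tree as (
   select id, parent_id, name, population, type
   from region
   where parent_id is null
   union all
   select c.id, c.parent_id, c.name, c.population, c.type
   from region c
     join region_tree p on p.id = c.parent_id
)
select *
from region_tree;
```

Because everything is inserted in one statement, the self-referencing foreign key is only checked once all rows are in place, so the order of the union branches should not matter.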

6 Comments

Are these tables still "connected" via foreign key and is it for example possible to query only a subtree as well (let's say everything that hangs under Poland)?
I also thought about having a single table with a discriminator but regretfully I cannot make such a schema change on our prod system easily.
@u6f6o: to query just a subtree, you can change the condition on the non-recursive part of the second CTE, e.g. where name = 'Poland' instead of where parent_id is null
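
Spelled out, the subtree variant described in this comment might look like the following sketch. It reuses the CTE names from the answer, and only the condition in the non-recursive part of `region_tree` changes.

```sql
with recursive regions (id, parent_id, name, population) as (
     select id, null, name, population
     from world
     union all
     select id, world_id, name, population
     from country
     union all
     select id, country_id, name, population
     from city
), region_tree as (
   -- start the walk at Poland instead of at the root
   select id, parent_id, name, population
   from regions
   where name = 'Poland'
   union all
   select c.id, c.parent_id, c.name, c.population
   from regions c
     join region_tree p on p.id = c.parent_id
)
select *
from region_tree;
```

With the sample data from the question this returns Poland together with Warsaw and Stetin.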
Two last questions, if you don't mind: do I have to manually delete the view again in order to make it go away? Is this approach a cheap or a not-so-cheap operation performance-wise?
I realized that the first part (the union all) is quite fast: it took about 100 ms for a subset of 250k rows. Adding with recursive, though, degraded the performance to 4500 ms.