Combinational SQL count query in Postgres

Question

I have some stations stored in a simple stations collection:

+----+-----------+
| id | name      |
+----+-----------+
| 1  | Station A |
+----+-----------+
| 2  | Station B |
+----+-----------+
| 3  | Station C |
+----+-----------+
| 4  | Station D |
+----+-----------+

And I have some rides stored in the rides collection:

+----+---------------+-------------+
| id | fromStationId | toStationId |
+----+---------------+-------------+
| 1  | 3             | 4           |
+----+---------------+-------------+
| 2  | 2             | 1           |
+----+---------------+-------------+
| 3  | 1             | 1           |
+----+---------------+-------------+
| 4  | 3             | 2           |
+----+---------------+-------------+

I would like to create a count list of all inter-station rides between all possible pairs of fromStation names and toStation names with the result looking like this:

[
  {
    "fromStation": "Station A",
    "toStation": "Station A",
    "count": 1196
  },
  {
    "fromStation": "Station A",
    "toStation": "Station B",
    "count": 1
  },
  {
    "fromStation": "Station A",
    "toStation": "Station C",
    "count": 173
  },
]

And so on for all other combinations...

How do I get all possible two-pair combinations of station names and then count the number of rides among them? I'm using the latest version of Postgres.

I assumed that you have a simple station list (a table with one column) and a second table connections which combines the stations... If your layout differs, please show us some sample data... — S-Man
– S-Man, Commented Mar 13, 2021 at 16:47
@S-Man Great answer! However, I don't have a connections table. I only have a simple station list (one column) and nothing else. I have also modified my question to add more clarity :) — Daksh
– Daksh, Commented Mar 14, 2021 at 3:48

Erwin Brandstetter · Accepted Answer · 2021-03-14 06:56:54Z

1

Aggregate rides first, then resolve IDs to names:

SELECT f.name AS from_station, t.name AS to_station, count
FROM  (
   SELECT from_station_id, to_station_id, count(*) AS count
   FROM   rides
   GROUP  BY 1, 2
   ) r
JOIN   stations f ON f.id = r.from_station_id
JOIN   stations t ON t.id = r.to_station_id
ORDER  BY 1, 2;  -- optional order

Of course, that only produces combinations with actual rides. If you need to include combinations without any rides, you need an OUTER join to a Cartesian product of the stations table with itself. Something like:

-- include all combinations (even without rides)
SELECT from_station, to_station, COALESCE(count, 0) AS count
FROM  (
   SELECT from_station_id, to_station_id, count(*) AS count
   FROM   rides
   GROUP  BY 1, 2
   ) r
RIGHT  JOIN (
   SELECT f.id AS from_id, f.name AS from_station
        , t.id AS to_id  , t.name AS to_station
   FROM   stations f CROSS JOIN stations t
   ) s ON  s.from_id = r.from_station_id
      AND  s.to_id   = r.to_station_id
ORDER  BY 1, 2;  -- optional order

Again, it's cheaper to aggregate rides before joining to stations.

To wrap it up as JSON array or records, simply:

SELECT json_agg(sub)
FROM  (
   -- query from above
   ) sub;

db<>fiddle here

edited Mar 14, 2021 at 6:56

answered Mar 14, 2021 at 4:20

Erwin Brandstetter

668k159 gold badges1.2k silver badges1.3k bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

Daksh Over a year ago

Thanks so much for your answer! Apologies for changing the schema of the tables yet again (I took another look at my database and realized my mistake). Could you take a look at the updated question?

Erwin Brandstetter Over a year ago

@DakshShah: I adapted accordingly. For your next question, please remember to provide proper information from the start. It's obviously less than ideal to change the nature of the question after answers have been given.

Daksh Over a year ago

Thanks a lot and sorry for the trouble! I'll ensure I do that.

S-Man · Accepted Answer · 2021-03-13 16:41:44Z

demos:db<>fiddle

SELECT 
    c.from_station,
    c.to_station,
    COUNT(*)
FROM stations s1
JOIN stations s2 ON s1.station <> s2.station                                     -- 1
JOIN connections c ON s1.station = c.from_station AND s2.station = c.to_station  -- 2
GROUP BY c.from_station, c.to_station                                            -- 3

Create a self-join on the station table. The <> join condition ensures that same station will not be joined. All other station will be joined on each other, which creates all combinations
Join this result on your connections table using both station columns from the self-join for the from and to points of the connections
Now you can group by from and to and COUNT(*) aggregate this.

If you want to recognize cases where from station equals to station, you can change

JOIN stations s2 ON s1.station <> s2.station

into a simple:

CROSS JOIN stations s2

If you want to get a JSON object as you shown in the question:

SELECT 
    json_agg(connection)                      -- 2
FROM (
    SELECT 
        json_build_object(                    -- 1
            'fromStation', c.from_station,
            'toStation', c.to_station,
            'count', COUNT(*)
        ) as connection
    FROM stations s1
    JOIN stations s2 ON s1.station <> s2.station
    JOIN connections c ON s1.station = c.from_station AND s2.station = c.to_station
    GROUP BY c.from_station, c.to_station
) s

Create your JSON object from your columns which you created above
Aggregate them into a JSON array.

Collectives™ on Stack Overflow

Combinational SQL count query in Postgres

2 Answers 2

3 Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

3 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related