0

I have some stations stored in a simple stations collection:

+----+-----------+
| id | name      |
+----+-----------+
| 1  | Station A |
+----+-----------+
| 2  | Station B |
+----+-----------+
| 3  | Station C |
+----+-----------+
| 4  | Station D |
+----+-----------+

And I have some rides stored in the rides collection:

+----+---------------+-------------+
| id | fromStationId | toStationId |
+----+---------------+-------------+
| 1  | 3             | 4           |
+----+---------------+-------------+
| 2  | 2             | 1           |
+----+---------------+-------------+
| 3  | 1             | 1           |
+----+---------------+-------------+
| 4  | 3             | 2           |
+----+---------------+-------------+

I would like to create a count list of all inter-station rides between all possible pairs of fromStation names and toStation names with the result looking like this:

[
  {
    "fromStation": "Station A",
    "toStation": "Station A",
    "count": 1196
  },
  {
    "fromStation": "Station A",
    "toStation": "Station B",
    "count": 1
  },
  {
    "fromStation": "Station A",
    "toStation": "Station C",
    "count": 173
  },
]

And so on for all other combinations...

How do I get all possible two-pair combinations of station names and then count the number of rides among them? I'm using the latest version of Postgres.

2
  • 1
    I assumed that you have a simple station list (a table with one column) and a second table connections which combines the stations... If your layout differs, please show us some sample data... Commented Mar 13, 2021 at 16:47
  • @S-Man Great answer! However, I don't have a connections table. I only have a simple station list (one column) and nothing else. I have also modified my question to add more clarity :) Commented Mar 14, 2021 at 3:48

2 Answers 2

1

Aggregate rides first, then resolve IDs to names:

SELECT f.name AS from_station, t.name AS to_station, count
FROM  (
   SELECT from_station_id, to_station_id, count(*) AS count
   FROM   rides
   GROUP  BY 1, 2
   ) r
JOIN   stations f ON f.id = r.from_station_id
JOIN   stations t ON t.id = r.to_station_id
ORDER  BY 1, 2;  -- optional order

Of course, that only produces combinations with actual rides. If you need to include combinations without any rides, you need an OUTER join to a Cartesian product of the stations table with itself. Something like:

-- include all combinations (even without rides)
SELECT from_station, to_station, COALESCE(count, 0) AS count
FROM  (
   SELECT from_station_id, to_station_id, count(*) AS count
   FROM   rides
   GROUP  BY 1, 2
   ) r
RIGHT  JOIN (
   SELECT f.id AS from_id, f.name AS from_station
        , t.id AS to_id  , t.name AS to_station
   FROM   stations f CROSS JOIN stations t
   ) s ON  s.from_id = r.from_station_id
      AND  s.to_id   = r.to_station_id
ORDER  BY 1, 2;  -- optional order

Again, it's cheaper to aggregate rides before joining to stations.

To wrap it up as JSON array or records, simply:

SELECT json_agg(sub)
FROM  (
   -- query from above
   ) sub;

db<>fiddle here

Sign up to request clarification or add additional context in comments.

3 Comments

Thanks so much for your answer! Apologies for changing the schema of the tables yet again (I took another look at my database and realized my mistake). Could you take a look at the updated question?
@DakshShah: I adapted accordingly. For your next question, please remember to provide proper information from the start. It's obviously less than ideal to change the nature of the question after answers have been given.
Thanks a lot and sorry for the trouble! I'll ensure I do that.
1

demos:db<>fiddle

SELECT 
    c.from_station,
    c.to_station,
    COUNT(*)
FROM stations s1
JOIN stations s2 ON s1.station <> s2.station                                     -- 1
JOIN connections c ON s1.station = c.from_station AND s2.station = c.to_station  -- 2
GROUP BY c.from_station, c.to_station                                            -- 3
  1. Create a self-join on the station table. The <> join condition ensures that same station will not be joined. All other station will be joined on each other, which creates all combinations
  2. Join this result on your connections table using both station columns from the self-join for the from and to points of the connections
  3. Now you can group by from and to and COUNT(*) aggregate this.

If you want to recognize cases where from station equals to station, you can change

JOIN stations s2 ON s1.station <> s2.station

into a simple:

CROSS JOIN stations s2

If you want to get a JSON object as you shown in the question:

SELECT 
    json_agg(connection)                      -- 2
FROM (
    SELECT 
        json_build_object(                    -- 1
            'fromStation', c.from_station,
            'toStation', c.to_station,
            'count', COUNT(*)
        ) as connection
    FROM stations s1
    JOIN stations s2 ON s1.station <> s2.station
    JOIN connections c ON s1.station = c.from_station AND s2.station = c.to_station
    GROUP BY c.from_station, c.to_station
) s
  1. Create your JSON object from your columns which you created above
  2. Aggregate them into a JSON array.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.