3

I am struggling with the following SQL query:

There is a table data_tracks with coordinates describing a trip. Each trip is uniquely identified by a trip_log_id. After reaching the trip's destination, the user needs to participate to a survey. The answers of the survey are stored in a table crowd_sourcing_answers. Each answer belongs to a question, located in table crowd_sourcing_questions.

I have written two SQL queries - one to get all points of a trip as JSON, another one to get all question-answers pairs:

Query for fetching all question-answer pairs of a trip:

SELECT json_agg(answer_single_trip)
FROM (SELECT json_agg(
               json_build_object(
                 'tripId', trip_log_id,
                 'question', qt.question,
                 'answeringOption', qt."answeringOptions",
                 'answer', at.answer
                   )
                 ) as crowdsourcing
      FROM crowd_sourcing_questions as qt
             INNER JOIN crowd_sourcing_answers as at ON at.crowd_sourcing_question_id = qt.id
      GROUP BY trip_log_id) answer_single_trip;

and its output:

[
  {
    "crowdsourcing": [
      {
        "tripId": 92,
        "question": "Gab es auf der Strecke teilweise schlecht befahrbare Streckenabschnitte?",
        "answeringOption": [
          "Ja",
          "Nein"
        ],
        "answer": "2"
      }
    ]
  },
  {
    "crowdsourcing": [
      {
        "tripId": 91,
        "question": "Gab es auf der Strecke teilweise schlecht befahrbare Streckenabschnitte?",
        "answeringOption": [
          "Ja",
          "Nein"
        ],
        "answer": "1"
      }
    ]
  },
  {
    "crowdsourcing": [
      {
        "tripId": 90,
        "question": "Gab es auf der Strecke teilweise schlecht befahrbare Streckenabschnitte?",
        "answeringOption": [
          "Ja",
          "Nein"
        ],
        "answer": "0"
      }
    ]
  }
]     

Query for fetching all points belonging to a trip:

SELECT json_agg(
         json_build_object(
           'tripId', trip_log_id,
           'trackId', id,
           'recorded_at', created_at,
           'latitude', latitude,
           'longitude', longitude
             )
           ) as trips
FROM data_tracks
GROUP by trip_log_id; 

and its output:

[
  [
    {
      "trip_log_id": 91,
      "recorded_at": "2018-10-05T14:11:44.847",
      "latitude": 52.5242370846803,
      "longitude": 13.3443558528637
    },
    {
      "trip_log_id": 91,
      "recorded_at": "2018-10-05T14:11:44.911",
      "latitude": 52.5242366166393,
      "longitude": 13.3443558656828
    }
  ],
  [
    {
      "trip_log_id": 90,
      "recorded_at": "2018-10-05T13:28:24.452",
      "latitude": 52.5242370846803,
      "longitude": 13.3443558528637
    },
    {
      "trip_log_id": 90,
      "recorded_at": "2018-10-05T13:28:24.489",
      "latitude": 52.5242366166393,
      "longitude": 13.3443558656828
    }
  ]
]

Goal

Now I need to merge these two results such that there is one JSON object for each trip ID, containing the question-answer pairs (key: "crowdsourcing"; array) and the points of the trip (key: "trip"; array). Following an example:

[
  {  // DATA FOR TRIP 1
    "crowdsourcing": [
      {
        "question": "Bitte bewerten Sie die Sicherheit der Radroute!",
        "answeringOption": [
          "Sehr sicher",
          "Eher sicher",
          "Neutral",
          "Eher unsicher",
          "Sehr unsicher"
        ],
        "answer": "2"
      },
      {
        "question": "Würden Sie die gefahrene Route anderen Radfahrenden weiterempfehlen?",
        "answeringOption": [
          "Ja",
          "Nein"
        ],
        "answer": "1"
      }
    ],
    "trip": [
      {
        "recorded_at": "2018-10-11T15:16:33",
        "latitude": 52.506785999999998,
        "longitude": 13.398065000000001
      },
      {
        "recorded_at": "2018-10-11T15:16:32.969",
        "latitude": 52.50647,
        "longitude": 13.397856000000001
      },
      {
        "recorded_at": "2018-10-11T15:16:32.936",
        "latitude": 52.506166,
        "longitude": 13.397593000000001
      }
    ]
  },
  { // DATA FOR TRIP 2
    "crowdsourcing": [
      {
        "question": "Bitte bewerten Sie die Sicherheit der Radroute!",
        "answeringOption": [
          "Sehr sicher",
          "Eher sicher",
          "Neutral",
          "Eher unsicher",
          "Sehr unsicher"
        ],
        "answer": "2"
      }
    ],
    "trip": [
      {
        "recorded_at": "2018-10-11T15:33:33.971999",
        "latitude": 52.506785999999998,
        "longitude": 13.398065000000001
      },
      {
        "recorded_at": "2018-10-11T15:33:33.929",
        "latitude": 52.50647,
        "longitude": 13.397856000000001
      }
    ]
  }
]

Approach

I created a query, please see DB Fiddle. However, it returns duplicate records within the two arrays (question-answers pairs, points of trip). I was thinking it has to do something with the JOIN but all my trials failed.

4
  • In your final result you're losing the trip_id. Is this an error? Commented Nov 2, 2018 at 15:40
  • Sorry for the confusion, the output listed above (without any IDs) is the one how it should look like at the end. I just added it for testing purposes, to see whether the IDs of the duplicated records are the same. Commented Nov 2, 2018 at 15:52
  • OK, that's what I meant. In the end you don't need the trip_id anymore? That's right? Commented Nov 2, 2018 at 15:52
  • Yes, that's right! Thanks a lot for your support. Commented Nov 2, 2018 at 16:29

1 Answer 1

2

In your subqueries you have included the trip_log_id into the json part. But if you put them as separate column you'll have the chance to join both parts against it:

demo: db<>fiddle

SELECT
    json_agg(
        json_build_object('crowdsourcing', cs.json_agg, 'trip', t.json_agg)
    )
FROM
(
    SELECT 
        trip_log_id,                          -- 1
        json_agg(
            json_build_object('question', question, 'answeringOption', "answeringOptions", 'answer', answer)
        )
    FROM 
        crowd_sourcing_answers csa
    JOIN crowd_sourcing_questions csq ON csa.crowd_sourcing_question_id = csq.id
    GROUP BY trip_log_id
) cs

JOIN                                         -- 2

(
    SELECT
        trip_log_id,                         -- 1
        json_agg(
           json_build_object('recorded_at', created_at, 'latitude', latitude, 'longitude', longitude)
        )
    FROM data_tracks
    GROUP by trip_log_id 
) t

USING (trip_log_id)                          -- 2      
  1. Get out the trip_log_id
  2. using it for joining

Addionally: Please notice that in postgres all column names should be without any capital letters. I would recommend to rename addionalOptions into something like additional_options. With that no additional " chars are needed.

Sign up to request clarification or add additional context in comments.

2 Comments

If you have a moment, could you advise on how to combine two separate jsonb_aggs into a single json array? One wouldn't necessarily have any common fields with the other. Just want to understand if there's a clean way to concatenate two unrelated json's.
@JayJ Please open a new question with a minimal example. If no other I will have a look at it. :)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.