I have a simple table that has lat, long, and time. Basically, I want the result of my query to give me something like this:

lat,long,hourwindow,count

I can't seem to figure out how to do this. I've tried so many things I can't keep them straight. Here's what I've got so far:

WITH all_lat_long_by_time AS (
    SELECT
      trunc(cast(lat AS NUMERIC), 4) AS lat,
      trunc(cast(long AS NUMERIC), 4) AS long,
      date_trunc('hour', time :: TIMESTAMP WITHOUT TIME ZONE) AS hourWindow

    FROM my_table
),
    unique_lat_long_by_time AS (
      SELECT DISTINCT * FROM all_lat_long_by_time
  ),
  all_with_counts AS (
   -- what do I do here?
  )
SELECT * FROM all_with_counts;
  • Please explain how "count of rows by uniqueness" is defined exactly. Do you mean a count of unique rows (after truncating numbers)? So the number of distinct (lat, long) per hour? Postgres version and table definition are always helpful, too. time :: TIMESTAMP WITHOUT TIME ZONE looks suspicious. Commented Mar 20, 2019 at 22:14

2 Answers

I think this is a pretty basic aggregation query:

SELECT date_trunc('hour', time :: TIMESTAMP WITHOUT TIME ZONE) AS hourWindow,
       trunc(cast(lat AS NUMERIC), 4) AS lat,
       trunc(cast(long AS NUMERIC), 4) AS long,
       COUNT(*)
FROM my_table
GROUP BY hourWindow, trunc(cast(lat AS NUMERIC), 4), trunc(cast(long AS NUMERIC), 4)
ORDER BY hourWindow
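To check what this kind of grouping returns, here is a minimal self-contained sketch. It assumes PostgreSQL and uses a CTE with made-up rows in place of the real my_table (the sample values and the count alias are my own, not from the question), so it can be run without touching any real data:

```sql
-- Hypothetical sample rows standing in for my_table; all values are made up.
WITH my_table (lat, long, time) AS (
    VALUES
      (40.712812, -74.006012, '2019-03-20 10:05:00'),
      (40.712834, -74.006045, '2019-03-20 10:40:00'),  -- same point as row 1 after truncation
      (40.713900, -74.006012, '2019-03-20 10:50:00'),  -- different lat in the same hour
      (40.712812, -74.006012, '2019-03-20 11:15:00')   -- same point as row 1, next hour
)
SELECT trunc(cast(lat AS numeric), 4)      AS lat,
       trunc(cast(long AS numeric), 4)     AS long,
       date_trunc('hour', time::timestamp) AS hourwindow,
       count(*)                            AS count
FROM   my_table
GROUP  BY 1, 2, 3            -- group by the three output columns
ORDER  BY hourwindow, lat, long;
```

With these four rows the result should have three groups: (40.7128, -74.0060) with a count of 2 in the 10:00 window, (40.7139, -74.0060) with a count of 1 in the same window, and (40.7128, -74.0060) with a count of 1 in the 11:00 window.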
1 Comment

Ha, when you stare at a problem space for so long that you forget how to SQL. Thanks.
If "count of rows by uniqueness" is meant to count distinct coordinates per hour (after truncating the numbers), count(DISTINCT (lat,long)) does the job:

SELECT date_trunc('hour', time::timestamp) AS hour_window
     , count(DISTINCT (trunc( lat::numeric, 4)
                     , trunc(long::numeric, 4))) AS count_distinct_coordinates
FROM   tbl
GROUP  BY 1
ORDER  BY 1;

Details are in the PostgreSQL manual under aggregate expressions.
(lat,long) is a ROW value and short for ROW(lat,long); the manual covers this under row constructors.

But count(DISTINCT ...) is typically slow; a subquery should be faster for your case:

SELECT hour_window, count(*) AS count_distinct_coordinates
FROM  (
   SELECT date_trunc('hour', time::timestamp) AS hour_window
        , trunc( lat::numeric, 4) AS lat
        , trunc(long::numeric, 4) AS long
   FROM   tbl
   GROUP  BY 1, 2, 3
   ) sub
GROUP  BY 1
ORDER  BY 1;

Or:

SELECT hour_window, count(*) AS count_distinct_coordinates
FROM  (
   SELECT DISTINCT
          date_trunc('hour', time::timestamp) AS hour_window
        , trunc( lat::numeric, 4) AS lat
        , trunc(long::numeric, 4) AS long
   FROM   tbl
   ) sub
GROUP  BY 1
ORDER  BY 1;

After the subquery folds duplicates, the outer SELECT can use a plain count(*).
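If you want to convince yourself that the two approaches agree, the following sketch runs both side by side. It assumes PostgreSQL; the rows in the tbl CTE are invented sample data, and the names distinct_agg, folded, via_count_distinct and via_subquery are only illustrative:

```sql
-- Hypothetical sample rows standing in for tbl; all values are made up.
WITH tbl (lat, long, time) AS (
    VALUES
      (40.712812, -74.006012, '2019-03-20 10:05:00'),
      (40.712834, -74.006045, '2019-03-20 10:40:00'),  -- duplicate of row 1 after truncation
      (40.713900, -74.006012, '2019-03-20 10:50:00'),
      (40.712812, -74.006012, '2019-03-20 11:15:00')
),
distinct_agg AS (             -- variant 1: count(DISTINCT <row value>)
    SELECT date_trunc('hour', time::timestamp) AS hour_window
         , count(DISTINCT (trunc( lat::numeric, 4)
                         , trunc(long::numeric, 4))) AS n
    FROM   tbl
    GROUP  BY 1
),
folded AS (                   -- variant 2: fold duplicates first, then count(*)
    SELECT hour_window, count(*) AS n
    FROM  (
       SELECT DISTINCT
              date_trunc('hour', time::timestamp) AS hour_window
            , trunc( lat::numeric, 4) AS lat
            , trunc(long::numeric, 4) AS long
       FROM   tbl
       ) sub
    GROUP  BY 1
)
SELECT hour_window
     , d.n AS via_count_distinct
     , f.n AS via_subquery
FROM   distinct_agg d
JOIN   folded f USING (hour_window)
ORDER  BY hour_window;
```

For this sample both columns should show 2 for the 10:00 window and 1 for the 11:00 window. Which variant is actually faster depends on your data and Postgres version, so it is worth comparing them with EXPLAIN ANALYZE on the real table.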
