Mixing DISTINCT with GROUP BY in Postgres

Question

I am trying to get a list of:

all months in a specified year that
have at least 2 rows with distinct dates,
ignoring rows with specific column values

Where I got to is:

SELECT DATE_PART('month', "orderDate") AS month, count(*) 
FROM public."Orders"
WHERE "companyId" = 00001 AND "orderNumber" != 1 AND DATE_PART('year', ("orderDate")) = '2020' AND "orderNumber" IS NOT NULL
GROUP BY month
HAVING COUNT ("orderDate") > 2

The HAVING COUNT clause sort of works in place of DISTINCT, insofar as I can be reasonably sure it filters the results down to the data required. However, being able to use DISTINCT on the calendar date within each month would return a more reliable result. Is this possible with Postgres?

A sample of data from the table:

Sample Input

"2018-12-17 20:32:00+00"
"2019-02-26 14:38:00+00"
"2020-07-26 10:19:00+00"
"2020-10-13 19:15:00+00"
"2020-10-26 16:42:00+00"
"2020-10-26 19:41:00+00"
"2020-11-19 20:21:00+00"
"2020-11-19 21:22:00+00"
"2020-11-23 21:10:00+00"
"2021-01-02 12:51:00+00"

Without the HAVING clause, this produces:

month	count
7	1
10	2
11	3

Month 7 can be discarded easily, as it has only 1 record. Month 10 is the issue: we have two records, but from the data above, those records are from the same day. Similarly, month 11 has only 2 distinct records by day.

The output should therefore ideally be:

month	count
11	2

We have only two distinct dates from the 2020 data, and they are from month 11 (November).

Sample line of data won't help much. We need to see a handful of input records, along with the output, and an explanation of the output.
– Tim Biegeleisen, Commented Sep 10, 2021 at 4:04
@TimBiegeleisen Ah, apologies, I did wonder how the schema would help :/ That makes much more sense. Edited.
– aroundtheworld, Commented Sep 10, 2021 at 5:12
Thanks for the edits. I was able to provide an answer below. Lesson learned: for SQL questions on this site, showing sample data may be critical to getting an answer.
– Tim Biegeleisen, Commented Sep 10, 2021 at 5:17

Tim Biegeleisen · Accepted Answer · 2021-09-10 05:16:19Z

1

I think you just want to take the distinct count of dates for each month:

SELECT
    DATE_PART('month', "orderDate") AS month,
    COUNT(DISTINCT "orderDate"::date) AS count   -- count calendar days, not rows
FROM public."Orders"                             -- mixed-case names must be double-quoted
WHERE
    "companyId" = 1 AND
    "orderNumber" != 1 AND
    DATE_PART('year', "orderDate") = '2020'
GROUP BY
    DATE_PART('month', "orderDate")
HAVING
    COUNT(DISTINCT "orderDate"::date) >= 2;      -- "at least 2" distinct days
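
To check the distinct-day count without touching the real table, here is a minimal self-contained sketch. It rests on a few assumptions: `sample_orders` is a hypothetical inline CTE, its six rows are the 2020 timestamps that the question's per-month counts (7 → 1, 10 → 2, 11 → 3) suggest survive the filters, the column is taken to be `timestamptz`, and the threshold is written as `>= 2` to match "at least 2" in the question. Because casting a `timestamptz` to `date` uses the session's time zone, the sketch pins the conversion to UTC with `AT TIME ZONE 'UTC'`; substitute whichever zone the business dates should follow.

```sql
-- Hypothetical inline sample standing in for the filtered rows of "Orders".
WITH sample_orders("orderDate") AS (
    VALUES
        ('2020-07-26 10:19:00+00'::timestamptz),
        ('2020-10-26 16:42:00+00'::timestamptz),
        ('2020-10-26 19:41:00+00'::timestamptz),
        ('2020-11-19 20:21:00+00'::timestamptz),
        ('2020-11-19 21:22:00+00'::timestamptz),
        ('2020-11-23 21:10:00+00'::timestamptz)
)
SELECT
    DATE_PART('month', "orderDate" AT TIME ZONE 'UTC') AS month,
    COUNT(*) AS row_count,                                          -- every row in the month
    COUNT(DISTINCT ("orderDate" AT TIME ZONE 'UTC')::date) AS day_count  -- distinct calendar days
FROM sample_orders
GROUP BY 1
-- Keep only months with at least two distinct days.
HAVING COUNT(DISTINCT ("orderDate" AT TIME ZONE 'UTC')::date) >= 2
ORDER BY 1;
```

On this sample the query should return a single row for month 11, with a row count of 3 and a day count of 2: October's two rows fall on the same day, and July has only one row, so both are filtered out.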

1 Comment

aroundtheworld Over a year ago

Great, thank you, and thanks for the edit prompts. They will definitely help in framing any future SQL questions.
