0

I have a relation in a PostgreSQL database called 'processed_data' having the following schema:

Date -> date type, shop_id -> integer type, item_category_id -> integer type, sum_item_cnt_day -> real type.

Displaying the first 5 rows of the relation is as follows:

date  |  shop_id  |  item_category_id  |  sum_item_cnt_day

------+-----------+--------------------+------------------

2014-12-29  |  49  |  3  |  4

2014-12-29  |  49  |  6  |  1

2014-12-29  |  49  |  7  |  1

2014-12-29  |  49  |  12  |  3

2014-12-29  |  49  |  16  |  1

Now, the 'shop_id' has 60 unique shops ranging from 0-59 where each shop sells some items grouped to a new column 'item_category_id' where 'sum_item_cnt_day' denotes the number of items sold by a shop and it's item_category_id.

I am now trying to further aggregate the data by just trying to get the following columns as final result-

date, shop_id, sum_item_cnt_day

So that, data is aggregated according to number of all items sold in 'item_category_id' per shop (denoted by 'shop_id') and calculating sum of 'sum_item_cnt_day'.

When I try to execute the following SQL command-

select date, shop_id, sum(sum_item_cnt_day) from processed_data group by shop_id;

It gives the error-

ERROR: column "processed_data.date" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: select date, shop_id, sum(sum_item_cnt_day) from processed_d... ^

Even the following SQL command-

select date, shop_id, sum(sum_item_cnt_day) from processed_data where date between '2013-01-01' and '2013-01-31' group by shop_id;

Gives the error-

ERROR: column "processed_data.date" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: select date, shop_id, sum(sum_item_cnt_day) from processed_d... ^

Any suggestions as to what's going wrong and what am I missing?

Thanks!

1 Answer 1

1

The simplest fix, which is what I think you want, would be to just add date to the GROUP BY clause:

SELECT date, shop_id, SUM(sum_item_cnt_day)
FROM processed_data
GROUP BY date, shop_id;

If you really don't want sums taken for each shop on each day, but rather for each shop over all days, then you will have to think of which of the many dates you want to display.

Sign up to request clarification or add additional context in comments.

3 Comments

Is there a way to group data for the relation such that 'sum_item_cnt_day' is summed in a date wise manner for each unique/distinct combination of 'shop_id' and 'item_category_id' ?
@Arun Maybe, I'd have to see data to know what you have in mind. But, this is a departure from your current question, so I would suggest that you open a new one.
I am trying to fit different machine learning and deep learning models pertaining to time series analysis and forecasting so I know how to shape data for the models. I will open a new question.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.