I would like to calculate the daily sales from average sales using the following function:
def derive_daily_sales(avg_sales_series, period, first_day_sales):
    """
    Derive the daily sales from the previous_avg_sales start date to the current_avg_sales end date.
    For the detailed formula, please refer to README.md.
    @avg_sales_series: an array of average sales (e.g. 2020-08-04 to 2020-08-06)
    @period: the averaging period in days (e.g. 30 days, 90 days)
    @first_day_sales: the sales on the first day of previous_avg_sales
    """
    x_n1 = avg_sales_series[-1]*period - avg_sales_series[0]*period + first_day_sales
    return x_n1
avg_sales_series is expected to be a pandas Series.
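To make the intended arithmetic concrete, here is a minimal sketch of the function applied to a single window of two averages. It assumes positional access via `.iloc`, because `series[-1]` on a Series with a default integer index is treated as a label lookup and raises a `KeyError`. The `window` variable and its values are only illustrative.

```python
import pandas as pd

def derive_daily_sales(avg_sales_series, period, first_day_sales):
    # .iloc selects by position, so this works regardless of the index labels
    last_avg = avg_sales_series.iloc[-1]
    first_avg = avg_sales_series.iloc[0]
    return last_avg * period - first_avg * period + first_day_sales

# Illustrative window: two consecutive 30-day averages for one customer
window = pd.Series([30, 40])
result = derive_daily_sales(window, period=30, first_day_sales=30)
print(result)  # 40*30 - 30*30 + 30 = 330
```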
The dataframe looks like the following:
date, customer_id, avg_30_day_sales
12/08/2020, 1, 30
13/08/2020, 1, 40
14/08/2020, 1, 40
12/08/2020, 2, 20
13/08/2020, 2, 40
14/08/2020, 2, 30
I would like to first group by customer_id and sort by date, then take a rolling window of size 2 and apply the custom function derive_daily_sales to each window, with period=30 and first_day_sales equal to the first avg_30_day_sales value.
I tried:
df_sales_grouped = df_sales.sort_values('date').groupby(['customer_id','date'])
df_daily_sales['daily_sales'] = df_sales_grouped['avg_30_day_sales'].rolling(2).apply(derive_daily_sales, axis=1, period=30, first_day_sales= df_sales['avg_30_day_sales'][0])
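For reference, one possible way to express this is sketched below. It rests on several assumptions that are not confirmed above: grouping is by customer_id only (grouping by date as well would leave one row per group, so no window of size 2 could form), first_day_sales is taken as each customer's first avg_30_day_sales rather than the first value of the whole frame, and the dates are in day/month/year format. As far as I know, `Rolling.apply` takes no `axis` argument, and extra arguments for the function are passed through `kwargs`. With `raw=True` each window arrives as a NumPy array, so positional indexing with `[-1]` and `[0]` is safe.

```python
import pandas as pd

# Sample data from the question (dates assumed to be day/month/year)
df_sales = pd.DataFrame({
    "date": ["12/08/2020", "13/08/2020", "14/08/2020"] * 2,
    "customer_id": [1, 1, 1, 2, 2, 2],
    "avg_30_day_sales": [30, 40, 40, 20, 40, 30],
})
df_sales["date"] = pd.to_datetime(df_sales["date"], format="%d/%m/%Y")

def derive_daily_sales(window, period, first_day_sales):
    # window is a NumPy array here because raw=True is used below
    return window[-1] * period - window[0] * period + first_day_sales

def rolling_daily_sales(avg_sales, period=30):
    # Assumption: first_day_sales is the first average of this customer
    first_day_sales = avg_sales.iloc[0]
    return avg_sales.rolling(2).apply(
        derive_daily_sales,
        raw=True,
        kwargs={"period": period, "first_day_sales": first_day_sales},
    )

# Sort first so that each customer's rows are in date order within the group
df_sales = df_sales.sort_values(["customer_id", "date"])
df_sales["daily_sales"] = (
    df_sales.groupby("customer_id")["avg_30_day_sales"]
    .transform(rolling_daily_sales)
)
print(df_sales)
```

The first row of each customer is NaN, because a window of size 2 has no previous value there.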