
I want to add rows for missing dates within a specific date range while keeping all columns. I found many posts using asfreq(), resample(), and reindex(), but they seemed to be for Series, and I couldn't get them to work for my DataFrame.

Given a sample dataframe:

import pandas as pd

data = [{'id' : '123', 'product' : 'apple', 'color' : 'red', 'qty' : 10, 'week' : '2019-3-7'},
        {'id' : '123', 'product' : 'apple', 'color' : 'blue', 'qty' : 20, 'week' : '2019-3-21'},
        {'id' : '123', 'product' : 'orange', 'color' : 'orange', 'qty' : 8, 'week' : '2019-3-21'}]

df = pd.DataFrame(data)


    color   id product  qty       week
0     red  123   apple   10   2019-3-7
1    blue  123   apple   20  2019-3-21
2  orange  123  orange    8  2019-3-21

My goal is to return the DataFrame below: each missing week gets a row with qty set to 0, while the other columns keep their values. Of course, I have many other ids. I would like to be able to specify the start/end dates to fill; this example uses 3/7 to 3/21.

    color   id product  qty       week
0     red  123   apple   10   2019-3-7
1    blue  123   apple   20  2019-3-21
2  orange  123  orange    8  2019-3-21
3     red  123   apple    0  2019-3-14
4     red  123   apple    0  2019-3-21 
5    blue  123   apple    0   2019-3-7
6    blue  123   apple    0  2019-3-14
7  orange  123  orange    0   2019-3-7
8  orange  123  orange    0  2019-3-14

How can I keep the remainder of my DataFrame intact?

3 Answers

In your case, you just need unstack and stack plus reindex:

df.week=pd.to_datetime(df.week)
s=pd.date_range(df.week.min(),df.week.max(),freq='7 D')

df=df.set_index(['color','id','product','week']).\
      qty.unstack().reindex(columns=s,fill_value=0).stack().reset_index()
df

    color   id product    level_3     0
0    blue  123   apple 2019-03-14   0.0
1    blue  123   apple 2019-03-21  20.0
2  orange  123  orange 2019-03-14   0.0
3  orange  123  orange 2019-03-21   8.0
4     red  123   apple 2019-03-07  10.0
5     red  123   apple 2019-03-14   0.0
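
The output above has six rows rather than the nine the question asks for: in the pandas version that produced it, unstack() leaves NaN where a color/product pair has no row for an existing week, and stack() then drops those cells. Below is a minimal sketch of one way to keep them, assuming 0 is the right fill for every gap. The names `weeks` and `filled` are illustrative and not from the answer.

```python
import pandas as pd

data = [
    {'id': '123', 'product': 'apple', 'color': 'red', 'qty': 10, 'week': '2019-3-7'},
    {'id': '123', 'product': 'apple', 'color': 'blue', 'qty': 20, 'week': '2019-3-21'},
    {'id': '123', 'product': 'orange', 'color': 'orange', 'qty': 8, 'week': '2019-3-21'},
]
df = pd.DataFrame(data)
df['week'] = pd.to_datetime(df['week'])

# Weekly range to fill; replace min()/max() with explicit dates if needed.
weeks = pd.date_range(df['week'].min(), df['week'].max(), freq='7D')

filled = (
    df.set_index(['color', 'id', 'product', 'week'])['qty']
      .unstack(fill_value=0)                 # zero-fill gaps in weeks already present
      .reindex(columns=weeks, fill_value=0)  # add weeks absent from the data
      .rename_axis(columns='week')           # restore the name lost to the unnamed range
      .stack()
      .reset_index(name='qty')               # name the value column instead of 0
)
print(filled)
```

Passing fill_value to unstack() is the only behavioral change; rename_axis() and reset_index(name=...) just replace the level_3 and 0 column labels with week and qty.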

One option is to use the complete function from pyjanitor to expose the implicitly missing rows; afterwards you can fill with fillna:

# pip install pyjanitor
import pandas as pd
import janitor

df.week = pd.to_datetime(df.week)

# create new dates, which will be used to expand the dataframe
new_dates = {"week": pd.date_range(df.week.min(), df.week.max(), freq="7D")}

# use the complete function
# note how color, id and product are wrapped together in a tuple
# this ensures that only combinations already present in the dataframe are expanded
# if you want all combinations, drop the tuple
(df
.complete(("color", "id", "product"), new_dates, sort = False)
.fillna({'qty': 0}, downcast='infer')
)

    id product   color  qty       week
0  123   apple     red   10 2019-03-07
1  123   apple    blue   20 2019-03-21
2  123  orange  orange    8 2019-03-21
3  123   apple     red    0 2019-03-14
4  123   apple     red    0 2019-03-21
5  123   apple    blue    0 2019-03-07
6  123   apple    blue    0 2019-03-14
7  123  orange  orange    0 2019-03-07
8  123  orange  orange    0 2019-03-14
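
If installing pyjanitor is not an option, roughly the same result can be sketched in plain pandas: pair each observed (color, id, product) combination with every week, then left-merge the original quantities. This is an approximation under stated assumptions, not pyjanitor's implementation. The start and end dates are hard-coded placeholders, and `keys`, `grid`, and `completed` are illustrative names.

```python
import pandas as pd

data = [
    {'id': '123', 'product': 'apple', 'color': 'red', 'qty': 10, 'week': '2019-3-7'},
    {'id': '123', 'product': 'apple', 'color': 'blue', 'qty': 20, 'week': '2019-3-21'},
    {'id': '123', 'product': 'orange', 'color': 'orange', 'qty': 8, 'week': '2019-3-21'},
]
df = pd.DataFrame(data)
df['week'] = pd.to_datetime(df['week'])

keys = ['color', 'id', 'product']

# Placeholder start/end dates; substitute the range you want to fill.
weeks = pd.DataFrame({'week': pd.date_range('2019-03-07', '2019-03-21', freq='7D')})

# Every observed key combination paired with every week (needs pandas >= 1.2).
grid = df[keys].drop_duplicates().merge(weeks, how='cross')

completed = (
    grid.merge(df, on=keys + ['week'], how='left')  # missing weeks get NaN qty
        .fillna({'qty': 0})
        .astype({'qty': 'int64'})
)
print(completed)
```

Taking the combinations from drop_duplicates() mirrors the tuple in complete(): only combinations that already occur in the data are expanded.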

duckdb:

pd.date_range(start='2019-3-7', end="2019-3-21", freq='7d').to_frame("week2").sql.select('"0" week2').join(df1.sql.select("*,week::datetime week1"),condition="1=1").select("color,id,product,case when week1=week2 then qty else 0 end qty,week")

┌─────────┬─────────┬─────────┬───────┬───────────┐
│  color  │   id    │ product │  qty  │   week    │
│ varchar │ varchar │ varchar │ int64 │  varchar  │
├─────────┼─────────┼─────────┼───────┼───────────┤
│ red     │ 123     │ apple   │    10 │ 2019-3-7  │
│ red     │ 123     │ apple   │     0 │ 2019-3-7  │
│ red     │ 123     │ apple   │     0 │ 2019-3-7  │
│ blue    │ 123     │ apple   │     0 │ 2019-3-21 │
│ blue    │ 123     │ apple   │     0 │ 2019-3-21 │
│ blue    │ 123     │ apple   │    20 │ 2019-3-21 │
│ orange  │ 123     │ orange  │     0 │ 2019-3-21 │
│ orange  │ 123     │ orange  │     0 │ 2019-3-21 │
│ orange  │ 123     │ orange  │     8 │ 2019-3-21 │
└─────────┴─────────┴─────────┴───────┴───────────┘
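
Note that the table above repeats each row's original week rather than the generated date, so the filled weeks are not visible. The same cross-join idea can be sketched in plain pandas while keeping the generated date. This sketch assumes, as in the sample, that each (color, id, product) combination appears in exactly one row; with several rows per combination it would produce duplicates. `week2`, `crossed`, and `result` are illustrative names.

```python
import pandas as pd

data = [
    {'id': '123', 'product': 'apple', 'color': 'red', 'qty': 10, 'week': '2019-3-7'},
    {'id': '123', 'product': 'apple', 'color': 'blue', 'qty': 20, 'week': '2019-3-21'},
    {'id': '123', 'product': 'orange', 'color': 'orange', 'qty': 8, 'week': '2019-3-21'},
]
df = pd.DataFrame(data)
df['week'] = pd.to_datetime(df['week'])

weeks = pd.DataFrame({'week2': pd.date_range('2019-03-07', '2019-03-21', freq='7D')})

# Cross join: every original row paired with every generated date.
crossed = df.merge(weeks, how='cross')

# Equivalent of CASE WHEN week1 = week2 THEN qty ELSE 0 END.
crossed['qty'] = crossed['qty'].where(crossed['week'] == crossed['week2'], 0)

# Keep the generated date as the week column.
result = crossed.drop(columns='week').rename(columns={'week2': 'week'})
print(result)
```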
