I want to add missing dates for a specific date range, but keep all columns. I found many posts using afreq(), resample(), reindex(), but they seemed to be for Series and I couldn't get them to work for my DataFrame.
Given a sample dataframe:
data = [{'id' : '123', 'product' : 'apple', 'color' : 'red', 'qty' : 10, 'week' : '2019-3-7'}, {'id' : '123', 'product' : 'apple', 'color' : 'blue', 'qty' : 20, 'week' : '2019-3-21'}, {'id' : '123', 'product' : 'orange', 'color' : 'orange', 'qty' : 8, 'week' : '2019-3-21'}]
df = pd.DataFrame(data)
color id product qty week
0 red 123 apple 10 2019-3-7
1 blue 123 apple 20 2019-3-21
2 orange 123 orange 8 2019-3-21
My goal is to return below; filling in qty as 0, but fill other columns. Of course, I have many other ids. I would like to be able to specify the start/end dates to fill; this example uses 3/7 to 3/21.
color id product qty week
0 red 123 apple 10 2019-3-7
1 blue 123 apple 20 2019-3-21
2 orange 123 orange 8 2019-3-21
3 red 123 apple 0 2019-3-14
4 red 123 apple 0 2019-3-21
5 blue 123 apple 0 2019-3-7
6 blue 123 apple 0 2019-3-14
7 orange 123 orange 0 2019-3-7
8 orange 123 orange 0 2019-3-14
How can I keep the remainder of my DataFrame intact?