I aggregated my data with a group-by on month. Unfortunately, when a month has no data, no record is created for that month.

Please help me create the missing records for each company, where the date range starts at April 2019 and ends at February 2020.

Company     year    month   Quantity
A           2019    4          1
A           2019    5          12
A           2019    6          13
A           2019    11         23
A           2020    2          34
B           2019    8          32
B           2019    12         2
B           2020    2          32

The result should be exactly in the format below. Any input would be a great help.

Company     year    month   Quantity
A           2019    1          0
A           2019    2          0
A           2019    3          0
A           2019    4          0
A           2019    5          12
A           2019    6          13
A           2019    7          0
A           2019    8          0
A           2019    9          0
A           2019    10         0
A           2019    11         23
A           2019    12         0
A           2020    1          0
A           2020    2          34
B           2019    1          0
B           2019    2          0
B           2019    3          0
B           2019    4          0
B           2019    5          0
B           2019    6          0
B           2019    7          0
B           2019    8          32
B           2019    9          0
B           2019    10         0
B           2019    11         0
B           2019    12         2
B           2020    1          0
B           2020    2          32

2 Answers

If you want every month of every year, build the full index with MultiIndex.from_product and pass it to DataFrame.reindex:

mux = pd.MultiIndex.from_product([df['Company'].unique(),
                                  df['year'].unique(),
                                  range(1, 13)], names=['Company','year','month'])

# Keep the original in `df`; the filled frame is `df1`, which the next step uses
df1 = df.set_index(['Company','year','month']).reindex(mux, fill_value=0).reset_index()

If necessary, trim the result to the range between the earliest and latest dates in the original data with Series.between:

orig = pd.to_datetime(df[['year','month']].assign(day=1))
new = pd.to_datetime(df1[['year','month']].assign(day=1))

df1 = df1[new.between(orig.min(), orig.max())]
print (df1)
   Company  year  month  Quantity
3        A  2019      4         1
4        A  2019      5        12
5        A  2019      6        13
6        A  2019      7         0
7        A  2019      8         0
8        A  2019      9         0
9        A  2019     10         0
10       A  2019     11        23
11       A  2019     12         0
12       A  2020      1         0
13       A  2020      2        34
27       B  2019      4         0
28       B  2019      5         0
29       B  2019      6         0
30       B  2019      7         0
31       B  2019      8        32
32       B  2019      9         0
33       B  2019     10         0
34       B  2019     11         0
35       B  2019     12         2
36       B  2020      1         0
37       B  2020      2        32
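
For reference, the two steps above can be put together as a self-contained sketch. The sample frame is rebuilt from the question's table, and the names `full` and `trimmed` are mine, not part of the original answer. It assumes the wanted range is the earliest to latest month found anywhere in the data.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Company": ["A", "A", "A", "A", "A", "B", "B", "B"],
    "year": [2019, 2019, 2019, 2019, 2020, 2019, 2019, 2020],
    "month": [4, 5, 6, 11, 2, 8, 12, 2],
    "Quantity": [1, 12, 13, 23, 34, 32, 2, 32],
})

# Every (Company, year, month) combination: 2 companies x 2 years x 12 months.
mux = pd.MultiIndex.from_product(
    [df["Company"].unique(), df["year"].unique(), range(1, 13)],
    names=["Company", "year", "month"],
)

# Combinations missing from the data become rows with Quantity 0.
full = (
    df.set_index(["Company", "year", "month"])
      .reindex(mux, fill_value=0)
      .reset_index()
)

# Keep only the months between the earliest and latest month in the original data.
orig = pd.to_datetime(df[["year", "month"]].assign(day=1))
new = pd.to_datetime(full[["year", "month"]].assign(day=1))
trimmed = full[new.between(orig.min(), orig.max())]

print(trimmed)
```

Note that `between` is inclusive on both ends by default, so April 2019 and February 2020 are both kept.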

You can create a date range containing all the months, then reindex the dataframe:

# `all_date` contains every month start from min to max
date = pd.to_datetime(df[['year','month']].assign(day=1))
all_date = pd.date_range(date.min(), date.max(), freq='MS')

# the new_index has every combination of (Company, month start)
new_index = pd.MultiIndex.from_product([
    df['Company'].unique(),
    all_date
], names=['Company', 'date'])

# Reindex, rebuild year/month from the date, and fill the missing quantities with 0
result = df.set_index(['Company', date]).reindex(new_index).reset_index()
result['year'] = result['date'].dt.year
result['month'] = result['date'].dt.month
result['Quantity'] = result['Quantity'].fillna(0)
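
A runnable variant of the same idea, as a sketch: the sample frame is rebuilt from the question's table, and I attach the dates as a named `date` column before indexing and pass `fill_value=0` to `reindex` so that `Quantity` stays an integer. Both of those are my adjustments rather than part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Company": ["A", "A", "A", "A", "A", "B", "B", "B"],
    "year": [2019, 2019, 2019, 2019, 2020, 2019, 2019, 2020],
    "month": [4, 5, 6, 11, 2, 8, 12, 2],
    "Quantity": [1, 12, 13, 23, 34, 32, 2, 32],
})

# One Timestamp per row (first day of the month), then every month start in range.
date = pd.to_datetime(df[["year", "month"]].assign(day=1))
all_date = pd.date_range(date.min(), date.max(), freq="MS")

# Every (Company, month start) combination.
new_index = pd.MultiIndex.from_product(
    [df["Company"].unique(), all_date], names=["Company", "date"]
)

result = (
    df.drop(columns=["year", "month"])
      .assign(date=date)
      .set_index(["Company", "date"])
      .reindex(new_index, fill_value=0)   # missing months get Quantity 0
      .reset_index()
)

# Rebuild the year/month columns from the date.
result["year"] = result["date"].dt.year
result["month"] = result["date"].dt.month

print(result[["Company", "year", "month", "Quantity"]])
```

Because the index is built from a date range, no out-of-range months are created and no trimming step is needed afterwards.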

Tip: do not split a date into year and month. A date is often easier to analyze if you keep it as a Timestamp instead of splitting out its components.
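
To illustrate the tip, here is a small sketch with a hypothetical `sales` frame (three rows taken from the question, stored with a single `date` column). Range filters become one comparison, and year and month can still be derived whenever a report needs them.

```python
import pandas as pd

# Hypothetical frame: one Timestamp column instead of separate year/month columns.
sales = pd.DataFrame({
    "Company": ["A", "A", "B"],
    "date": pd.to_datetime(["2019-04-01", "2019-11-01", "2020-02-01"]),
    "Quantity": [1, 23, 32],
})

# A date-range filter is a single comparison on the Timestamp column.
in_2019 = sales[sales["date"].between(pd.Timestamp("2019-01-01"),
                                      pd.Timestamp("2019-12-31"))]

# Components are derived on demand rather than stored.
report = sales.assign(year=sales["date"].dt.year, month=sales["date"].dt.month)

print(in_2019)
print(report)
```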
