I have recently started using dataframes in Python and I don't know how to do the following exercise.

I have two dataframes with the same columns (Type and Count), like this:

main_df:

Index Type Count
0 Album 12
1 Book 4
2 Person 3

df2:

Index Type Count
0 Album 9
1 Person 4
2 Film 4

The same Type value can have a different Index value in each dataframe, as you can see with Type = Person (Index = 2 in main_df and Index = 1 in df2).

I want to have all the data in main_df. In this case the result would be:

main_df:

Index Type Count
0 Album 21
1 Book 4
2 Person 7
3 Film 4

If a Type value in df2 is already in main_df, simply add the corresponding Count value from df2 to it. If a Type value in df2 is not in main_df, append that row (Type and Count values) to the end of main_df.

Hope you can help me with this. Thanks in advance.

2 Answers

You can use pd.concat to stack the two dataframes and then groupby to sum the counts per Type (sort=False keeps the order of first appearance):

main_df = pd.concat([main_df,df2]).groupby('Type', sort=False).agg({'Count': 'sum'}).reset_index()

OUTPUT:

     Type  Count
0   Album     21
1    Book      4
2  Person      7
3    Film      4
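
For reference, here is a minimal self-contained sketch of the same concat-then-groupby idea. The two dataframes are rebuilt from the sample data in the question, assuming Index is the default row index rather than a real column; selecting ["Count"] before sum() is just one way to write the aggregation.

```python
import pandas as pd

# Sample data from the question (Index assumed to be the default row index).
main_df = pd.DataFrame({"Type": ["Album", "Book", "Person"], "Count": [12, 4, 3]})
df2 = pd.DataFrame({"Type": ["Album", "Person", "Film"], "Count": [9, 4, 4]})

# Stack both frames, then sum Count per Type.
# sort=False keeps Types in order of first appearance, so new Types end up last.
main_df = (
    pd.concat([main_df, df2], ignore_index=True)
    .groupby("Type", sort=False, as_index=False)["Count"]
    .sum()
)

print(main_df)
```

Because the result is assigned back to main_df, the row index is rebuilt as 0..n-1, which matches the layout asked for in the question.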
Try concat(), groupby() and sum() (DataFrame.append() was removed in pandas 2.0, so pd.concat() is used here instead):

out=(pd.concat([main_df,df2])
        .groupby('Type',as_index=False,sort=False)
        .sum())

Output of out (Index is a regular column here, so it gets summed as well):

    Type    Index   Count
0   Album   0       21
1   Book    1       4
2   Person  3       7
3   Film    2       4
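
If Index really is a regular column, as the output above suggests, summing every column also adds up the Index values, which is probably not wanted. A rough sketch of the difference, assuming the sample data from the question with Index stored as a column (the names combined, summed_all and counts_only are only illustrative):

```python
import pandas as pd

# Sample data from the question, with Index stored as an ordinary column (assumption).
main_df = pd.DataFrame(
    {"Index": [0, 1, 2], "Type": ["Album", "Book", "Person"], "Count": [12, 4, 3]}
)
df2 = pd.DataFrame(
    {"Index": [0, 1, 2], "Type": ["Album", "Person", "Film"], "Count": [9, 4, 4]}
)

combined = pd.concat([main_df, df2], ignore_index=True)

# Summing all columns also sums Index (e.g. Person: 2 + 1).
summed_all = combined.groupby("Type", as_index=False, sort=False).sum()

# Restricting the aggregation to Count leaves Index out of the result.
counts_only = combined.groupby("Type", as_index=False, sort=False)["Count"].sum()

print(summed_all)
print(counts_only)
```

The second form returns only Type and Count, with a fresh 0..n-1 row index.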