1

I have a DataFrame that can be grouped basically by two columns: Level and Sub_level.

The data looks like this:

    Level_1    Sub_level   Value

0    Group A   A1          100
1    Group A   A2          200
2    Group A   A1          150
3    Group B   B1          100
4    Group B   B2          200
5    Group A   A1          200
6    Group A   A1          300
7    Group A   A1          400
8    Group B   B2          450
...

I would like to get the frequency/count in each Sub_level compared to each comparable Level_1, i.e

Level_1   Sub_level   Pct_of_total

Group A   A1          5 / 6  (as there are 6 Group A instances in 'Level_1', and 5 A1:s in 'Sub_level')
          A2          1 / 6 
Group B   B1          1 / 3  (as there are 3 Group B instances in 'Level_1', and 1 B1:s in 'Sub_level')
          B2          2 / 3

Of course the fractions in the new column Pct_of_total should be expressed in percentage.

Any clues?

Thanks,

/N

1

1 Answer 1

1

I think you need groupby + size for first df and then groupby by first level (Level_1) and transform sum. Last divide by div:

df1 = df.groupby(['Level_1','Sub_level'])['Value'].size()
print (df1)
Level_1  Sub_level
Group A  A1           5
         A2           1
Group B  B1           1
         B2           2
Name: Value, dtype: int64

df2 = df1.groupby(level=0).transform('sum')
print (df2)
Level_1  Sub_level
Group A  A1           6
         A2           6
Group B  B1           3
         B2           3
Name: Value, dtype: int64

df3 = df1.div(df2).reset_index(name='Pct_of_total')
print (df3)
   Level_1 Sub_level  Pct_of_total
0  Group A        A1      0.833333
1  Group A        A2      0.166667
2  Group B        B1      0.333333
3  Group B        B2      0.666667
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.