I have a dataframe that is created from a pivot table, and looks similar to this:

            import pandas as pd
            d = {
                    ('company1', 'False Negative'): {'April- 2012': 112.0, 'April- 2013': 370.0, 'April- 2014': 499.0,
                    'August- 2012': 431.0, 'August- 2013': 496.0, 'August- 2014': 221.0},
                    ('company1', 'False Positive'): {'April- 2012': 0.0, 'April- 2013': 544.0, 
                    'April- 2014': 50.0, 'August- 2012': 0.0, 'August- 2013': 0.0, 'August- 2014': 426.0}, 
                    ('company1', 'True Positive'): {'April- 2012': 0.0, 'April- 2013': 140.0, 
                    'April- 2014': 24.0, 'August- 2012': 0.0, 'August- 2013': 0.0,'August- 2014': 77.0},
                    ('company2', 'False Negative'): {'April- 2012': 112.0, 'April- 2013': 370.0, 
                    'April- 2014': 499.0, 'August- 2012': 431.0, 'August- 2013': 496.0, 'August- 2014': 221.0},
                    ('company2', 'False Positive'): {'April- 2012': 0.0, 'April- 2013': 544.0, 
                    'April- 2014': 50.0, 'August- 2012': 0.0, 'August- 2013': 0.0, 'August- 2014': 426.0},
                    ('company2', 'True Positive'): {'April- 2012': 0.0, 'April- 2013': 140.0, 'April- 2014': 24.0,
                    'August- 2012': 0.0, 'August- 2013': 0.0,'August- 2014': 77.0}
                }
            df = pd.DataFrame(d)

                            company1        company2
                            FN   FP   TP    FN   FP   TP
            April- 2012     112  0    0     112  0    0
            April- 2013     370  544  140   370  544  140
            April- 2014     499  50   24    499  50   24
            August- 2012    431  0    0     431  0    0
            August- 2013    496  0    0     496  0    0
            August- 2014    221  426  77    221  426  77

I'm looking to iterate over the upper level of the MultiIndex columns to create two sum columns for each company:

FSUM = FN + FP

SUM = FN + FP + TP

                            company1               company2
                            FN  FP  TP  FSUM  SUM  FN   FP  TP   FSUM  SUM
            April- 2012     112 0   0   112  112   112  0   0    112   112
            April- 2013     370 544 140 914  1054  370  544 140  914   1054
            April- 2014     499 50  24  549  573   499  50  24   549   573
            August- 2012    431 0   0   431  431   431  0   0    431   431
            August- 2013    496 0   0   496  496   496  0   0    496   496
            August- 2014    221 426 77  647  724   221  426 77   647   724

I don't know the company names beforehand, so the solution needs to loop over whatever names are present.

2 Answers

This gets a bit easier if you use .unstack() and .stack() to regroup the data, so that each (company, month) pair becomes a row and the three counts become plain columns:

In [96]: df = df.unstack().unstack(1)

In [97]: df
Out[97]:
                       False Negative  False Positive  True Positive
company1 April- 2012            112.0             0.0            0.0
         April- 2013            370.0           544.0          140.0
         April- 2014            499.0            50.0           24.0
         August- 2012           431.0             0.0            0.0
         August- 2013           496.0             0.0            0.0
         August- 2014           221.0           426.0           77.0
company2 April- 2012            112.0             0.0            0.0
         April- 2013            370.0           544.0          140.0
         April- 2014            499.0            50.0           24.0
         August- 2012           431.0             0.0            0.0
         August- 2013           496.0             0.0            0.0
         August- 2014           221.0           426.0           77.0

In [98]: df['SUM'] = df.sum(axis=1)

In [99]: df['FSUM'] = df['False Negative'] + df['False Positive']

In [100]: df = df.stack().unstack([0,2])

In [101]: df
Out[101]:
                   company1                                              \
             False Negative False Positive True Positive     SUM   FSUM
April- 2012           112.0            0.0           0.0   112.0  112.0
April- 2013           370.0          544.0         140.0  1054.0  914.0
April- 2014           499.0           50.0          24.0   573.0  549.0
August- 2012          431.0            0.0           0.0   431.0  431.0
August- 2013          496.0            0.0           0.0   496.0  496.0
August- 2014          221.0          426.0          77.0   724.0  647.0

                   company2
             False Negative False Positive True Positive     SUM   FSUM
April- 2012           112.0            0.0           0.0   112.0  112.0
April- 2013           370.0          544.0         140.0  1054.0  914.0
April- 2014           499.0           50.0          24.0   573.0  549.0
August- 2012          431.0            0.0           0.0   431.0  431.0
August- 2013          496.0            0.0           0.0   496.0  496.0
August- 2014          221.0          426.0          77.0   724.0  647.0
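
For comparison, the explicit loop that the question asks about could look roughly like the sketch below. It reads the company names from the top column level, so they do not need to be known in advance. The helper name add_sums is hypothetical, and the sample frame holds only two months of the question's data to keep the example short.

```python
import pandas as pd


def add_sums(df):
    """Return a copy of df with FSUM and SUM added under each company.

    add_sums is a hypothetical helper name, not part of pandas.
    """
    out = df.copy()
    # Take the company names from the top column level instead of hard-coding them.
    for company in df.columns.get_level_values(0).unique():
        fn = df[(company, 'False Negative')]
        fp = df[(company, 'False Positive')]
        tp = df[(company, 'True Positive')]
        out[(company, 'FSUM')] = fn + fp
        out[(company, 'SUM')] = fn + fp + tp
    # New columns are appended at the far right; sorting regroups them by company.
    return out.sort_index(axis=1)


# Two months of the question's data, enough to check the arithmetic.
sample = pd.DataFrame(
    {
        ('company1', 'False Negative'): [112.0, 370.0],
        ('company1', 'False Positive'): [0.0, 544.0],
        ('company1', 'True Positive'): [0.0, 140.0],
        ('company2', 'False Negative'): [112.0, 370.0],
        ('company2', 'False Positive'): [0.0, 544.0],
        ('company2', 'True Positive'): [0.0, 140.0],
    },
    index=['April- 2012', 'April- 2013'],
)

result = add_sums(sample)
print(result['company1'])
```

The loop is more verbose than the reshaping approach, but it keeps the original layout throughout, which may be easier to follow when only a couple of derived columns are needed.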

Another way is to sum over level 0 of the columns, attach the new label ('FSUM' or 'SUM') as a second column level, join the pieces with pd.concat, and finally call sort_index:

pd.concat([df,
           df.loc(axis=1)[:,['False Negative','False Positive']].sum(level=0, axis=1).assign(indx2 = 'FSUM').set_index('indx2', append=True).unstack(),
           df.sum(level=0, axis=1).assign(indx2='SUM').set_index('indx2', append=True).unstack()],
          axis=1).sort_index(axis=1)

Output:

             company1                                                      \
                 FSUM False Negative False Positive     SUM True Positive   
April- 2012     112.0          112.0            0.0   112.0           0.0   
April- 2013     914.0          370.0          544.0  1054.0         140.0   
April- 2014     549.0          499.0           50.0   573.0          24.0   
August- 2012    431.0          431.0            0.0   431.0           0.0   
August- 2013    496.0          496.0            0.0   496.0           0.0   
August- 2014    647.0          221.0          426.0   724.0          77.0   

             company2                                                      
                 FSUM False Negative False Positive     SUM True Positive  
April- 2012     112.0          112.0            0.0   112.0           0.0  
April- 2013     914.0          370.0          544.0  1054.0         140.0  
April- 2014     549.0          499.0           50.0   573.0          24.0  
August- 2012    431.0          431.0            0.0   431.0           0.0  
August- 2013    496.0          496.0            0.0   496.0           0.0  
August- 2014    647.0          221.0          426.0   724.0          77.0  
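
The level argument of DataFrame.sum was deprecated in pandas 1.3 and removed in 2.0, so the snippet above needs an older pandas. A rough equivalent that should also run on current versions is sketched below: it transposes, groups the rows by company, sums, and transposes back. The helper name add_level is hypothetical, and the sample frame holds only two months of the question's data.

```python
import pandas as pd

# Two months of the question's data.
df = pd.DataFrame(
    {
        ('company1', 'False Negative'): [112.0, 370.0],
        ('company1', 'False Positive'): [0.0, 544.0],
        ('company1', 'True Positive'): [0.0, 140.0],
        ('company2', 'False Negative'): [112.0, 370.0],
        ('company2', 'False Positive'): [0.0, 544.0],
        ('company2', 'True Positive'): [0.0, 140.0],
    },
    index=['April- 2012', 'April- 2013'],
)


def sum_per_company(frame):
    # Transpose so companies sit in the row index, group on that level, sum,
    # then transpose back: one column of totals per company.
    return frame.T.groupby(level=0).sum().T


def add_level(frame, name):
    # Hypothetical helper: put `name` under each company as a second column level.
    out = frame.copy()
    out.columns = pd.MultiIndex.from_product([out.columns, [name]])
    return out


# Boolean mask over the second column level selects the two "False ..." columns.
is_false = df.columns.get_level_values(1).isin(['False Negative', 'False Positive'])

fsum = add_level(sum_per_company(df.loc[:, is_false]), 'FSUM')
total = add_level(sum_per_company(df), 'SUM')

result = pd.concat([df, fsum, total], axis=1).sort_index(axis=1)
print(result['company1'])
```

As in the output above, sort_index orders the second level alphabetically, so FSUM comes first and SUM lands between False Positive and True Positive.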
