Consider this dataframe:
  STUDENT   T_1   T_2   T_3   T_4
0       A  PASS  FAIL  PASS  FAIL
1       B  PASS  FAIL  FAIL  FAIL
2       C  FAIL  FAIL  PASS  PASS
3       D  PASS  FAIL  PASS  PASS
The columns T_1 through T_4 represent tests. T_1 and T_3 are tests of type 'X', and T_2 and T_4 are tests of type 'Y'. The values in these columns are categorical (PASS/FAIL). I want the percentage distribution of statuses per test type (X and Y), pooled across all columns of that type. So I want this:
  STATUS           X           Y
0   PASS  0.75 (6/8)  0.25 (2/8)
1   FAIL  0.25 (2/8)  0.75 (6/8)
I know I can use s.value_counts() / s.count() on a Series to get the status distribution for a single column, but how do I aggregate over multiple columns (i.e., combine T_1/T_3 and T_2/T_4, since I know which test type each column belongs to)?
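For reference, here is a minimal sketch of one possible way to pool the columns, assuming the data above is rebuilt with plain string values. The names `test_types`, `long`, and `dist` are made up for illustration, and the melt-then-crosstab approach is just one option, not necessarily the best one.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "STUDENT": ["A", "B", "C", "D"],
    "T_1": ["PASS", "PASS", "FAIL", "PASS"],
    "T_2": ["FAIL", "FAIL", "FAIL", "FAIL"],
    "T_3": ["PASS", "FAIL", "PASS", "PASS"],
    "T_4": ["FAIL", "FAIL", "PASS", "PASS"],
})

# Hypothetical mapping from test column to test type (known in advance).
test_types = {"T_1": "X", "T_3": "X", "T_2": "Y", "T_4": "Y"}

# Reshape to long form: one row per (student, test) pair.
long = df.melt(id_vars="STUDENT", var_name="TEST", value_name="STATUS")
long["TYPE"] = long["TEST"].map(test_types)

# Count statuses per test type and normalize within each type (column).
dist = pd.crosstab(long["STATUS"], long["TYPE"], normalize="columns")

# Reorder rows to match the desired output (PASS first, then FAIL).
dist = dist.reindex(["PASS", "FAIL"])
print(dist)
```

With this data, `dist.loc["PASS", "X"]` should come out as 0.75 and `dist.loc["PASS", "Y"]` as 0.25, matching the table above; calling `dist.reset_index()` would turn STATUS back into a regular column.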