Iterate through rows and columns, python

Question

Could you please help me work out this calculation?

I have the following table:

I need to calculate the expected frequency of each cell as (row total * column total) / grand total.

The expected result:

I assume that I need to iterate through rows and columns. I have tried to do it with:

for i, row in df_dropped.iterrows():
    for j, column in row.iteritems():
        data[row][column] = df_dropped.iloc[i, 3] * df_dropped.iloc[2, j]

The error appears: Location based indexing can only have [integer, integer slice (START point is INCLUDED, END point is EXCLUDED), listlike of integers, boolean array] types

What am I doing wrong?

jezrael · Accepted Answer · 2018-11-29 13:15:00Z

2

Use numpy.outer to compute the outer product of the last column (the row totals) and the last row (the column totals), then divide the resulting NumPy array by the grand total, a scalar selected with loc:

import numpy as np
import pandas as pd

t = df.loc['col_sum', 'row_sum']
arr = np.outer(df['row_sum'], df.loc['col_sum']) / t
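
For a self-contained run, the sketch below rebuilds the input under the assumption that `df` holds the observed counts plus a `row_sum` column and a `col_sum` row. The counts are taken from the output printed later in this answer, and the way `df` is constructed here is illustrative rather than the asker's actual code.

```python
import numpy as np
import pandas as pd

# Observed counts (assumed; taken from the final output of this answer).
df = pd.DataFrame({'satisfied': [30, 140],
                   'neutral': [17, 127],
                   'dissatisfied': [8, 58]})

# Add the margins the answer relies on: a 'row_sum' column and a 'col_sum' row.
df['row_sum'] = df.sum(axis=1)
df.loc['col_sum'] = df.sum()

# The grand total sits in the bottom-right corner.
t = df.loc['col_sum', 'row_sum']

# Outer product of row totals and column totals, divided by the grand total.
arr = np.outer(df['row_sum'], df.loc['col_sum']) / t
print(arr.shape)  # one extra row and column for the margins
```

The last row and last column of `arr` correspond to the margins, which is why the next step slices them off.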

Then create a DataFrame with the constructor, slicing off the last row and last column (the totals):

df1 = pd.DataFrame(arr[:-1, :-1], 
                   columns=df.columns[:-1],
                   index=df.index[:-1]).add_prefix('exp_')
print (df1)
   exp_satisfied  exp_neutral  exp_dissatisfied
0      24.605263    20.842105          9.552632
1     145.394737   123.157895         56.447368

Build the new column names, pairing each original column with its exp_ counterpart:

cols = [item for x in df.columns[:-1] for item in (x, 'exp_' + x)]
print (cols)
['satisfied', 'exp_satisfied', 'neutral', 'exp_neutral', 'dissatisfied', 'exp_dissatisfied']

Join the two together with concat, then reindex to get the expected column order:

df = pd.concat([df.iloc[:-1, :-1], df1], axis=1).reindex(columns=cols)
print (df)
   satisfied  exp_satisfied  neutral  exp_neutral  dissatisfied  \
0         30      24.605263       17    20.842105             8   
1        140     145.394737      127   123.157895            58   

   exp_dissatisfied  
0          9.552632  
1         56.447368
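
For comparison with the loop attempted in the question, here is a hedged sketch of the same calculation done cell by cell. The error in the question most likely comes from passing a non-integer label (such as a column name obtained while iterating over a row) to `iloc`, so this version loops over integer positions instead. The name `observed` and its counts are assumptions, taken from the output above.

```python
import pandas as pd

# Observed counts without margins (assumed; taken from the output above).
observed = pd.DataFrame({'satisfied': [30, 140],
                         'neutral': [17, 127],
                         'dissatisfied': [8, 58]})

row_totals = observed.sum(axis=1)
col_totals = observed.sum(axis=0)
grand_total = observed.to_numpy().sum()

# Float frame of the same shape to receive the expected frequencies.
expected = pd.DataFrame(0.0, index=observed.index, columns=observed.columns)

# Loop over integer positions, which is what iloc accepts.
for i in range(observed.shape[0]):
    for j in range(observed.shape[1]):
        expected.iloc[i, j] = row_totals.iloc[i] * col_totals.iloc[j] / grand_total

print(expected.round(6))
```

The np.outer approach gives the same numbers without an explicit loop and is usually the better choice for larger tables.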

5 Comments

eponkratova Over a year ago

Thank you, jezrael, it is beautifully simple.

eponkratova Over a year ago

Just one more question. Here is the final result:

def expected_frequency(data):
    """The function calculates expected frequency"""
    data['row_sum'] = data.sum(axis = 1)
    data.loc['col_sum'] = data.sum()
    t = data.loc['col_sum', 'row_sum']
    arr = np.outer(data['row_sum'], data.loc['col_sum']) / float(t)
    data2 = pd.DataFrame(arr[:-1, :-1], columns = data.columns[:-1]).add_prefix('exp_')
    data = pd.concat([data.iloc[:-1, :-1], data2], axis = 1)
    return data

expected_frequency(df_dropped)

My question: how do I store the table returned by the function as a permanent table?

jezrael Over a year ago

@eponkratova - yes?

jezrael Over a year ago

Not sure I understand. Do you mean assigning the result? df1 = expected_frequency(df) applies the function to the DataFrame called df and assigns the output to df1.
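
A minimal sketch of that assignment, assuming a variant of the function from the comment above that works on a copy so the caller's DataFrame is left unchanged. The copy, the explicit index, and the sample counts are assumptions added for illustration and are not part of the original comment.

```python
import numpy as np
import pandas as pd

def expected_frequency(data):
    """Return observed counts alongside their expected frequencies."""
    data = data.copy()  # assumption: avoid modifying the caller's frame
    data['row_sum'] = data.sum(axis=1)
    data.loc['col_sum'] = data.sum()
    t = data.loc['col_sum', 'row_sum']
    arr = np.outer(data['row_sum'], data.loc['col_sum']) / float(t)
    exp = pd.DataFrame(arr[:-1, :-1],
                       columns=data.columns[:-1],
                       index=data.index[:-1]).add_prefix('exp_')
    return pd.concat([data.iloc[:-1, :-1], exp], axis=1)

# Sample counts (assumed; taken from the accepted answer's output).
df = pd.DataFrame({'satisfied': [30, 140],
                   'neutral': [17, 127],
                   'dissatisfied': [8, 58]})

# Assigning the return value keeps the result around for later use.
df1 = expected_frequency(df)
print(df1.columns.tolist())
```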

eponkratova Over a year ago

Yeah... that was easy. I thought of creating an empty df and then applying a function to it, if that makes sense. It is good to be new: you can come up with completely ridiculous solutions.

Sander van den Oord · Accepted Answer · 2018-11-29 15:42:56Z

1

Jezrael gave a great answer that calculates the expected frequencies using numpy and pandas. You can also use the Python statistical library statsmodels to calculate these kinds of statistics.

For example, to calculate a table of expected frequencies, you could do:

import statsmodels.api as sm
expected_values = sm.stats.Table(df).fittedvalues

More info on: statsmodels contingency tables

1 Comment

eponkratova Over a year ago

Yeah. I also feel that crosstab could work. Thank you for a one-line solution!
