Remove specific values from dataframe

Question

I have the following correlation matrix:

symbol    abc    xyz    ghj
symbol    
abc       1      0.1    -0.2
xyz       0.1    1       0.3
ghj      -0.2    0.3     1

I need the standard deviation of the whole dataframe, but it has to exclude the perfect self-correlation values, i.e. it must not take into account abc:abc, xyz:xyz, or ghj:ghj.

I am able to get the standard deviation for the entire dataframe using:

df.stack().std()

But this takes every single value into account, which is not correct. The standard deviation should not include row/column combinations where an item is correlated with itself (i.e. the 1s on the diagonal). Is there a way to remove abc:abc, xyz:xyz, and ghj:ghj, and then calculate the standard deviation?

Perhaps converting it to a dict or something?

daniel451 · Accepted Answer · 2015-11-02 06:48:43Z


If you use NumPy, you can combine np.extract and np.std:

In [61]: import numpy as np

In [62]: a = np.array([[ 1. ,  0.1, -0.2],
                       [ 0.1,  1. ,  0.3],
                       [-0.2,  0.3,  1. ]])

In [63]: a
Out[63]: 
array([[ 1. ,  0.1, -0.2],
       [ 0.1,  1. ,  0.3],
       [-0.2,  0.3,  1. ]])

In [64]: calc_std = np.std(np.extract(a != 1, a))

In [65]: calc_std
Out[65]: 0.20548046676563256
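One thing to watch: np.std defaults to the population standard deviation (ddof=0), whereas the pandas std() used in the question defaults to the sample standard deviation (ddof=1), so the two will not give the same number. A minimal sketch of the difference, assuming you want to compare both conventions (the names off_diag, pop_std and sample_std are only illustrative):

```python
import numpy as np

a = np.array([[ 1. ,  0.1, -0.2],
              [ 0.1,  1. ,  0.3],
              [-0.2,  0.3,  1. ]])

# Keep every element that is not equal to 1 (same extraction as above).
off_diag = np.extract(a != 1, a)

pop_std = np.std(off_diag)             # ddof=0, the NumPy default
sample_std = np.std(off_diag, ddof=1)  # ddof=1, the pandas default
```

Which one is appropriate depends on how you intend to use the result; pass ddof explicitly if you need it to agree with pandas.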

np.extract(a != 1, a) returns an array containing every element of a that is not equal to 1.

The returned array looks like this:

In [66]: np.extract(a != 1, a)
Out[66]: array([ 0.1, -0.2,  0.1,  0.3, -0.2,  0.3])

After this extraction you can easily calculate the standard deviation with np.std().
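Note that comparing against 1 selects by value, not by position: it would also drop an off-diagonal pair that happens to be perfectly correlated, and it could miss a diagonal entry that is not exactly 1.0 because of floating-point rounding. If that matters for your data, one possible alternative is to mask the diagonal by position instead. A minimal sketch, assuming the correlation matrix is a square pandas DataFrame (the names symbols, off_diagonal and result are only illustrative):

```python
import numpy as np
import pandas as pd

symbols = ["abc", "xyz", "ghj"]
df = pd.DataFrame([[ 1.0,  0.1, -0.2],
                   [ 0.1,  1.0,  0.3],
                   [-0.2,  0.3,  1.0]],
                  index=symbols, columns=symbols)

# Boolean mask that is True everywhere except on the diagonal.
off_diagonal = ~np.eye(len(df), dtype=bool)

# Select the off-diagonal values by position, regardless of their value.
off_diag_values = df.to_numpy()[off_diagonal]

# Series.std() uses ddof=1, the same default as df.stack().std() in the question.
result = pd.Series(off_diag_values).std()
```

Because a correlation matrix is symmetric, each pair still appears twice in off_diag_values, just as it does in the np.extract version.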

edited Nov 2, 2015 at 6:48

answered Nov 2, 2015 at 6:35
