I have a 2x1 pandas dataframe where the 2 cells contain numpy arrays:
>>> import numpy as np
>>> import pandas as pd
>>> a0 = np.array([[1, 2], [2, 2]])
>>> a1 = np.array([[3, 2], [1, 1]])
>>> df = pd.DataFrame([[a0], [a1]])
I can compute the element-wise mean of the two arrays as follows:
>>> np.mean(df[0])
array([[ 2. , 2. ],
       [ 1.5, 1.5]])
Now I want to consider the case where at least one of the arrays contains NaNs, e.g.
>>> a0 = np.array([[1, 2], [2, np.nan]])
>>> a1 = np.array([[3, 2], [1, 1]])
>>> df = pd.DataFrame([[a0], [a1]])
The mean method used above gives
>>> np.mean(df[0])
array([[ 2. , 2. ],
       [ 1.5, nan]])
as expected. However, I want the NaNs to be ignored. I expected the following to work:
>>> np.nanmean(df[0])
array([[ -4., -4.],
       [ -3., nan]])
but it clearly does not.
So, my question: how can I compute the element-wise mean of numpy arrays stored in the cells of a pandas dataframe, ignoring any NaNs?
To be clear, the result I expect from np.nanmean(df[0]) is np.array([[2., 2.], [1.5, 1.]]).
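A minimal sketch of the computation I am after, assuming every cell in the column holds an array of the same shape (the names stacked and result are only illustrative). It stacks the cell arrays into a single 3-D array and averages along the new leading axis. This is one possible approach, not necessarily the best one:

```python
import numpy as np
import pandas as pd

a0 = np.array([[1, 2], [2, np.nan]])
a1 = np.array([[3, 2], [1, 1]])
df = pd.DataFrame([[a0], [a1]])

# Assumption: all arrays in column 0 share the same shape, so they can be
# stacked into one array of shape (n_rows, 2, 2).
stacked = np.stack(df[0].tolist())

# Average across the rows of the dataframe (axis 0), skipping NaNs.
result = np.nanmean(stacked, axis=0)
print(result)
```

Here result has the values [[2., 2.], [1.5, 1.]]: the NaN in a0 is skipped, so the bottom-right entry is simply the value from a1.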