I want to apply some aggregation on a numpy array.

import numpy as np

x = np.array([ ([[ 1.87918162,  1.12919822, -1.63856741],\
       [ 0.40560484,  0.96425656,  0.7847214 ],\
       [-0.83472207,  0.88918246, -0.83298299],\
       [-1.29211004,  0.71730071, -2.09109609],\
       [-1.65800248,  0.49154087,  0.14932455]]),\
 ([[ 1.87918162,  1.12919822, -1.63856741],\
       [-0.21786626, -0.23561859, -0.19750753],\
       [-0.83472207,  0.88918246, -0.83298299],\
       [-0.34967282,  0.51348973, -0.30882943],\
       [ 0.35654636, -0.64453956, -1.3066075 ],\
       [ 0.187328  , -1.32496725, -0.05783984]])])
print type(x)
print x[0]
print np.mean(x[0], axis=0)
print np.mean(x, axis=0)

>>> <type 'numpy.ndarray'>
>>> [[1.87918162, 1.12919822, -1.63856741], [0.40560484, 0.96425656, 0.7847214], [-0.83472207, 0.88918246, -0.83298299], [-1.29211004, 0.71730071, -2.09109609], [-1.65800248, 0.49154087, 0.14932455]]
>>> [-0.30000963  0.83829576 -0.72572011]

The last line raises this error:

TypeError: unsupported operand type(s) for /: 'list' and 'long'

I don't understand why it works for one element but not for the whole array. I suspect that the irregular shape of the array causes the problem.
But how can I deal with that without iterating over the array with a for loop and concatenating all the results into one array?

EDIT:

The expected result is the vertical (column-wise) sum over the rows of each element, so the result should be an array of shape (2, 3).

  • The structure is indeed a bit weird. Looks more like a mixed index 3 dimensional structure. What exactly do you want aggregated? Or put another way, what do you expect the results to be? Commented Mar 24, 2016 at 16:07

1 Answer

Your input is a NumPy array of dtype object holding ragged data (its elements have different lengths), so you can't use something like np.mean(x, axis=0). Instead, you can stack all the rows vertically and then use np.add.reduceat to sum each consecutive block of rows along axis=0, where each block starts at the cumulative length of the elements of x that precede it. Dividing each block sum by the block length then gives the mean. This is an almost vectorized approach (almost, because the lengths of the elements of x are collected with a list comprehension, but that isn't computationally intensive), like so -

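# Number of rows in each element of x
lens = np.array([len(i) for i in x])
# Row index in the stacked array where each element starts
cut_idx = np.append(0,lens[:-1]).cumsum()
# Sum each block of rows, then divide by the block length to get the mean
out = np.add.reduceat(np.vstack(x),cut_idx,axis=0)/lens[:,None]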
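To see what each of these steps produces, here is a minimal sketch on a toy ragged input. The values and the names groups, stacked, sums and means are made up for illustration and are not taken from the question.

```python
import numpy as np

# Toy ragged input (made-up values): three groups with 2, 3 and 1 rows.
groups = [
    [[1, 2], [3, 4]],
    [[10, 20], [30, 40], [50, 60]],
    [[7, 8]],
]

# Number of rows in each group.
lens = np.array([len(g) for g in groups])        # [2 3 1]

# Row offset at which each group starts once all rows are stacked.
cut_idx = np.append(0, lens[:-1]).cumsum()       # [0 2 5]

# All rows stacked into one regular 2-D array of shape (6, 2).
stacked = np.vstack(groups)

# reduceat sums rows [0:2], [2:5] and [5:] separately.
sums = np.add.reduceat(stacked, cut_idx, axis=0)  # [[4 6] [90 120] [7 8]]

# Divide each block sum by its row count to get per-group means.
means = sums / lens[:, None]                      # [[2 3] [30 40] [7 8]] as floats

print(cut_idx)
print(sums)
print(means)
```

The last segment has no explicit end index, so reduceat runs it to the end of the stacked array, which is why only the start offsets are needed.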

Here's a sample run on an extended version of the sample input from the question -

In [89]: x = np.array([ ([[ 1.87918162,  1.12919822, -1.63856741],\
    ...:        [ 0.40560484,  0.96425656,  0.7847214 ],\
    ...:        [-0.83472207,  0.88918246, -0.83298299],\
    ...:        [-1.29211004,  0.71730071, -2.09109609],\
    ...:        [-1.65800248,  0.49154087,  0.14932455]]),\
    ...:  ([[ 1.87918162,  1.12919822, -1.63856741],\
    ...:        [-0.21786626, -0.23561859, -0.19750753],\
    ...:        [-0.83472207,  0.88918246, -0.83298299],\
    ...:        [-0.34967282,  0.51348973, -0.30882943],\
    ...:        [ 0.35654636, -0.64453956, -1.3066075 ],\
    ...:        [ 0.187328  , -1.32496725, -0.05783984]]),\
    ...: ([[ 1.87918162,  1.12919822, -1.63856741],\
    ...:        [-1.29211004,  0.71730071, -2.09109609],\
    ...:        [-1.65800248,  0.49154087,  0.14932455]])       
    ...:        ])

In [90]: np.mean(x[0], axis=0)
Out[90]: array([-0.30000963,  0.83829576, -0.72572011])

In [91]: np.mean(x[1], axis=0)
Out[91]: array([ 0.17013247,  0.0544575 , -0.72372245])

In [92]: np.mean(x[2], axis=0)
Out[92]: array([-0.35697697,  0.7793466 , -1.19344632])

In [93]: out
Out[93]: 
array([[-0.30000963,  0.83829576, -0.72572011],
       [ 0.17013247,  0.0544575 , -0.72372245],
       [-0.35697697,  0.7793466 , -1.19344632]])
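For readers on Python 3, here is a hedged sketch of the same approach wrapped in a helper. The function name ragged_mean is hypothetical, and the explicit object-array construction is an assumption about the setup: newer NumPy releases refuse to build a ragged array implicitly, so the container is created with dtype=object and filled element by element. The data is the two-element input from the question.

```python
import numpy as np


def ragged_mean(blocks):
    """Column-wise mean of each (n_i, k) block, returned as shape (len(blocks), k).

    Hypothetical helper: same reduceat approach as above, just packaged.
    """
    lens = np.array([len(b) for b in blocks])
    cut_idx = np.append(0, lens[:-1]).cumsum()
    stacked = np.vstack(list(blocks))  # regular 2-D array of all rows
    return np.add.reduceat(stacked, cut_idx, axis=0) / lens[:, None]


a = [[ 1.87918162,  1.12919822, -1.63856741],
     [ 0.40560484,  0.96425656,  0.7847214 ],
     [-0.83472207,  0.88918246, -0.83298299],
     [-1.29211004,  0.71730071, -2.09109609],
     [-1.65800248,  0.49154087,  0.14932455]]
b = [[ 1.87918162,  1.12919822, -1.63856741],
     [-0.21786626, -0.23561859, -0.19750753],
     [-0.83472207,  0.88918246, -0.83298299],
     [-0.34967282,  0.51348973, -0.30882943],
     [ 0.35654636, -0.64453956, -1.3066075 ],
     [ 0.187328  , -1.32496725, -0.05783984]]

# Build the ragged container explicitly as a 1-D object array of lists.
x = np.empty(2, dtype=object)
x[0] = a
x[1] = b

out = ragged_mean(x)
print(x.dtype, x.shape)  # object (2,)
print(out.shape)         # (2, 3)
print(out)

# Sanity check against the straightforward per-element loop.
assert np.allclose(out, [np.mean(blk, axis=0) for blk in x])
```

The final assertion compares the result with a plain loop over np.mean, which is a convenient way to confirm the offsets are right before relying on the vectorized version.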