numpy - apply aggregation on each row of array

Question

I want to apply some aggregation on a numpy array.

x = np.array([ ([[ 1.87918162,  1.12919822, -1.63856741],\
       [ 0.40560484,  0.96425656,  0.7847214 ],\
       [-0.83472207,  0.88918246, -0.83298299],\
       [-1.29211004,  0.71730071, -2.09109609],\
       [-1.65800248,  0.49154087,  0.14932455]]),\
 ([[ 1.87918162,  1.12919822, -1.63856741],\
       [-0.21786626, -0.23561859, -0.19750753],\
       [-0.83472207,  0.88918246, -0.83298299],\
       [-0.34967282,  0.51348973, -0.30882943],\
       [ 0.35654636, -0.64453956, -1.3066075 ],\
       [ 0.187328  , -1.32496725, -0.05783984]])])
print type(x)
print x[0]
print np.mean(x[0], axis=0)
print np.mean(x, axis=0)

>>> <type 'numpy.ndarray'>
>>> [[1.87918162, 1.12919822, -1.63856741], [0.40560484, 0.96425656, 0.7847214], [-0.83472207, 0.88918246, -0.83298299], [-1.29211004, 0.71730071, -2.09109609], [-1.65800248, 0.49154087, 0.14932455]]
>>> [-0.30000963  0.83829576 -0.72572011]

The last call raises this error:

TypeError: unsupported operand type(s) for /: 'list' and 'long'

I don't understand why it works for one row but not for the whole array. I suspect that the irregular shape of the array causes the problem.
But how can I deal with that without iterating over the array with a for loop and concatenating all the results into one array?

EDIT :

The expected result is the aggregate (here, the mean) of each element of x, taken vertically over its rows. So the result should be an array of shape (2, 3).

The structure is indeed a bit weird. Looks more like a mixed index 3 dimensional structure. What exactly do you want aggregated? Or put another way, what do you expect the results to be?
– Spinor8, Commented Mar 24, 2016 at 16:07

Divakar · Accepted Answer · 2016-03-24 16:47:55Z

Your input is a NumPy array of dtype object holding ragged data (its elements have different lengths), so you can't use something like np.mean(x, axis=0) on it directly. Instead, you can stack the rows of all elements vertically and then use np.add.reduceat to sum along axis=0 over each element's block of rows, where each block starts at the cumulative sum of the lengths before it. Dividing those sums by the lengths gives the means. This is an almost vectorized approach: the only loop is the list comprehension that collects the length of each element of x, and that isn't computationally intensive. Like so:

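# Number of rows in each element of x
lens = np.array([len(i) for i in x])
# Row index in the stacked array at which each element starts
cut_idx = np.append(0,lens[:-1]).cumsum()
# Sum each block of rows, then divide by the block length to get the mean
out = np.add.reduceat(np.vstack(x),cut_idx,axis=0)/lens[:,None]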
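
To make the role of cut_idx more concrete, here is a minimal sketch on a tiny made-up input. The `groups` data and the intermediate names `stacked`, `sums` and `means` are illustrative assumptions, not part of the question. It shows that np.add.reduceat sums the rows from each start index up to the next one, and from the last start index to the end.

```python
import numpy as np

# Hypothetical ragged input: one group of 2 rows and one group of 3 rows,
# each row having 2 columns.
groups = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]],
]

lens = np.array([len(g) for g in groups])      # rows per group: [2, 3]
cut_idx = np.append(0, lens[:-1]).cumsum()     # block start rows: [0, 2]
stacked = np.vstack(groups)                    # all rows together, shape (5, 2)

# Rows 0..1 are summed into the first output row, rows 2..4 into the second.
sums = np.add.reduceat(stacked, cut_idx, axis=0)

# Divide each block sum by its row count to obtain the per-group mean.
means = sums / lens[:, None]
print(means)
```

The trailing `[:, None]` turns lens into a column so that each row of sums is divided by its own group length.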

Here's a sample run on an extended version of the sample input listed in the question:

In [89]: x = np.array([ ([[ 1.87918162,  1.12919822, -1.63856741],\
    ...:        [ 0.40560484,  0.96425656,  0.7847214 ],\
    ...:        [-0.83472207,  0.88918246, -0.83298299],\
    ...:        [-1.29211004,  0.71730071, -2.09109609],\
    ...:        [-1.65800248,  0.49154087,  0.14932455]]),\
    ...:  ([[ 1.87918162,  1.12919822, -1.63856741],\
    ...:        [-0.21786626, -0.23561859, -0.19750753],\
    ...:        [-0.83472207,  0.88918246, -0.83298299],\
    ...:        [-0.34967282,  0.51348973, -0.30882943],\
    ...:        [ 0.35654636, -0.64453956, -1.3066075 ],\
    ...:        [ 0.187328  , -1.32496725, -0.05783984]]),\
    ...: ([[ 1.87918162,  1.12919822, -1.63856741],\
    ...:        [-1.29211004,  0.71730071, -2.09109609],\
    ...:        [-1.65800248,  0.49154087,  0.14932455]])       
    ...:        ])

In [90]: np.mean(x[0], axis=0)
Out[90]: array([-0.30000963,  0.83829576, -0.72572011])

In [91]: np.mean(x[1], axis=0)
Out[91]: array([ 0.17013247,  0.0544575 , -0.72372245])

In [92]: np.mean(x[2], axis=0)
Out[92]: array([-0.35697697,  0.7793466 , -1.19344632])

In [93]: out
Out[93]: 
array([[-0.30000963,  0.83829576, -0.72572011],
       [ 0.17013247,  0.0544575 , -0.72372245],
       [-0.35697697,  0.7793466 , -1.19344632]])
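
On Python 3 with a recent NumPy, building a ragged array without an explicit dtype=object is rejected, as far as I know, so the input has to be created as an object array on purpose. The following is a sketch under that assumption. The function name `ragged_mean` and the random test data are hypothetical and only serve to check the reduceat approach against a plain per-element loop.

```python
import numpy as np

def ragged_mean(x):
    """Column-wise mean of each 2-D block stored in a 1-D object array."""
    lens = np.array([len(block) for block in x])
    cut_idx = np.append(0, lens[:-1]).cumsum()
    stacked = np.vstack(list(x))               # shape (sum(lens), n_cols)
    return np.add.reduceat(stacked, cut_idx, axis=0) / lens[:, None]

# Made-up blocks with 5, 6 and 3 rows and 3 columns each.
rng = np.random.default_rng(0)
blocks = [rng.normal(size=(n, 3)) for n in (5, 6, 3)]

# Fill a 1-D object array element by element so that NumPy does not try
# to interpret the blocks as one regular multi-dimensional array.
x = np.empty(len(blocks), dtype=object)
for i, block in enumerate(blocks):
    x[i] = block

out = ragged_mean(x)

# Reference result from an explicit loop over the elements.
expected = np.array([block.mean(axis=0) for block in x])
print(np.allclose(out, expected))
```

Both results have one row per element of x and one column per column of the blocks.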
