Suppose I have an n x d NumPy array, e.g. a=np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15]])

In this case n=5 and d=3. Suppose I also have some number c that is less than or equal to n. What I want to calculate is the following:

Consider every column independently and calculate the sum of every c consecutive values; e.g. if c=2, the solution would be

solution=np.array([[1+4, 2+5, 3+6], [7+10,8+11,9+12]])

The last row is skipped because 5 mod 2 = 1, so one row has to be left out at the end.

If c=1, the solution would be the original array, and if e.g. c=3, the solution would be

solution=np.array([[1+4+7, 2+5+8, 3+6+9]])

with the last two rows omitted.

What would be the most elegant and efficient way to do this? I have searched a lot online but could not find a similar problem.

1 Answer

Here's one way -

def sum_in_blocks(a, c):
    # Number of leading rows that form complete blocks of c rows
    l = c*(len(a)//c)

    # Reshape to 3D considering first l rows, and "cutting" after each c rows
    # Then sum along second axis
    return a[:l].reshape(-1,c,a.shape[1]).sum(1)

More info on the second step: see the general idea for nd-to-nd transformation.
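On the efficiency side of the second step: as far as I can tell, for a C-contiguous input the slice a[:l] is itself contiguous, so the reshape should hand back a view rather than a copy, and only the final sum allocates a new array. A small sketch to check this on the sample array, using np.shares_memory (the name blocks is just for illustration and is not part of the function above):

```python
import numpy as np

a = np.arange(1, 16).reshape(5, 3)   # same values as the sample array
c = 2
l = c * (len(a) // c)                # rows that form complete blocks of c

# 3D array of shape (number of blocks, c, number of columns)
blocks = a[:l].reshape(-1, c, a.shape[1])

print(blocks.shape)                  # (2, 2, 3)
print(np.shares_memory(a, blocks))   # True: the reshape did not copy the data
```

For an input that is not contiguous (e.g. a transposed array), reshape may have to copy, so this observation is an assumption about the input layout rather than a guarantee.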

Sample runs -

In [79]: a
Out[79]: 
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12],
       [13, 14, 15]])

In [80]: sum_in_blocks(a, c=1)
Out[80]: 
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12],
       [13, 14, 15]])

In [81]: sum_in_blocks(a, c=2)
Out[81]: 
array([[ 5,  7,  9],
       [17, 19, 21]])

In [82]: sum_in_blocks(a, c=3)
Out[82]: array([[12, 15, 18]])
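To cross-check these outputs, the same block sums can be written as a plain loop over row slices. This is only a reference sketch for verification (the name sum_in_blocks_loop is made up here), and it will be slower than the reshape-based version for large arrays:

```python
import numpy as np

def sum_in_blocks_loop(a, c):
    # Hypothetical reference version: sum each block of c rows explicitly
    n_blocks = len(a) // c   # an incomplete trailing block is dropped
    return np.array([a[i * c:(i + 1) * c].sum(axis=0) for i in range(n_blocks)])

a = np.arange(1, 16).reshape(5, 3)   # same values as the sample array

print(sum_in_blocks_loop(a, 2))
# [[ 5  7  9]
#  [17 19 21]]
print(sum_in_blocks_loop(a, 3))
# [[12 15 18]]
```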

Explanation with the given sample -

In [84]: a
Out[84]: 
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12],
       [13, 14, 15]])

In [85]: c = 2

In [87]: l = c*(len(a)//c) # = 4; number of rows that form complete blocks of c

In [89]: a[:l] # the leftover row at the end is skipped
Out[89]: 
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

# Reshape to 3D "cutting" after every c=2 rows
In [90]: a[:l].reshape(-1,c,a.shape[1])
Out[90]: 
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

# Sum along axis=1 for final o/p
In [91]: a[:l].reshape(-1,c,a.shape[1]).sum(axis=1)
Out[91]: 
array([[ 5,  7,  9],
       [17, 19, 21]])

2 Comments

I have no idea what you are doing in the second step, but I applied it to my data and it works like a charm. Thanks!
@Mark Added step-by-step explanation using a sample. That should make it easier to understand.
