Use numpy to create matrix based on conditional indices of an array

Question

I would like to create an n by m matrix based on elements of an n + m length array.

Here a simple double for loop suffices, but I wish for an expedient solution. The matrix will be relatively small.

n = 4
m = 6
s = n + m

array = np.arange(s)  # note: arange is only for example. real array varies.
matrix = np.zeros((n,m))

for i in range(n):
    for j in range (m):
        matrix[i,j] = array[i+j]

I have found that a comprehension is faster than the double for loops

matrix3 = [[array[i+j] for i in range(m)] for j in range(n)]

Is there a faster way?

An additional bonus would be to incorporate the modulo operator. I only actually need the indices where i+j % 2 == 0. In the double for loop the modulo method seems a little bit faster, but this may not be convenient or expedient for generating this matrix via numpy.

It is fine to not do this, since matrix multiplcation will occur after and the necessary elements will be multiplied by zero anyways. My mentioning the modulo is only in the off-case that this leads to a faster solution.

for this MWE

for i in range(n):
    for j in range (m):
        if (i + j) % 2 == 0:
            matrix[i,j] = array[i+j]

note:

I ask for a numpy solution under the assumption numpy will be fastest, but any pure python (including numpy/scipy) solution is fine so long as it is faster than pure python double for loops

motivation:

I am trying to remove all dependencies on arrays from a double for loop so that I can use broadcasting rather than a double for loop. This is the last array left

this does not seem to work. I don't see how it would know what shape to make it? the end result wouldn't be an n by m matrix would it? wouldn't this make an (n+m) by (n+m) matrix? — Charlie Crown
– Charlie Crown, Commented Jul 8, 2020 at 5:20

Paul Panzer · Accepted Answer · 2020-07-08 07:18:41Z

2

You can use advanced indexing into array. For efficiency, you can zero odd positions already in the template array.

np.where(np.arange(m+n)&1,0,array)[sum(np.ogrid[:n,:m])]
# array([[0, 0, 2, 0, 4, 0],
#        [0, 2, 0, 4, 0, 6],
#        [2, 0, 4, 0, 6, 0],
#        [0, 4, 0, 6, 0, 8]])

or (faster)

template = np.where(np.arange(m+n)&1,0,array)
np.lib.stride_tricks.as_strided(template,(n,m),2*template.strides)

This is a "compressed" view, if you need to modify the entries you must make a copy (it will still be faster).

edited Jul 8, 2020 at 7:18

answered Jul 8, 2020 at 6:12

Paul Panzer

53.3k3 gold badges59 silver badges103 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

Paul Panzer Over a year ago

@bousof <strided_view>.copy() will create a contiguously layed out version of the array.

bousof · Accepted Answer · 2020-07-08 07:23:51Z

2

You can create a hankel matrix:

>>> from scipy.linalg import hankel
>>> matrix = hankel(array[0:n], array[n:s])
>>> matrix
array([[0, 1, 2, 3, 4, 6],
       [1, 2, 3, 4, 6, 7],
       [2, 3, 4, 6, 7, 8],
       [3, 4, 6, 7, 8, 9]])

If you absolutely want to set elements where (i+j)%2==1 to zero you can do (original post):

>>> matrix[::2, 1::2] = 0
>>> matrix[1::2, ::2] = 0
>>> matrix
array([[0, 0, 2, 0, 4, 0],
       [0, 2, 0, 4, 0, 7],
       [2, 0, 4, 0, 7, 0],
       [0, 4, 0, 7, 0, 9]])

You can also set every other value of array to zero then the constructed matrix will have zeros at desired locations:

>>> array[1::2]=0
>>> hankel(array[0:n], array[n:s])
array([[0, 0, 2, 0, 4, 6],
       [0, 2, 0, 4, 6, 0],
       [2, 0, 4, 6, 0, 8],
       [0, 4, 6, 0, 8, 0]])

edited Jul 8, 2020 at 7:23

answered Jul 8, 2020 at 5:41

bousof

1,25111 silver badges14 bronze badges

6 Comments

Paul Panzer Over a year ago

IIRC those special matrix functions while convenient are not exactly fast.

bousof Over a year ago

Yo Paul. If you continue to give me feedbaks about my answers I'l call you master! :) I think it shoud be fast but of course it depends on if the scipy.linalg.toeplitz was done by a trainee or not. Do you have some reference about the slowness of these functions?

Paul Panzer Over a year ago

See, for example stackoverflow.com/a/43735362/7207392 Glancing at the comments there it may be that scipy was patched and the functions are not slow anymore. One more remark: You can use hankel directly instead of flipping toeplitz

Paul Panzer Over a year ago

Even if you make copy in the end as_strided is at least faster than my fancy indexing method.

Paul Panzer Over a year ago

Ok, I read through the entire comment thread and the special matrices were indeed patched (and PR accepted into scipy) by @Divakar himself. So you should be good performance wise. Can't +1 because already did.

|

Valdi_Bo · Accepted Answer · 2020-07-08 05:35:23Z

1

Much simpler way to create your table is:

Define a function:

 def tVal(r, c):
     sm = r + c
     return np.where(sm % 2 == 0, sm, 0)

Use it as an argument of np.fromfunction:
```
 arr = np.fromfunction(tVal, (n, m))
```

For your target shape (6 * 4) the result is:

array([[0., 0., 2., 0., 4., 0.],
       [0., 2., 0., 4., 0., 6.],
       [2., 0., 4., 0., 6., 0.],
       [0., 4., 0., 6., 0., 8.]])

Note than tVal is not actually called separately for each array element. It is instead called only once, with 2 arrays (r and c) shaped as the target array, filled with respective arguments for each cell. So this function operates on these arrays (not on single values for each cell index).

This is why this function must contain where, not if for r and c values for particular cell.

And a remark concerning variable names: matrix is a class in Numpy (a subtype of ndarray), so it is a good practice not to use variables with the same name. Use rather other name, as I did in my example.

edited Jul 8, 2020 at 5:35

answered Jul 8, 2020 at 5:25

Valdi_Bo

31.1k4 gold badges29 silver badges45 bronze badges

2 Comments

Charlie Crown Over a year ago

This works for the case where the original array is generated as np.arange but this is not the real case, I only used that so that it was easy to tell what the final matrix should be. In general the values in the array vary widely. Can this be easily applied to that scenario?

Valdi_Bo Over a year ago

You can use fromfuncttion as long as you are able to write the funtion operating on 2 arrays filled with values (indices) for each element. In more complex case you must iterate over array elements. Read about nditer. It is an iterator operating on Numpy arrays, but allows also access to indices of the current element and write access to the element itself.

Serge Ballesta · Accepted Answer · 2020-07-08 05:41:26Z

1

I would do it directly at the numpy level:

matrix = np.arange(n * m).reshape(n,m)
matrix = matrix // m + matrix % m             # matrix // m is i and matrix % m is j

For n, m = 4, 6 it gives as expected:

array([[0, 1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5, 6],
       [2, 3, 4, 5, 6, 7],
       [3, 4, 5, 6, 7, 8]], dtype=int32)

answered Jul 8, 2020 at 5:41

Serge Ballesta

150k13 gold badges137 silver badges267 bronze badges

Comments

hpaulj · Accepted Answer · 2020-07-08 07:32:08Z

Your first example:

In [30]: arr=np.arange(24)                                                              
In [31]: [[arr[i+j] for i in range(6)] for j in range(4)]                               
Out[31]: 
[[0, 1, 2, 3, 4, 5],
 [1, 2, 3, 4, 5, 6],
 [2, 3, 4, 5, 6, 7],
 [3, 4, 5, 6, 7, 8]]

To take advantage of 'broadcasting':

In [32]: np.arange(4)[:,None]+np.arange(6)                                              
Out[32]: 
array([[0, 1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5, 6],
       [2, 3, 4, 5, 6, 7],
       [3, 4, 5, 6, 7, 8]])

The outer i loop is replaced by a (n,1) array; the inner j loop is replaced by the (m,) array; together the result is a (n,m) array.

Your more elaborate case:

In [35]: arr = np.arange(24) 
    ...: res = np.zeros((4,6),int) 
    ...: for i in range(4): 
    ...:     for j in range(6): 
    ...:         if (i+j)%2 ==0: 
    ...:             res[i,j] = arr[i+j] 
    ...:                                                                                
In [36]: res                                                                            
Out[36]: 
array([[0, 0, 2, 0, 4, 0],
       [0, 2, 0, 4, 0, 6],
       [2, 0, 4, 0, 6, 0],
       [0, 4, 0, 6, 0, 8]])

So this is the original, with just the even values set.

In [37]: Out[32]                                                                        
Out[37]: 
array([[0, 1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5, 6],
       [2, 3, 4, 5, 6, 7],
       [3, 4, 5, 6, 7, 8]])

Find the odds:

In [38]: Out[32]%2                                                                      
Out[38]: 
array([[0, 1, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0],
       [0, 1, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0]])

Multiply:

In [39]: Out[32]*(Out[32]%2==0)                                                         
Out[39]: 
array([[0, 0, 2, 0, 4, 0],
       [0, 2, 0, 4, 0, 6],
       [2, 0, 4, 0, 6, 0],
       [0, 4, 0, 6, 0, 8]])

In general to make optimal use of numpy, I try to see overall patterns. That's where small examples are especially valuable.

Collectives™ on Stack Overflow

Use numpy to create matrix based on conditional indices of an array

note:

motivation:

5 Answers 5

1 Comment

6 Comments

2 Comments

Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

note:

motivation:

5 Answers 5

1 Comment

6 Comments

2 Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related