Create NumPy array using list comprehension

Question

Let's say I have two NumPy arrays, arr1 and arr2:

arr1 = np.random.randint(3, size = 100)

arr2 = np.random.randint(3, size = 100)

I would like to build a matrix that contains the number of joint occurrences. In other words, for all the positions where arr1 is 0, count the elements of arr2 at the same positions that are also 0, and likewise for every other pair of values. I would like to get the following matrix:

M = [[p(0,0), p(0,1), p(0,2)],
     [p(1,0), p(1,1), p(1,2)],
     [p(2,0), p(2,1), p(2,2)]]

where p(0,0) stands for the number of positions at which arr1 is 0 and arr2 is 0.

First Attempt:

As a first attempt I have tried the following:

[[sum(arr1[arr2 == y] == x) for x in np.arange(0,3)] for y in np.arange(0,3)]

But Python throws the following error:

NameError: name 'arr1' is not defined

Second Attempt:

I tried to dig into this error by making use of for-loops:

M = np.array([])

for x in np.arange(0,dim):
    result = np.array([])

    for y in np.arange(0,dim):
        result_temp = sum(arr1[arr2 == x] == y)
        result = np.append(result, result_temp)

    M = np.append(M,result)

In this case Python does not throw the previous error, but instead of a 3x3 array I get a flat array of 9 elements, and I am not able to get the desired 3x3 array.
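The flat result comes from np.append itself: when no axis argument is given, it flattens both inputs before joining them, so every row ends up in one 1-D array. A minimal sketch of two ways around this, assuming dim = 3 (the question never defines dim) and a seed chosen only to make the run reproducible:

```python
import numpy as np

np.random.seed(2016)  # assumed seed, only for reproducibility
arr1 = np.random.randint(3, size=100)
arr2 = np.random.randint(3, size=100)
dim = 3  # assumed: number of distinct values in arr1 and arr2

# Option 1: preallocate the 3x3 result and fill it cell by cell.
M = np.zeros((dim, dim), dtype=int)
for x in range(dim):
    for y in range(dim):
        M[x, y] = np.sum(arr1[arr2 == x] == y)

# Option 2: keep the np.append loop, then reshape the flat result.
flat = np.array([])
for x in range(dim):
    for y in range(dim):
        flat = np.append(flat, np.sum(arr1[arr2 == x] == y))
M_from_flat = flat.reshape(dim, dim).astype(int)

print(M)
print(M_from_flat)
```

Preallocating is usually preferable, because np.append copies the whole array on every call.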

Thanks in advance.

unutbu · Accepted Answer · 2016-12-08 09:37:06Z

Your first list comprehension works. You won't get a NameError if arr1 is defined:

import numpy as np
np.random.seed(2016)
arr1 = np.random.randint(3, size = 100)
arr2 = np.random.randint(3, size = 100)
result = [[sum(arr1[arr2 == y] == x) for x in np.arange(0,3)] 
          for y in np.arange(0,3)] 
print(result)
# [[10, 9, 10], [8, 13, 15], [18, 8, 9]]

But you could instead use np.histogram2d:

result2, xedges, yedges = np.histogram2d(arr2, arr1, bins=range(4))
print(result2)

yields

[[ 10.   9.  10.]
 [  8.  13.  15.]
 [ 18.   8.   9.]]
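Note that np.histogram2d returns floating-point counts, which is why the output above shows trailing dots. The sketch below casts the result to integers and checks it against a direct count; it passes the bin edges as np.arange(4), which gives the same edges 0, 1, 2, 3 as range(4), and the names counts and expected are introduced here only for illustration:

```python
import numpy as np

np.random.seed(2016)
arr1 = np.random.randint(3, size=100)
arr2 = np.random.randint(3, size=100)

# Edges 0, 1, 2, 3 define the bins [0, 1), [1, 2) and [2, 3];
# the last bin is closed on the right, so the value 2 lands in it.
edges = np.arange(4)
counts, xedges, yedges = np.histogram2d(arr2, arr1, bins=edges)

# The counts come back as floats; cast them for an integer matrix.
counts = counts.astype(int)

# Direct count for comparison: rows follow arr2, columns follow arr1.
expected = np.array([[np.sum((arr2 == y) & (arr1 == x)) for x in range(3)]
                     for y in range(3)])

print(counts)
print(np.array_equal(counts, expected))
```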

1 Comment

Miquel Over a year ago

Thanks for your answer! I am going to use your second solution because, although I have defined arr1, it still throws the error. I do not understand why.

Divakar · Accepted Answer · 2016-12-08 09:43:58Z

For performance, I would like to suggest np.bincount -

N = 3 # Number of integers to cover
out = np.bincount(arr2*N + arr1, minlength=N*N).reshape(N,N)

Sample run -

In [50]: arr1 = np.random.randint(3, size = 100)
    ...: arr2 = np.random.randint(3, size = 100)
    ...: 

In [51]: N = 3 # Number of integers to cover

In [52]: np.bincount(arr2*N + arr1, minlength=N*N).reshape(N,N)
Out[52]: 
array([[12, 10, 12],
       [ 7,  6, 20],
       [ 5, 13, 15]])
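To see why this works: each pair (arr2[i], arr1[i]) is encoded as the single integer arr2[i]*N + arr1[i], which is unique for values in 0..N-1. np.bincount counts each code, minlength=N*N keeps slots for pairs that never occur, and the reshape maps code k back to row k // N and column k % N. A small sketch checking this against the list comprehension from the question, assuming both arrays hold non-negative integers below N (the seed and the names codes and loop_result are only illustrative):

```python
import numpy as np

np.random.seed(2016)  # assumed seed, only for reproducibility
arr1 = np.random.randint(3, size=100)
arr2 = np.random.randint(3, size=100)

N = 3  # number of distinct integer values, 0..N-1

# Encode each (arr2, arr1) pair as one integer in 0..N*N-1.
codes = arr2 * N + arr1

# Count every code; minlength pads with zeros for pairs that never occur.
out = np.bincount(codes, minlength=N * N).reshape(N, N)

# Same matrix via the nested comprehension (rows: arr2, columns: arr1).
loop_result = np.array([[np.sum(arr1[arr2 == y] == x) for x in range(N)]
                        for y in range(N)])

print(out)
print(np.array_equal(out, loop_result))
```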
