12

I have two equally sized numpy arrays (they happen to be 48x365) where every element is either -1, 0, or 1. I want to count how often the two agree and how often they differ, treating any position where at least one of the arrays has a zero as no data. For instance:

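def compare(array1, array2):
    score = 0  # +1 for each match, -1 for each mismatch
    for x in range(48):
        for y in range(365):
            # skip positions where either array has a zero (no data)
            if array1[x][y] != 0:
                if array2[x][y] != 0:
                    if array1[x][y] == array2[x][y]:
                        score = score + 1
                    else:
                        score = score - 1
    return score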

This takes a very long time. I was thinking of taking advantage of the fact that multiplying the elements together and summing the products may give the same outcome, and I'm looking for a numpy function to help with that. I'm not really sure what unusual numpy functions are out there.
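The multiply-and-sum idea described above can be sketched as follows. Because every element is -1, 0, or 1, the product of two elements is 1 when both are non-zero and equal, -1 when both are non-zero and different, and 0 when either is zero, so the sum of the products should equal the loop's score. The small arrays here are made-up stand-ins for the 48x365 data.

```python
import numpy as np

# Made-up stand-ins for the real 48x365 arrays.
array1 = np.array([[1, -1, 0], [1, 0, -1]])
array2 = np.array([[1, 1, -1], [0, 0, -1]])

# Element-wise product: +1 for a match, -1 for a mismatch,
# 0 wherever either array has a zero (no data).
score = int(np.sum(array1 * array2))
print(score)
```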

4 Answers

12

Simply do not iterate. Iterating over a numpy array defeats the purpose of using the tool.

ans = np.logical_and(
    np.logical_and(array1 != 0, array2 != 0),
    array1 == array2 )

should give a boolean array that is True wherever both arrays are non-zero and equal.

2 Comments

Good idea! But this gives me a boolean array. I still need to sum up all the True's to get a score. Is there a numpy-thonic way to do that?
You can count the True entries with np.sum(ans) or np.count_nonzero(ans). Indexing with the mask, as in np.sum(array1[ans]) or np.sum(array2[ans]), instead sums the original values at those positions; every entry where the mask is False is left out.
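As a sketch of the counting step discussed in these comments: summing a boolean array counts its True entries, and np.count_nonzero does the same. The arrays are made-up examples, and the `different` mask is an addition here (not part of the answer above) so that the question's score can be formed.

```python
import numpy as np

# Made-up example data with values in {-1, 0, 1}.
array1 = np.array([[1, -1, 0], [1, 0, -1]])
array2 = np.array([[1, 1, -1], [0, 0, -1]])

# Positions where both arrays hold data (non-zero).
both_valid = np.logical_and(array1 != 0, array2 != 0)
same = np.logical_and(both_valid, array1 == array2)
different = np.logical_and(both_valid, array1 != array2)

# Counting True entries in each mask.
n_same = np.count_nonzero(same)
n_different = np.count_nonzero(different)

# Score as defined in the question: matches minus mismatches.
mask_score = n_same - n_different
```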
7

For me the easiest way is to do this:

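import numpy

# Placeholder data; substitute your own arrays of equal shape
A = numpy.array([1.0, 2.0, 3.0])
B = numpy.array([1.0, 2.0, 3.0])

T = A - B
max_diff = numpy.max(numpy.abs(T))  # largest element-wise difference

epsilon = 1e-6  # tolerance for float comparison
if max_diff > epsilon:
    raise Exception("Not matching arrays")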

This quickly tells you whether the arrays are the same, and it also works for comparing float values.
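NumPy also has built-in helpers for this kind of check. A minimal sketch, assuming made-up example data and the same 1e-6 tolerance: np.allclose with rtol=0 tests whether every absolute difference is within atol, and np.array_equal tests exact equality.

```python
import numpy as np

# Made-up example data.
A = np.array([0.1, 0.2, 0.3])
B = A + 1e-9   # differs by far less than the tolerance
C = A + 1e-3   # differs by more than the tolerance

# rtol=0 makes the test purely absolute: |A - B| <= atol element-wise.
close_ab = np.allclose(A, B, rtol=0, atol=1e-6)
close_ac = np.allclose(A, C, rtol=0, atol=1e-6)

# Exact comparison, suitable for integer arrays.
exact_ab = np.array_equal(A, B)
```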

1 Comment

A bit more general solution than the OP asked for but very useful indeed!
1

Simple calculations along the following lines will help you select the most suitable way to handle your case:

In []: from numpy.random import randint
In []: A, B= randint(-1, 2, size= (48, 365)), randint(-1, 2, size= (48, 365))
In []: ignore= (0== A)| (0== B)
In []: valid= ~ignore

In []: (A[valid]== B[valid]).sum()
Out[]: 3841
In []: (A[valid]!= B[valid]).sum()
Out[]: 3849
In []: ignore.sum()
Out[]: 9830

Ensuring that the calculations are valid:

In []: 3841+ 3849+ 9830== 48* 365
Out[]: True

Therefore your score (with these random values) would be:

In []: a, b= A[valid], B[valid]
In []: score= (a== b).sum()- (a!= b).sum()
In []: score
Out[]: -8
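A reproducible version of this check might look like the sketch below. The seed and the use of np.random.default_rng are assumptions made here for repeatability, not part of the session above. It compares the mask-based score with the explicit loop from the question and with the sum of element-wise products.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
A = rng.integers(-1, 2, size=(48, 365))  # values in {-1, 0, 1}
B = rng.integers(-1, 2, size=(48, 365))

# Mask-based score: matches minus mismatches among valid positions.
valid = (A != 0) & (B != 0)
a, b = A[valid], B[valid]
mask_score = int((a == b).sum()) - int((a != b).sum())

# Reference: the explicit double loop from the question.
loop_score = 0
for x in range(A.shape[0]):
    for y in range(A.shape[1]):
        if A[x, y] != 0 and B[x, y] != 0:
            loop_score += 1 if A[x, y] == B[x, y] else -1

# Sum of element-wise products, as suggested in the question.
product_score = int(np.sum(A * B))

print(mask_score == loop_score == product_score)
```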

0
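import numpy as np

# Placeholder data; substitute your own arrays of equal shape
A = np.array([1, 0, -1])
B = np.array([1, 0, -1])
...
Z = np.array([1, 0, -1])

to_test = np.array([A, B, Z])  # list every array to compare here

# compare row-wise: True only if every row equals the first one
np.all(to_test[1:] == to_test[0])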
