1

I am trying to make array assignment in python, but it is very slow, is there any way to accelerate?

simi_matrix_img = np.zeros((len(annot), len(annot)), dtype='float16')
for i in range(len(annot)):
    for j in range(i + 1):
        score = 0
        times = 0
        if i != j:
            x_idx = [p1 for (p1, q1) in enumerate(annot[i]) if np.abs(q1 - 1) < 1e-5]
            y_idx = [p2 for (p2, q2) in enumerate(annot[j]) if np.abs(q2 - 1) < 1e-5]

            for idx in itertools.product(x_idx, y_idx):
                score += simi_matrix_word[idx]
                times += 1
            simi_matrix_img[i, j] = score/times
        else:
            simi_matrix_img[i, j] = 1.0

"annot" is a numpy array. Is there any way to accelerate it?

9
  • 1
    Why are you trying to make array assignment? This is a supported feature.. Commented Dec 30, 2015 at 3:28
  • I don't understand what the code is supposed to do. Could you show an example input and the corresponding desired output? Commented Dec 30, 2015 at 3:50
  • I want to get a similar matrix of images"simi_matrix_img",where each element of the matrix is computed by "annot"([[1,0,1..],[0,1,1,...]]) and "simi_matrix_word[[-0.5,0.2,...], [0.3, -0.2, ...]]"; firstly, I must find the position of "1" in every two vectors of annot, put them in "x_idx" and "y_idx", then compute similarity of these 2 vectors by finding their values in "simi_martrix_word", lastly, get the mean value of these values Commented Dec 30, 2015 at 6:05
  • What's the shape of annot? Commented Dec 30, 2015 at 6:35
  • 1
    Got to start some where. Vectorize the inner loops. Then you have a chance at doing the same for the outer ones. Commented Dec 30, 2015 at 13:00

2 Answers 2

1

I think the indent for this line is wrong:

            simi_matrix_img[i, j] = score/times

you want to perform that assignment after all the product iterations. But since it's the last assignment that takes, the results will be the same.

Here's a partial reworking of your code

def foo1(annot, simi_matrix_word):
    N = annot.shape[0]
    simi_matrix_img = np.zeros((N,N))
    for i in range(N):
        for j in range(i + 1):
            if i != j:
                x_idx = np.nonzero(annot[i])[0]
                y_idx = np.nonzero(annot[j])[0]
                idx = np.ix_(x_idx, y_idx)
                # print(idx, simi_matrix_word[idx])
                score = simi_matrix_word[idx].mean()
                simi_matrix_img[i, j] = score
            else:
                simi_matrix_img[i, j] = 1.0
    return simi_matrix_img

For a small test case, it returns the same thing:

annot=np.array([[1,0,1],[0,1,1]])
simi_matrix_word = np.arange(12, dtype=float).reshape(3,4)
[[ 1.  0.]
 [ 7.  1.]]

That gets rid of all the inner iterations. Next step would be reduce the outer iterations. For example start with np.eye(N), and just iterate on the lower tri indices:

In [169]: np.eye(2)
Out[169]: 
array([[ 1.,  0.],
       [ 0.,  1.]])

In [170]: np.tril_indices(2,-1)
Out[170]: (array([1]), array([0]))

Note that for a 2 row annot, we are only calculating one score, at [1,0].


Replacing nonzero with boolean indexing:

def foo3(annot, simi_matrix_word):
    N = annot.shape[0]
    A = annot.astype(bool)
    simi_matrix_img = np.eye(N,dtype=float)
    for i,j in zip(*np.tril_indices(N,-1)):
        score = simi_matrix_word[A[i],:][:,A[j]]
        simi_matrix_img[i, j] = score.mean()
    return simi_matrix_img

or this might speed up the indexing a bit:

def foo4(annot, simi_matrix_word):
    N = annot.shape[0]
    A = annot.astype(bool)
    simi_matrix_img = np.eye(N,dtype=float)
    for i in range(1,N):
        x = simi_matrix_word[A[i],:]
        for j in range(i):
            score = x[:,A[j]]
            simi_matrix_img[i, j] = score.mean()
    return simi_matrix_img

Since the number of nonzero values for each row of annot can differ, the number of terms that are summed for each score also differs. That strongly suggests that further vectorization is impossible.

Sign up to request clarification or add additional context in comments.

1 Comment

thank you! that's a amazing, it solved my problem and I updated the original code with correct indent. Is boolean indexing more efficient than np.nonzero?
1

(1) You could use generators instead of list comprehension where possible. For example:

        x_idx = (p1 for (p1, q1) in enumerate(annot[i]) if np.abs(q1 - 1) < 1e-5)
        y_idx = (p2 for (p2, q2) in enumerate(annot[j]) if np.abs(q2 - 1) < 1e-5)

With this, you iterate only once over those items (in for idx in itertools.product(x_idx, y_idx)), as opposed to twice (once for constructing the list then again in said for loop).

(2) What Python are you using? If <3, I have a hunch that a significant part of the problem is you're using range(), which can be expensive in connection with really large ranges (as I'm assuming you're using here). In Python 2.7, range() actually constructs lists (not so in Python 3), which can be an expensive operation. Try achieving the same result using a simple while loop. For example, instead of for i in range(len(annot)), do:

i=0
while i < len(annot):
    ... do stuff with i ...
    i += 1

(3) Why call len(annot) so many times? It doesn't seem like you're mutating annot. Although len(annot) is a fast O you could store the length in a var, e.g., annot_len = len(annot), and then just reference that. Wouldn't scrape much off though, I'm afraid.

3 Comments

Thank your for your help, (2) and (3) are very helpful, but I have a questions about (1): I get the position of "1" by x_idx = [p1 for (p1, q1) in enumerate(annot[i]) if np.abs(q1 - 1) < 1e-5] such as [0,3,5], so is y_idx [1,5,10], and then I want to get the combination of these 2 list by itertools.product(x_idx, y_idx), which will generate (0,1),(3,1),(5,1),(0,5),(3,5),(3,10)..., if I use generator, how can I get these combinations?
Looking at the docs, it seems that itertools.product() accepts any iterable, including generators such as the ones I've suggested above. So I believe all you need to do is change your list comprehensions to generator expressions and you should be good to go.
Thank you, generator improves the efficiency of inner loop, but I think improve the efficiency of outer loop is more significant

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.