
Looking to make this calculation as quickly as possible. I have X as an n x m numpy array. I want to define Y to be the following:

Y_11 = 1 / (exp(X_11 - X_11) + exp(X_11 - X_12) + ... + exp(X_11 - X_1m))

or, in 0-based NumPy indexing, Y_00 is

1/np.sum(np.exp(X[0,0]-X[0,:]))

So basically, Y is also n x m, where the i,j element is 1 / sum_j' exp(X_ij - X_ij').

Any tips would be great! Thanks.

Sample code as requested:

import numpy as np

np.random.seed(111)
J,K = 110,120
X = np.random.rand(J,K)
Y = np.zeros((J,K))
for j in range(J):
    for k in range(K):
        Y[j,k] = 1/np.sum(np.exp(X[j,k]-X[j,:]))

# note each row will sum to 1 under this operation
np.sum(Y,axis=1)
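
Side note: since each row of Y sums to 1, this is just the row-wise softmax of -X (the sign flips because the exponent is X_ij - X_ij' rather than X_ij' - X_ij). If SciPy is available (scipy.special.softmax, SciPy >= 1.2), the loop above can be cross-checked with:

from scipy.special import softmax
print(np.allclose(Y, softmax(-X, axis=1)))   # True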
3 Comments
    Where did the exp calls go in your second expression? Commented Sep 19, 2016 at 17:46
  • Thanks for spotting that I dropped the exp(). I also added some code. Commented Sep 19, 2016 at 17:52
  • This is now a good question. In the future, it would be great if you could write stuff upfront like it is now. Commented Sep 19, 2016 at 18:35

2 Answers


Here are a few fully vectorized approaches -

def vectorized_app1(X):
    # direct broadcasting: build the (rows, cols, cols) array of within-row
    # pairwise differences, exponentiate, then sum-reduce
    return 1/np.exp(X[:,None] - X[...,None]).sum(1)

def vectorized_app2(X):
    # exp(A-B) = exp(A)/exp(B): exponentiate once, then use a broadcasted division
    exp_vals = np.exp(X)
    return 1/(exp_vals[:,None]/exp_vals[...,None]).sum(1)

def vectorized_app3(X):
    # same exp trick, but einsum multiplies and sum-reduces without
    # materializing the (rows, cols, cols) intermediate
    exp_vals = np.exp(X)
    return 1/np.einsum('ij,ik->ij',exp_vals,1/exp_vals)

Tricks used and lessons learnt

  • Extend dimensions with None/np.newaxis to bring in broadcasting and perform everything in a vectorized manner.

  • np.exp is an expensive operation, so applying it to a huge broadcasted array would be costly. The trick used is exp(A-B) = exp(A)/exp(B): we compute np.exp(X) upfront and then perform broadcasted divisions.

  • Those sum-reductions can be implemented with np.einsum. This brings memory efficiency, as we don't have to create huge broadcasted arrays, which results in a massive performance boost; a quick correctness check follows below.
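
For reference, a quick correctness check of all three against the question's double loop. original_approach below is assumed to be that loop wrapped in a function (it is the name used in the timings):

def original_approach(X):
    # the double loop from the question
    J,K = X.shape
    Y = np.zeros((J,K))
    for j in range(J):
        for k in range(K):
            Y[j,k] = 1/np.sum(np.exp(X[j,k]-X[j,:]))
    return Y

np.random.seed(111)
X = np.random.rand(110,120)
Y_ref = original_approach(X)
for f in (vectorized_app1, vectorized_app2, vectorized_app3):
    print(np.allclose(f(X), Y_ref))   # True for each approach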

Runtime test -

In [111]: # Setup input, X
     ...: np.random.seed(111)
     ...: J,K = 110,120
     ...: X = np.random.rand(J,K)
     ...: 

In [112]: %timeit original_approach(X)
1 loop, best of 3: 322 ms per loop

In [113]: %timeit vectorized_app1(X)
10 loops, best of 3: 124 ms per loop

In [114]: %timeit vectorized_app2(X)
100 loops, best of 3: 14.6 ms per loop

In [115]: %timeit vectorized_app3(X)
100 loops, best of 3: 3.01 ms per loop

Looks like einsum showed its magic again with 100x+ speedup!


5 Comments

Very useful answer. Thanks for all the explanations.
Thank you! One issue, however, is that sometimes A or B has large values, so I get an error. My thought about going with exp(A-B) is that I shrink a lot of them and can put in an np.min(A-B, 200) or similar to avoid errors. Any other ideas?
@Kevin What exactly is the error? What does it say?
overflow in exp. I just discovered bigfloat but will need to spend time seeing how to implement with my code -- and see if it works with knitro.
@Kevin Hmm or I guess you can stick to vectorized_app1 for those cases.
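
On the overflow mentioned above: a common workaround (the standard shift trick used for softmax-style expressions, sketched here rather than taken from the answer) is to shift each row by a constant before exponentiating; the shift cancels out of Y, and every exponent becomes non-positive, so np.exp cannot overflow:

def stable_app(X):
    # shifting row j by its minimum m_j leaves Y unchanged, because
    # exp(m_j - X[j,k]) / sum_a exp(m_j - X[j,a]) == exp(-X[j,k]) / sum_a exp(-X[j,a])
    shifted = X.min(axis=1, keepdims=True) - X   # all entries <= 0
    e = np.exp(shifted)                          # values in (0, 1], no overflow
    return e/e.sum(axis=1, keepdims=True)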

Here's a first step at reducing the double loop:

def foo2(X):
    Y = np.zeros_like(X)
    for k in range(X.shape[1]):
        Y[:,k]=1/np.exp(X[:,[k]]-X[:,:]).sum(axis=1)
    return Y

I suspect I can also remove the k loop, but I have to spend more time figuring out just what it is doing. That X[:,[k]]-X[:,:] isn't obvious (in the big picture).
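
To unpack that expression, a small shape sketch (assuming numpy is imported and X is the (110, 120) array from the question):

# X[:,[k]] keeps column k as a (J,1) array, so the subtraction broadcasts
# across each row: (X[:,[k]] - X[:,:])[j,a] == X[j,k] - X[j,a]
print(X[:,[0]].shape)        # (110, 1)
print((X[:,[0]] - X).shape)  # (110, 120)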

Another step:

Z = np.stack([X[:,[k]]-X for k in range(X.shape[1])],2)
Y = 1/np.exp(Z).sum(axis=1)

which can be further refined (with too much trial and error) with

Z = X[:,None,:]-X[:,:,None]
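
Putting that refinement together with the same reduction as above (a quick check, assuming X and the loop result Y from the question are in scope):

Z = X[:,None,:] - X[:,:,None]   # Z[j,a,b] = X[j,b] - X[j,a], shape (J,K,K)
Y2 = 1/np.exp(Z).sum(axis=1)    # summing over a gives the denominator for Y[j,b]
print(np.allclose(Y2, Y))       # True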

2 Comments

Great answer. Very minor comment: in your last parentheses, did you possibly mean "without too much trial and error"?
No, I tried a number of alternatives. There are various ways in which X can be expanded to 3d, and 3 axes that it can be summed on. I wasn't as systematic as I could have been.
