
We are supposed to find a way to multiply a 2D array X of shape (7403, 33) by its transpose, i.e. compute X @ X.T.

The solution is supposed to be 2.5 times faster than np.dot(X, X.T). I have tried everything I can think of:

%timeit np.dot(X,X.T)
%timeit np.matmul(X,X.T)
%timeit X@X.T
%timeit np.einsum("ij, jk -> ik",X,X.T)

and I have only achieved about 1.5 times the speed of the NumPy dot:

3.17 s ± 14.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2.03 s ± 6.82 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2.01 s ± 6.57 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2.02 s ± 6.67 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
  • Your timeit results suggest that the last 3 call the same procedure. Why do you need it to be faster? Commented Nov 4, 2018 at 13:28
  • @roganjosh We are learning NumPy, so this is an exercise for us. I have tried everything I can think of. Any suggestions? Commented Nov 4, 2018 at 13:31
  • I think you should start doing some analysis how the result will look like. Multiplying a matrix with its transpose has an interesting structure. Commented Nov 4, 2018 at 13:32
  • @WillemVanOnsem I do not know how that will help, but it looks like the determinant is 1. How does that help? Please explain. Commented Nov 4, 2018 at 13:41
  • If you guys could help, I would really appreciate it. Commented Nov 4, 2018 at 14:03
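
The comment about structure points at something concrete: X @ X.T is symmetric, so BLAS's syrk routine can compute just one triangle of the product instead of the whole matrix. A minimal sketch of that idea, assuming SciPy is available (the shape and the name X mirror the question, and the random array is only a stand-in; whether this actually beats a plain dgemm depends on the BLAS build):

import numpy as np
from scipy.linalg import blas

X = np.random.rand(7403, 33)   # stand-in with the shape from the question

# dsyrk computes alpha * X @ X.T but only fills one triangle of the result
# (the upper one with lower=0), roughly halving the floating-point work
# compared with a full matrix-matrix product.
G_upper = blas.dsyrk(alpha=1.0, a=X, lower=0)

# Mirror the computed upper triangle to recover the full symmetric matrix;
# only the upper triangle of G_upper is read here.
G = np.triu(G_upper) + np.triu(G_upper, k=1).T

# Sanity check against the plain NumPy product.
assert np.allclose(G, X @ X.T)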

1 Answer


Well, I found the solution with SciPy:

from scipy import linalg

%timeit np.dot(X,X.T)
%timeit np.matmul(X,X.T)
%timeit X@X.T
%timeit np.einsum("ij, jk -> ik",X,X.T)
%timeit linalg.blas.dgemm(alpha=1.0, a=X, b=X.T)

which gives

3.07 s ± 16.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2.02 s ± 37.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.99 s ± 9.79 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2 s ± 5.97 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
306 ms ± 6.85 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
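
As a side note on that call, a hedged variation: dgemm also takes a trans_b flag, so the transpose can be applied inside BLAS rather than by passing X.T from Python, and np.allclose can confirm the result matches np.dot. This is only a sketch with a stand-in array; any speed difference depends on the BLAS library SciPy is linked against.

import numpy as np
from scipy import linalg

X = np.random.rand(7403, 33)   # stand-in for the real X

# Same BLAS routine as in the answer, but let dgemm apply the transpose
# itself (trans_b=True) instead of building X.T on the Python side.
G = linalg.blas.dgemm(alpha=1.0, a=X, b=X, trans_b=True)

# The direct BLAS call should agree with np.dot up to floating-point error.
assert np.allclose(G, np.dot(X, X.T))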

3 Comments

Can you explain why this answer is faster?
@hpaulj I was hoping someone could explain this to me. I believe it has something to do with BLAS: en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms
I haven't dug into this. dot and the others try to use BLAS or other standard libraries (see np.show_config()). My guess is that the linalg.blas. ... call is faster simply because it's a more direct call; there's less overhead and checking. Read its docs.
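
That guess about overhead can at least be partially checked: np.show_config() reports which BLAS/LAPACK build NumPy is linked against, and scipy.linalg.blas.find_best_blas_type shows which precision-specific routine matches a given array. A small inspection sketch, with X again a stand-in array:

import numpy as np
from scipy.linalg import blas

# Report which BLAS/LAPACK implementation NumPy was built against
# (OpenBLAS, MKL, ...); np.dot and @ dispatch their heavy lifting to it.
np.show_config()

X = np.random.rand(7403, 33)

# find_best_blas_type maps an array's dtype to the matching routine prefix:
# 'd' here (dgemm/dsyrk for float64); a float32 array would map to 's'
# (sgemm), which is typically faster if reduced precision is acceptable.
print(blas.find_best_blas_type((X,)))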
