
We are supposed to find a way to multiply a 2D array X of shape (7403, 33) by its transpose, i.e. compute X @ X.T.

The solution is supposed to be 2.5 times faster than np.dot(X, X.T). I have tried everything I can think of:

%timeit np.dot(X,X.T)
%timeit np.matmul(X,X.T)
%timeit X@X.T
%timeit np.einsum("ij, jk -> ik",X,X.T)

and I have only achieved about 1.5 times the speed of the NumPy dot:

3.17 s ± 14.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2.03 s ± 6.82 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2.01 s ± 6.57 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2.02 s ± 6.67 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
  • Your timeit results suggest that the last 3 call the same procedure. Why do you need it to be faster? Commented Nov 4, 2018 at 13:28
  • @roganjosh We are learning NumPy, so this is an exercise for us. I have tried everything I can think of. Any suggestions? Commented Nov 4, 2018 at 13:31
  • I think you should start doing some analysis how the result will look like. Multiplying a matrix with its transpose has an interesting structure. Commented Nov 4, 2018 at 13:32
  • @WillemVanOnsem I do not know how that will help, but it looks like the determinant is 1. How does that help? Please explain. Commented Nov 4, 2018 at 13:41
  • If you guys could help, I would really appreciate it. Commented Nov 4, 2018 at 14:03
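
The comment about structure points at something concrete: X @ X.T is symmetric, so BLAS's syrk routine can compute just one triangle of the product instead of the whole matrix. A minimal sketch of that idea, assuming SciPy is available (the shape and the name X mirror the question, and the random array is only a stand-in; whether this actually beats a plain dgemm depends on the BLAS build):

import numpy as np
from scipy.linalg import blas

X = np.random.rand(7403, 33)   # stand-in with the shape from the question

# dsyrk computes alpha * X @ X.T but only fills one triangle of the result
# (the upper one with lower=0), roughly halving the floating-point work
# compared with a full matrix-matrix product.
G_upper = blas.dsyrk(alpha=1.0, a=X, lower=0)

# Mirror the computed upper triangle to recover the full symmetric matrix;
# only the upper triangle of G_upper is read here.
G = np.triu(G_upper) + np.triu(G_upper, k=1).T

# Sanity check against the plain NumPy product.
assert np.allclose(G, X @ X.T)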

1 Answer


Well, I found the solution with SciPy:

from scipy import linalg

%timeit np.dot(X,X.T)
%timeit np.matmul(X,X.T)
%timeit X@X.T
%timeit np.einsum("ij, jk -> ik",X,X.T)
%timeit linalg.blas.dgemm(alpha=1.0, a=X, b=X.T)

which gives

3.07 s ± 16.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2.02 s ± 37.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.99 s ± 9.79 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2 s ± 5.97 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
306 ms ± 6.85 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
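
As a side note on that call, a hedged variation: dgemm also takes a trans_b flag, so the transpose can be applied inside BLAS rather than by passing X.T from Python, and np.allclose can confirm the result matches np.dot. This is only a sketch with a stand-in array; any speed difference depends on the BLAS library SciPy is linked against.

import numpy as np
from scipy import linalg

X = np.random.rand(7403, 33)   # stand-in for the real X

# Same BLAS routine as in the answer, but let dgemm apply the transpose
# itself (trans_b=True) instead of building X.T on the Python side.
G = linalg.blas.dgemm(alpha=1.0, a=X, b=X, trans_b=True)

# The direct BLAS call should agree with np.dot up to floating-point error.
assert np.allclose(G, np.dot(X, X.T))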

3 Comments

Can you explain why this answer is faster?
@hpaulj I was hoping someone could explain this to me. I believe it has something to do with BLAS: en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms
I haven't dug into this. dot and the others try to use BLAS or other standard libraries (see np.show_config()). My guess is that the linalg.blas. ... call is faster simply because it's a more direct call; there's less overhead and checking. Read its docs.
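
That guess about overhead can at least be partially checked: np.show_config() reports which BLAS/LAPACK build NumPy is linked against, and scipy.linalg.blas.find_best_blas_type shows which precision-specific routine matches a given array. A small inspection sketch, with X again a stand-in array:

import numpy as np
from scipy.linalg import blas

# Report which BLAS/LAPACK implementation NumPy was built against
# (OpenBLAS, MKL, ...); np.dot and @ dispatch their heavy lifting to it.
np.show_config()

X = np.random.rand(7403, 33)

# find_best_blas_type maps an array's dtype to the matching routine prefix:
# 'd' here (dgemm/dsyrk for float64); a float32 array would map to 's'
# (sgemm), which is typically faster if reduced precision is acceptable.
print(blas.find_best_blas_type((X,)))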
