
I need to do some analysis on a large dataset using NumPy.

I have:

  1. One 50x1 matrix (eigenvalues)
  2. One 50x50 Matrix (eigenvectors)

I need to take each element of the eigenvalue matrix and multiply it by the corresponding column of the eigenvector matrix.

So, multiply i-th element of array one, by i-th column of array 2, and so on for all i's.
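For concreteness, here is the explicit loop version of that operation, with made-up array names (`eigvals`, `eigvecs`) standing in for the real data:

```python
import numpy as np

# Hypothetical data standing in for the real eigendecomposition
rng = np.random.default_rng(0)
eigvals = rng.normal(size=50)        # 50 eigenvalues
eigvecs = rng.normal(size=(50, 50))  # one eigenvector per column

# Scale the i-th column of eigvecs by the i-th eigenvalue
result = np.empty_like(eigvecs)
for i in range(50):
    result[:, i] = eigvals[i] * eigvecs[:, i]
```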

Any ideas? :/

  • Maybe use numpy's multiply? Commented Feb 26, 2016 at 7:53

2 Answers


First convert the 1-D eigenvalue vector into a diagonal matrix. Then apply matrix multiplication, postmultiplying the eigenvector matrix by the diagonal so that its columns are scaled.

import numpy as np
eigenval_diag = np.diag(eigenvalue_vec)  # 50x50 diagonal matrix
result = eigen_matrix @ eigenval_diag    # 50x50; scales the i-th column by the i-th eigenvalue

2 Comments

Just a note: this is a lot slower than using broadcasting (see my answer), since it both needs to create the diagonal matrix in memory and also performs a full matrix multiplication. It is a correct and very readable answer, though.
Note that `*` between NumPy arrays IS NOT the matrix multiplication operator; you need np.dot or the @ operator. The order also matters: premultiplying a matrix by a diagonal matrix scales its rows by the corresponding diagonal elements, whereas the OP asks for the columns to be scaled, which requires postmultiplying by the diagonal matrix.
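To illustrate the row-versus-column distinction with a small, hypothetical example:

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
M = np.ones((3, 3))

rows = np.diag(v) @ M  # premultiplying scales the ROWS of M by v
cols = M @ np.diag(v)  # postmultiplying scales the COLUMNS of M by v
```

Here `rows[i, :] == v[i] * M[i, :]`, while `cols[:, j] == v[j] * M[:, j]`; the column scaling the OP asks for is the second form.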

You can do this using the numpy broadcasting rules:

n = 4
A = np.random.randint(0, 10, size=(n,n))
B = np.array([1, 0, 2, 0])
B = B.reshape((1,n))
C = B * A

The multiplication is between a (1, n) array and an (n, n) array. To satisfy the broadcasting rules, B is "extended" to an (n, n) array before the multiplication, which is then performed element by element as usual.

The above multiplication is equivalent to

BB = np.array([[1, 0, 2, 0],
               [1, 0, 2, 0],
               [1, 0, 2, 0],
               [1, 0, 2, 0]])
C = BB * A

but you never have to construct the matrix BB in memory.
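If you want to inspect that "extended" array without paying for it, `np.broadcast_to` returns a read-only view with the broadcast shape, backed by the original data:

```python
import numpy as np

n = 4
B = np.array([1, 0, 2, 0]).reshape(1, n)
BB_view = np.broadcast_to(B, (n, n))  # a view, not a copy

# Every row aliases the same n elements: the stride along axis 0 is 0 bytes
print(BB_view.strides)
```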

Edit: Benchmarks

Since using a diagonal matrix might seem easier to read, I present the following quick benchmark you can try yourself.

# Set up data
n = 50
A = np.random.normal(size=(n, n))
B = np.random.normal(size=n)
B1 = B.reshape(1, n)

# Make sure both approaches give the same result
C = np.dot(A, np.diag(B))
C1 = B1 * A
print(np.allclose(C, C1))  # Should print 'True'

# Bench with IPython
>>> %timeit np.dot(A, np.diag(B))
The slowest run took 7.44 times longer than the fastest. This could mean that an intermediate result is being cached 
10000 loops, best of 3: 36.7 µs per loop

>>> %timeit B1 * A
The slowest run took 10.27 times longer than the fastest. This could mean that an intermediate result is being cached 
100000 loops, best of 3: 6.64 µs per loop

I.e. for a 50x50 matrix, broadcasting is roughly five to six times as fast as using np.diag and matrix multiplication.
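For a reproducible version that doesn't need IPython, the same comparison can be sketched with the standard-library `timeit` module (absolute timings will vary by machine):

```python
import timeit
import numpy as np

n = 50
A = np.random.normal(size=(n, n))
B = np.random.normal(size=n)
B1 = B.reshape(1, n)

# Sanity check: both approaches compute the same thing
assert np.allclose(np.dot(A, np.diag(B)), B1 * A)

# Time each approach over many repetitions
t_diag = timeit.timeit(lambda: np.dot(A, np.diag(B)), number=10_000)
t_bcast = timeit.timeit(lambda: B1 * A, number=10_000)
print(f"diag + dot : {t_diag:.4f} s")
print(f"broadcast  : {t_bcast:.4f} s")
```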

3 Comments

Sorry, I don't understand what's happening in your example. I'm not interested in the matrix multiplication between the two arrays, but in the product of each element in the 50x1 matrix with one column, and one column only, in the 50x50 array. Is there a way to call each element in the 50x1 array? A way to call each column in the second array? If so, I could then use a simple loop to sum all such products up?
I think the examples in the documentation I included in my answer are pretty good. Basically, it is a feature where numpy will extend your arrays to fit the expression, as long as the broadcasting rules hold. It does this without copying the actual array, which means it is quite fast.
I have extended my answer somewhat
