I need to compute M = np.log(1 + M) on a large 2D matrix that is 99.9% sparse.

How can I do this efficiently?

x, y = M.nonzero() retrieves the coordinates of the nonzero elements, but can I vectorize a log operation over these pairs?

NumPy doesn't seem to have sparse matrix support.

  • scipy has a sparse module, but I wouldn't use it just for this log. (Commented Apr 21, 2021 at 7:04)

1 Answer

This is simplest:

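import numpy as np
import scipy.sparse as sps

# Convert to CSR; only the nonzero entries are stored, in M.data
M = sps.csr_matrix(M)

# Apply log(1 + x) to the stored entries only.
# Implicit zeros stay zero because log(1 + 0) = 0.
M.data += 1
M.data = np.log(M.data)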

If the matrix is particularly large, you can also take the log in place, which avoids allocating the temporary array created above. This requires M.data to have a floating-point dtype:

M.data += 1
M.data = np.log(M.data, out=M.data)

Both of these options also work on dense arrays with minor changes. If your matrix is 99.9% sparse, though, I would switch to actual sparse data structures.
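As a rough end-to-end sketch of the sparse approach, the snippet below uses np.log1p (which computes log(1 + x) in one step) directly on the stored values. The small `dense` array is a made-up example, not the asker's data, and the in-place write assumes a floating-point dtype.

```python
import numpy as np
import scipy.sparse as sps

# Hypothetical small example; a real matrix would be far larger and sparser.
dense = np.zeros((4, 5))
dense[0, 1] = 2.0
dense[2, 3] = 0.5
dense[3, 0] = 7.0

M = sps.csr_matrix(dense)  # stores only the three nonzero values

# log1p(x) == log(1 + x). Writing into M.data avoids a temporary array.
# Assumption: M.data is floating point; an integer matrix would need
# M = M.astype(np.float64) first.
np.log1p(M.data, out=M.data)

# Dense reference, computed only to check the result.
expected = np.log(1 + dense)
print(np.allclose(M.toarray(), expected))
```

Only the stored entries are touched, so the cost scales with the number of nonzeros rather than with the full matrix size.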

You could also use the where argument on a dense array, but I doubt it would actually be any faster:

M = np.add(M, 1, out=M, where=M!=0)
M = np.log(M, out=M, where=M!=0)
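For comparison, here is a minimal sketch of the dense variant that computes the mask once and uses np.log1p, so the separate add step is not needed. The array values are invented for illustration, and the array is assumed to be floating point so the result can be written back in place.

```python
import numpy as np

# Hypothetical dense float array with mostly zero entries.
M = np.array([[0.0, 2.0, 0.0],
              [0.5, 0.0, 7.0]])
original = M.copy()  # kept only to verify the result

mask = M != 0  # compute the nonzero mask once

# Entries where mask is False are left untouched, so zeros stay zero.
np.log1p(M, out=M, where=mask)

print(np.allclose(M, np.log(1 + original)))
```

This still scans the full dense array and allocates a boolean mask of the same shape, so it is unlikely to beat the sparse version on a 99.9% sparse matrix.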
3 Comments

It might be a good idea to also show how to perform the log in place: M.data = np.log(M.data, out=M.data). Chunking isn't necessary here.
@max9111 Oh, good idea. I didn't actually know you could pass the same array as out.
Note that NumPy has log1p, which computes log(1 + x).