6

I'm trying to write a function that performs a mathematical operation on an array and returns the result. A simplified example could be:

def original_func(A):
    return A[1:] + A[:-1]

For speed-up and to avoid allocating a new output array for each function call, I would like to have the output array as an argument, and alter it in place:

def inplace_func(A, out):
    out[:] = A[1:] + A[:-1]

However, when calling these two functions in the following manner,

A = numpy.random.rand(1000,1000)
out = numpy.empty((999,1000))

C = original_func(A)

inplace_func(A, out)

the original function seems to be twice as fast as the in-place function. How can this be explained? Shouldn't the in-place function be quicker since it doesn't have to allocate memory?

3 Answers 3

12

If you want to perform the operation in-place, do

def inplace_func(A, out):
    np.add(A[1:], A[:-1], out)

This does not create any temporaries (which A[1:] + A[:-1]) does.

All Numpy binary operations have corresponding functions, check the list here: http://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs

Sign up to request clarification or add additional context in comments.

Comments

5

I think that the answer is the following:

In both cases, you compute A[1:] + A[:-1], and in both cases, you actually create an intermediate matrix.

What happens in the second case, though, is that you explicitly copy the whole big newly allocated array into a reserved memory. Copying such an array takes about the same time as the original operation, so you in fact double the time.

To sum-up, in the first case, you do:

compute A[1:] + A[:-1] (~10ms)

In the second case, you do

compute A[1:] + A[:-1] (~10ms)
copy the result into out (~10ms)

2 Comments

As for a solution: I think you'll have to do the loops yourself to avoid the intermediate arrays described in Olivier's answer. Or perhaps something like code.google.com/p/numexpr can help you? This question also looks relevant.
I think you can avoid the intermediate array by doing this: out[:]=A[1:]; out+=A[:-1] Of course your actual algo is probably going to be tougher to streamline. Of course try to avoid loops at all costs. There are often creative things you can do with accumulate and ufuncs..
-1

I agree with Olivers explanation. If you want to perform the operation inplace, you have to loop over your array manually. This will be much slower, but if you need speed you can resort to Cython which gives you the speed of a pure C implementation.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.