
In the snapshot below, I compare the speed of

  • modifying an existing array via slice assignment
  • just returning a new, modified array

It seems that the latter is faster. Why should this be the case?


EDIT: Updated with suggestions, and a version that uses numpy's vectorized add(), which is now the fastest.

[Screenshot: benchmark code and timing results]
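The screenshot itself is not reproduced here. The sketch below is only a guess at what the benchmark looked like, reusing the finline and freturn names from the answer below; the fadd wrapper, the value of N, and the timing harness are my own assumptions.

import numpy as np
import timeit

N = 1_000_000
x = np.random.rand(N)
y = np.zeros(N)

def finline(x, y):
    # modify an existing array in place via slice assignment
    y[:] = x + 1.0

def freturn(x):
    # just return a new, modified array
    return x + 1.0

def fadd(x, y):
    # numpy's vectorized add(), writing the result directly into y
    np.add(x, 1.0, out=y)

for name, call in [("finline", lambda: finline(x, y)),
                   ("freturn", lambda: freturn(x)),
                   ("np.add", lambda: fadd(x, y))]:
    ms = timeit.timeit(call, number=100) / 100 * 1e3
    print(f"{name}: {ms:.2f} ms per call")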

2 Answers


I don't know much about Python/NumPy internals, but here's what I assume is happening. Just looking at the code, I get the impression that finline does more work than freturn, since finline performs everything freturn does (the x + 1.0) and more.

Maybe this explains what's going on:

>>> x = np.random.rand(N)
>>> y = np.zeros(N)
>>> super(np.ndarray, y).__repr__()
'<numpy.ndarray object at 0x24c9c80>'
>>> finline(x, y)
>>> y     # see that y was modified
array([ 1.92772158,  1.47729293,  1.96549695, ...,  1.37821499,
        1.8672971 ,  1.17013856])
>>> super(np.ndarray, y).__repr__()
'<numpy.ndarray object at 0x24c9c80>'  # address of y did not change
>>> y = freturn(x)
>>> super(np.ndarray, y).__repr__()
'<numpy.ndarray object at 0x24c9fc0>'  # address of y changed

So essentially, I think finline does more work because x + 1.0 first produces a new temporary array, and the slice assignment then has to copy every element of that temporary into y's existing buffer. On the other hand, y = freturn(x) probably just rebinds the name y to the array produced by the x + 1.0 operation, with no extra copy.
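A minimal check of the same point, using id() instead of the default __repr__ (a small array here purely for readability; the exact addresses are not the point):

import numpy as np

x = np.random.rand(5)
y = np.zeros(5)

before = id(y)
y[:] = x + 1.0           # x + 1.0 builds a temporary array, then its data is copied into y
assert id(y) == before   # y is still the same object; only its buffer contents changed

tmp = x + 1.0
y = tmp                  # no copy: the name y is simply rebound to the new array
assert y is tmp and id(y) != before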


7 Comments

Ok, I think you're saying that the inline version is doing an extra copy, i.e. the RHS of the calculation produces an intermediate that gets assigned to the memory of inline-y as a separate step? Perhaps. I am looking at the output of dis.dis() now...
@cjrh, yes, there will be an intermediate in this case. To avoid the intermediate you can do this: y[:] = x; y += 1.0. Is it faster?
@cjrh: That's right. I assume that this is sort of like the difference between copy and move constructors in C++.
@RomanL: I added a version for your suggestion. It appears to be slower than the others. I also added a np.add() version, which is now the fastest.
@cjrh: I don't see a reason why it would be slower, and it is faster on my machine. Of course, np.add is the way to go.
  • x + 1: creates a new array.
  • y[:] = x + 1: creates a new temporary array and copies all of its data into y.
  • y = x + 1: creates a new array and binds the name y to this new array.
  • np.add(x, 1, out=y): does not create a new array; it writes directly into y and is the fastest.

Here is the code:

import numpy as np

x = np.zeros(1000000)
y = np.zeros_like(x)
%timeit x + 1
%timeit y[:] = x + 1
%timeit np.add(x, 1, out=y)

The output:

100 loops, best of 3: 4.2 ms per loop
100 loops, best of 3: 6.83 ms per loop
100 loops, best of 3: 2.5 ms per loop
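For completeness, here is a sketch of the two-step in-place variant suggested in the comments on the other answer (y[:] = x followed by y += 1.0). It avoids the temporary array but still makes two passes over the data, which is consistent with np.add(..., out=y) being the fastest; the timing harness below is my own, so run it yourself to compare on your machine.

import numpy as np
import timeit

x = np.zeros(1000000)
y = np.zeros_like(x)

def two_step(x, y):
    y[:] = x             # copy x into y's existing buffer
    y += 1.0             # add in place; no temporary array

def out_add(x, y):
    np.add(x, 1, out=y)  # single pass, writes directly into y

for name, f in [("y[:] = x; y += 1.0", two_step), ("np.add(x, 1, out=y)", out_add)]:
    ms = timeit.timeit(lambda: f(x, y), number=100) / 100 * 1e3
    print(f"{name}: {ms:.2f} ms per loop")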

1 Comment

Thanks! I got there myself eventually, but have some points anyway :)
