
I was under the impression that numpy would be faster for list operations, but the following example seems to indicate otherwise:

import numpy as np
import time

def ver1():
    a = [i for i in range(40)]
    b = [0 for i in range(40)]
    for i in range(1000000):
        for j in range(40):
            b[j]=a[j]

def ver2():
    a = np.array([i for i in range(40)])
    b = np.array([0 for i in range(40)])
    for i in range(1000000):
        for j in range(40):
            b[j]=a[j]

t0 = time.time()
ver1()
t1 = time.time()
ver2()
t2 = time.time()

print(t1-t0)
print(t2-t1)

Output is:

4.872278928756714
9.120521068572998

(I'm running 64-bit Python 3.4.3 on Windows 7, on an i7 920.)

I do understand that this isn't the fastest way to copy a list, but I'm trying to find out if I'm using numpy incorrectly. Or is it the case that numpy is slower for this kind of operation and is only more efficient in more complex operations?

EDIT:

I also tried the following, which just does a direct copy via b[:] = a, and numpy is still twice as slow:

import numpy as np
import time

def ver6():
    a = [0] * 40
    b = [0] * 40
    for i in range(1000000):
        b[:] = a

def ver7():
    a = np.array([0] * 40)
    b = np.array([0] * 40)
    for i in range(1000000):
        b[:] = a

t0 = time.time()
ver6()
t1 = time.time()
ver7()
t2 = time.time()

print(t1-t0)
print(t2-t1)

Output is:

0.36202096939086914
0.6750380992889404
Comments

  • NumPy newbie rule of thumb: if your code has the word for in it, you're not getting the benefits of NumPy there.
  • Constructing the numpy arrays takes some time. After they're constructed, however, further operations are much quicker than using a vanilla Python list. Since you are constructing two new numpy arrays in every loop iteration, it only makes sense that it would be much slower than using Python lists.
  • @pzp No, the numpy arrays are only created once.
  • @pzp Suspecting exactly that, I changed the code to construct the arrays outside of the functions, reduced it to a single function, and timed it without that factor. Still the same.
  • @roganjosh Are you sure? I just timed it myself without including the array construction and with user2357112's proper use of numpy, and numpy killed vanilla; it was not even close. Also, this is a terribly constructed test: the results are being cached.

2 Answers


You're using NumPy wrong. NumPy's efficiency relies on doing as much work as possible in C-level loops instead of interpreted code. When you do

for j in range(40):
    b[j]=a[j]

That's an interpreted loop, with all the usual interpreter overhead and then some: NumPy's indexing logic is far more complex than a list's, and NumPy has to create a new element wrapper object on every single element retrieval. You're not getting any of the benefits of NumPy when you write code like this.
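You can observe that wrapper-object creation directly (a quick sketch of mine, not part of the original answer): every scalar read from a NumPy array produces a brand-new object, while a list simply hands back the pointer it already stores.

import numpy as np

a = np.arange(40)
lst = list(range(40))

print(type(a[0]))        # <class 'numpy.int64'> (int32 on some platforms): a boxed C value
print(a[0] is a[0])      # False: each retrieval creates a fresh wrapper object
print(lst[0] is lst[0])  # True: the list returns the same stored PyObject both times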

You need to write the code in such a way that the work happens in C:

b[:] = a

This would also improve the efficiency of the list operation, but it's much more important for NumPy.
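To see where NumPy does win, compare an elementwise operation instead of a bare copy. A rough sketch of mine (absolute numbers will vary by machine):

import numpy as np
import timeit

a_list = list(range(1000))
b_list = list(range(1000))
a_arr = np.arange(1000)
b_arr = np.arange(1000)

# Elementwise addition: an interpreted pass for the lists,
# a single C-level loop for the arrays.
print(timeit.timeit(lambda: [x + y for x, y in zip(a_list, b_list)], number=10000))
print(timeit.timeit(lambda: a_arr + b_arr, number=10000))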


Comments

  • @L3viathan: That wouldn't help; in fact, it'd be outright wrong. Really, the arrays should be np.arange(40) and numpy.zeros([40]).
  • Hi, I tried it with b[:] = a; vanilla Python is still more than twice as fast as numpy.
  • @CaptainCodeman: That's from a combination of three factors: the inputs are fairly small, very little allocation is involved, and the Python list gets to push the work into C too. If you try it with larger arrays, or if you try a mathematical operation (say, elementwise addition), the NumPy array will be way faster.
  • @CaptainCodeman: Depends on how much math you're doing, and how well you take advantage of NumPy's features when you're doing that math. Even for arrays of this size, NumPy is way faster than Python built-in data types for math.
  • @roganjosh: It's worth noting that for actual math, NumPy starts winning at a much lower array length. For example, a+b with arrays beats [x+y for x, y in zip(a, b)] for lists at a length of about 10, and numpy.log(a) beats [math.log(x) for x in a] at a length of about 7.
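Those crossover lengths are easy to check yourself. Here is a rough sketch (mine, not the commenter's benchmark; the exact thresholds will vary by machine):

import math
import timeit
import numpy as np

for n in (5, 7, 10, 100):
    lst = list(range(1, n + 1))
    arr = np.arange(1, n + 1)
    # Compare the list comprehension against the vectorized ufunc at each length.
    t_list = timeit.timeit(lambda: [math.log(x) for x in lst], number=100000)
    t_arr = timeit.timeit(lambda: np.log(arr), number=100000)
    print(n, t_list, t_arr)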

Most of what you are seeing is the cost of creating Python objects from native C types.

A Python list is, at its heart, an array of PyObject pointers. When a and b are both Python lists, doing b[i] = a[i] means:

  • decreasing the reference count of the object pointed by b[i],
  • increasing the reference count of the object pointed by a[i], and
  • copying the address stored in a[i] into b[i].

But if a and b are NumPy arrays, things are a little more elaborate, and the same b[i] = a[i] then requires:

  • creating a Python integer object from the native C integer stored at a[i],
  • converting that Python integer object back into a native C integer and storing its value in b[i], and
  • decreasing the reference count of the temporary Python integer object.

So the difference is mostly the cost of creating and disposing of that intermediate Python object, a cost that lists never have to pay.
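The list half of that story is easy to observe with sys.getrefcount (a small sketch of mine, not from the answer): assigning into a list copies a pointer and bumps a reference count, with no new object created.

import sys

a = [object(), object()]
b = [None, None]

before = sys.getrefcount(a[0])  # getrefcount itself adds one temporary reference
b[0] = a[0]                     # copies the pointer: no new object is made
after = sys.getrefcount(a[0])

print(before, after)  # after == before + 1: b now also references the same object
print(b[0] is a[0])   # True: both lists point at the identical PyObject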

