Why NumPy arrays over standard library arrays?

Question

If I only need 1D arrays, what are the performance and size-in-memory benefits of using NumPy arrays over Python standard library arrays? Or are there any?

Let's say I have arrays of at least thousands of elements, and I want two things: fast direct access by index and the smallest possible memory footprint. Is there a performance benefit to using this:

from numpy import array
a = array([1,2,3,4,5])

over this:

from array import array
a = array('i', [1,2,3,4,5])

Standard Python lists would have fast access-by-index times, but any array implementation will have a much smaller memory footprint. What is a decent compromise solution?

@ReblochonMasque Sorry, I thought it was clear that I was just showing an example. I am interested in longer arrays. Question edited.
– john_science, Commented May 9, 2014 at 16:30
If you need to loop over your data structures, you might want to look into Cython.
– Akavall, Commented May 9, 2014 at 17:04

Glorfindel · Accepted Answer · 2023-01-12 21:07:20Z

4

NumPy is great for its fancy indexing, broadcasting, masking, flexible views on data in memory, many of its numerical methods, and more. If you just want a container to hold data, then use an array.array, or why not even a simple list?

I suggest taking a look at the NumPy tutorial.
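As a rough illustration of the "just a container" case, the sketch below compares approximate memory use of the three options for 10,000 small integers. The variable names are only illustrative, and exact byte counts depend on the platform and Python build, so treat the numbers as indicative rather than definitive.

```python
import sys
from array import array

import numpy as np

n = 10_000
values = list(range(n))

as_list = values
as_array = array('i', values)                 # C int, usually 4 bytes per item
as_numpy = np.array(values, dtype=np.int32)   # exactly 4 bytes per item

# A list holds pointers to separate int objects, so count both parts.
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(v) for v in as_list)

# Payload size only; each container adds a small fixed header on top.
array_bytes = as_array.itemsize * len(as_array)
numpy_bytes = as_numpy.nbytes

print("list       :", list_bytes)
print("array.array:", array_bytes)
print("numpy int32:", numpy_bytes)
```

Under these assumptions, array.array and a NumPy array with a matching dtype store the data about equally compactly, and both are much smaller than a list of Python ints.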

answered May 9, 2014 at 16:28 by Midnighter; edited Jan 12, 2023 at 21:07 by Glorfindel


4 Comments

john_science Over a year ago

Well, looping over a Python list is not particularly fast. And my current application is scientific computing, where I will be doing a lot of looping and direct access-by-index. I have updated my question to be more clear.

DSM Over a year ago

Looping over array.array and numpy.ndarray instances is also slow. The whole point of using numpy arrays is that you can perform numeric operations on the whole array at once, which is both syntactically cleaner and offers major performance benefits because the loop is pushed down to C level.

Midnighter Over a year ago

Give the tutorial a read. Basically, instead of looping over an array to perform an operation, NumPy lets you write a *= 2 to multiply each element by two in place, or a * a to multiply two vectors element-wise.
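A minimal sketch of the two whole-array operations mentioned in this comment, next to the explicit Python loop they replace. The names squares and looped are just for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# Multiply every element by two in place; the loop runs in C.
a *= 2

# Element-wise product of two vectors (here, a with itself).
squares = a * a

# The equivalent explicit Python-level loop, for comparison.
looped = [x * x for x in a.tolist()]

print(a)        # [ 2  4  6  8 10]
print(squares)  # [  4  16  36  64 100]
```

The results are the same either way; the difference is that the NumPy versions avoid per-element work in the Python interpreter.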

john_science Over a year ago

@DSM That is good to know, as that is not really my use case. Thanks.

mgilson · Accepted Answer · 2014-05-09 16:29:37Z

4

This depends entirely on what you're planning on doing with the array.

>>> from array import array
>>> a = array('i', [1,2,3,4,5])
>>> a + a
array('i', [1, 2, 3, 4, 5, 1, 2, 3, 4, 5])

Note that the standard library treats an array much more like a sequence, so + concatenates instead of adding element-wise. That might not be desirable (or maybe it is; only you can decide that).
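For contrast, here is a small sketch of the same + operator on both types, plus one way to get an element-wise sum from array.array if that is the behaviour you want. The variable names are illustrative only.

```python
from array import array

import numpy as np

std = array('i', [1, 2, 3, 4, 5])
nd = np.array([1, 2, 3, 4, 5])

# array.array behaves like a sequence: + concatenates.
concatenated = std + std

# numpy.ndarray behaves like a vector: + adds element-wise.
summed = nd + nd

# An element-wise sum with array.array needs an explicit Python loop.
manual = array('i', (x + y for x, y in zip(std, std)))

print(concatenated)  # array('i', [1, 2, 3, 4, 5, 1, 2, 3, 4, 5])
print(summed)        # [ 2  4  6  8 10]
print(manual)        # array('i', [2, 4, 6, 8, 10])
```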

answered May 9, 2014 at 16:29 by mgilson

2 Comments

john_science Over a year ago

I am mostly interested in performance speed of looping over reasonably large arrays. (For scientific computing purposes.)

Daniel Over a year ago

@theJollySin NumPy is mostly about calling C modules so that no loops are run in Python. As a generic statement: if you are looping over large arrays in Python and care even the slightest about run time, you are doing it wrong. If you need element-wise control, you should look into a lower-level language.
