If I only need 1D arrays, what are the performance and size-in-memory benefits of using NumPy arrays over Python standard library arrays? Or are there any?

Let's say I have arrays of at least thousands of elements, and I want fast direct access-by-index times and the smallest memory footprint possible. Is there a performance benefit to using this:

from numpy import array
a = array([1,2,3,4,5])

over this:

from array import array
a = array('i', [1,2,3,4,5])

Standard Python lists would have fast access-by-index times, but any array implementation will have a much smaller memory footprint. What is a decent compromise solution?

  • With a 5-element array, probably not. Commented May 9, 2014 at 16:23
  • @ReblochonMasque Sorry, I thought it was clear that I was just showing an example. I am interested in longer arrays. Question edited. Commented May 9, 2014 at 16:30
  • If you need to loop over your data structures, you might want to look into Cython. Commented May 9, 2014 at 17:04

2 Answers

NumPy is great for its fancy indexing, broadcasting, masking, flexible views on data in memory, many of its numerical methods, and more. If you just want a container to hold data, use an array.array, or even a simple list.

I suggest taking a look at the numpy tutorial.
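
For the memory-footprint part of the question, here is a rough sketch of how the three containers could be compared. The element count `n` and the variable names are made up for illustration, the list figure is only an estimate (it adds the size of every int object to the size of the list's pointer table), and the `'i'` typecode is usually, but not necessarily, 4 bytes wide on a given platform.

```python
import sys
from array import array

import numpy as np

n = 100_000  # illustrative size, not from the question
values = list(range(n))

as_array = array('i', values)                # typed C ints, stored contiguously
as_numpy = np.array(values, dtype=np.int32)  # fixed 4-byte ints, stored contiguously

# A list stores pointers to separate int objects, so estimate both parts.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
# Data buffers only; each container also has a small fixed object overhead.
array_bytes = as_array.itemsize * len(as_array)
numpy_bytes = as_numpy.nbytes

print(f"list        ~{list_bytes} bytes")
print(f"array.array  {array_bytes} bytes")
print(f"numpy int32  {numpy_bytes} bytes")
```

With the same element width, array.array and a NumPy array hold essentially the same amount of data, so the choice between them comes down to what you do with the data rather than how much memory it takes.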

4 Comments

Well, looping over a Python list is not particularly fast. And my current application is scientific computing, where I will be doing a lot of looping and direct access-by-index. I have updated my question to be more clear.
Looping over array.array and numpy.ndarray instances is also slow. The whole point of using numpy arrays is that you can perform numeric operations on the whole array at once, which is both syntactically cleaner and offers major performance benefits because the loop is pushed down to C level.
Give this tutorial a read. Basically, instead of looping over an array to perform an operation, NumPy allows you to do this: a *= 2 multiplies each element by two in place, and a * a multiplies two vectors element-wise.
@DSM That is good to know, although that is not really my use case. Thanks.
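
The whole-array operations mentioned in these comments can be sketched as follows. This is a minimal illustration with made-up variable names; it shows only that the loop and the vectorized form give the same result, and it makes no timing claims.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# Explicit Python loop: every element access goes through the interpreter.
doubled_loop = a.copy()
for i in range(len(doubled_loop)):
    doubled_loop[i] *= 2

# Vectorized form: the loop over elements runs inside NumPy's compiled code.
doubled_vec = a.copy()
doubled_vec *= 2   # in-place, element-wise

squares = a * a    # element-wise product, returns a new array

print(doubled_loop, doubled_vec, squares)
```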
4

This depends entirely on what you're planning on doing with the array.

>>> from array import array
>>> a = array('i', [1,2,3,4,5])
>>> a + a
array('i', [1, 2, 3, 4, 5, 1, 2, 3, 4, 5])

Note that the standard library treats an array much more like a sequence, which might not be desirable (or maybe it is; only you can decide that).
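
For contrast, here is a small sketch of the same `+` on both container types, plus one possible way to get element-wise addition from the standard library array. The variable names are illustrative only.

```python
from array import array

import numpy as np

a_std = array('i', [1, 2, 3, 4, 5])
a_np = np.array([1, 2, 3, 4, 5])

concatenated = a_std + a_std   # sequence semantics: concatenation, 10 elements
summed = a_np + a_np           # numeric semantics: element-wise sum, 5 elements

# Element-wise addition with array.array needs an explicit Python-level loop.
summed_std = array('i', (x + y for x, y in zip(a_std, a_std)))

print(concatenated)
print(summed)
print(summed_std)
```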

2 Comments

I am mostly interested in performance speed of looping over reasonably large arrays. (For scientific computing purposes.)
@theJollySin NumPy is mostly about calling C modules so that no loops are run in Python. As a generic statement: if you are looping over large arrays in Python and care even the slightest about run time, you are doing it wrong. If you need element-wise control, you should look into a lower-level language.
