I have a large numpy array already allocated of a given size. For example

my_array = numpy.empty(10000, numpy.float64)

The values for the array can be generated by (mock example)

k * val ** 2 for val in range(0, 10000)

This step of setting the values of the array is repeated many times, for example once for each k in range(0, 1000). I don't want to do any allocation other than the one done by numpy.empty() at the beginning.

I have considered,

my_array = numpy.array([k*val**2 for val in range(0,10000)])

but it looks like this involves at least the allocation of the list [k * val ** 2 for val in range(0, 10000)]. Is that right?

I also saw numpy.fromiter, but this seems to be for constructing a new array rather than filling an existing one.

my_array = numpy.fromiter((k*val**2 for val in range(0,10000)), numpy.float64, 10000)

Is it true that there is one further allocation here?


To see if numpy.fromiter allocates an array I tried the following

import numpy as np

iterable1 = (x*x for x in range(5))
iterable2 = (x*x + 1.0 for x in range(5))
my_array = np.fromiter(iterable1, np.float64)
print(my_array)
print(hex(id(my_array)))

my_array = np.fromiter(iterable2, np.float64)
print(my_array)
print(hex(id(my_array)))

In the output, the two addresses printed are different. Doesn't this mean that np.fromiter allocated a new array, which then got assigned to my_array?
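
A more direct check, assuming a numpy version that provides np.shares_memory, points the same way:

import numpy as np

my_array = np.empty(5, np.float64)
new_array = np.fromiter((x*x for x in range(5)), np.float64, 5)
# False: the two buffers are distinct, so fromiter allocated a new one
print(np.shares_memory(my_array, new_array))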

  • np.fromiter doesn't do any further allocation. That's the whole essence of that function. Also, you don't need to use np.empty if you want to change all of the items at once. Commented Nov 30, 2016 at 23:06
  • @Kasramvd Are you sure about that? I just don't know. The documentation of fromiter says that it creates an array. I assumed that it creates a numpy array, which then gets bound to my_array by the = operator. But if you know for a fact that no new allocation is done, I will believe you. Commented Nov 30, 2016 at 23:09
  • If you must support an arbitrary iterator and don't want any temporary allocation, it will be hard to avoid the simplest for ind, elem in enumerate(iterable): my_array[ind] = elem. Commented Nov 30, 2016 at 23:11
  • @onekeystrokeatatime Yes, that's exactly what this function does. It creates an array from an iterable and assigns it to the target variable. If you are looking for a way to get rid of this, check out my answer. Commented Nov 30, 2016 at 23:15
  • "I don't want to do any other allocation than the one done by the numpy.empty() at the beginning." - you're coming at this from a C++ perspective, where array allocation is expensive and must be avoided. This is Python. Array allocation is nothing compared to the expenses of JIT-less bytecode interpretation, dynamic dispatch, and individually-allocated 24-byte int objects. Commented Nov 30, 2016 at 23:32

4 Answers

3

Given the explanation in the comments, it seems that the problem is the following:

  • A large array needs to be updated frequently, and as efficiently as possible;
  • The source of the updates is not just other numpy arrays, but arbitrary Python objects (which can be generated on the fly).

The second item is the problem: as long as your values come from Python, putting them into a numpy array will never be really efficient. This is because you have to loop over each value in interpreted code.

I was expecting to find the expression for ind, elem in enumerate(iterable): my_array[ind] = elem already packaged in a built-in function. Do you know if the Python interpreter compiles that expression as a whole?

CPython's virtual machine is very different from the C++ model; specifically, the compiler cannot inline the expression or interpret it as a whole in a way that makes it significantly more efficient. Even if CPython had a byte-code instruction that did this one specific thing in C, it would still need to call the generator's next method, which produces each value as a heap-allocated Python object by executing Python byte-code. Either way, interpreted code is involved in every iteration, and that is exactly what you want to avoid.

The efficient way to approach your problem is to design it from the ground up to never leave numpy. As others explained in the comments, the cost of allocation (if done efficiently, by numpy) is minuscule compared to the cost of actually working with the data piece by piece in Python. I would design it as follows (a sketch applying this to the question's mock example follows the list):

  • Convert as much code to natively work with numpy arrays, from the ground up; make returning a numpy array part of your interface and don't worry about allocation costs. Do as many loops as possible within numpy itself, so they are done in native code. Never iterate through all values of large arrays in Python.
  • Where it is not possible to use numpy, use numpy.fromiter to convert the iterator to the numpy array as early as possible.
  • Use either my_array[:] = new_array[:] or my_array = new_array to introduce the new values into the array. (The former will take microscopically more time, but it makes more sense when my_array is shared in many places in the data model.)
  • Benchmark the operations you are interested in. Don't assume that "copying is slow": it might turn out that operations that would be "slow" in C++ run orders of magnitude faster than the Python rendition of operations that would be efficient in C++.
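
Applied to the mock example from the question, the design might look like this sketch, which uses np.multiply with out= as one way to reuse the pre-allocated buffer (the names are illustrative):

import numpy as np

vals_sq = np.arange(10000, dtype=np.float64) ** 2  # val ** 2, allocated once
my_array = np.empty(10000, dtype=np.float64)       # the buffer to reuse

for k in range(1000):
    # k * val ** 2 is written directly into my_array; no new array per iteration
    np.multiply(vals_sq, k, out=my_array)
    # ... consume my_array here ...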

If after doing the above some operation is not supported by numpy, and the measurements show it to be critically inefficient, you can use the Python/C API to create an extension module that performs the computation efficiently and returns the result as a numpy array created in C.



2

First make sure you understand what the variable assignment does:

 my_array = numpy.empty(10000, numpy.float64)
 my_array = numpy.fromiter(...)

The second assignment replaces the first: the object that my_array originally referenced loses its last reference and gets garbage collected. That's just basic Python variable handling. To hang on to the original array (a mutable object), you have to change its values:

my_array[:] = <new values>

But the process that generates <new values> will, more than likely, create a temporary buffer (or two or three). Those values are then copied to the target. Even x += 1 does a buffered calculation. There are few in-place numpy operations.
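
A short sketch of the in-place versus rebinding distinction (the names are illustrative):

import numpy as np

x = np.zeros(5)
buf = x                  # second reference to the same buffer
x[:] = [1, 2, 3, 4, 5]   # in-place fill: buf sees the new values
x += 1                   # augmented assignment also writes back into the same buffer
print(x is buf)          # True
x = x + 1                # plain assignment rebinds x to a freshly allocated result
print(x is buf)          # False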

Generally, trying to second-guess numpy's memory allocation doesn't work. Efficiency can only be measured by time tests, not by guessing what is happening under the covers.

Don't bother with 'pre-allocation' unless you need to fill the array iteratively:

In [284]: my_array = np.empty(10, int)
In [285]: for i in range(my_array.shape[0]):
     ...:     my_array[i] = 2*i+3
In [286]: my_array
Out[286]: array([ 3,  5,  7,  9, 11, 13, 15, 17, 19, 21])

This is a horrible way of creating an array compared to:

In [288]: np.arange(10)*2+3
Out[288]: array([ 3,  5,  7,  9, 11, 13, 15, 17, 19, 21])

The fromiter approach is prettier, but not faster:

In [290]: np.fromiter((i*2+3 for i in range(10)),int)
Out[290]: array([ 3,  5,  7,  9, 11, 13, 15, 17, 19, 21])

Some timings:

In [292]: timeit np.fromiter((i*2+3 for i in range(10000)),int)
100 loops, best of 3: 4.76 ms per loop
# giving a count drops the time to 4.28 ms

In [293]: timeit np.arange(10000)*2+3
The slowest run took 8.73 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 47.4 µs per loop

In [294]: %%timeit 
     ...: my_array=np.empty(10000,int)
     ...: for i in range(my_array.shape[0]):
     ...:     my_array[i] = 2*i+3
     ...:     
100 loops, best of 3: 4.72 ms per loop

In [303]: timeit np.array([i*2+3 for i in range(10000)],int)
100 loops, best of 3: 4.48 ms per loop

fromiter takes just as long as an explicit loop, while the pure numpy solution is orders of magnitude faster. Timewise there is little difference between np.array with a list comprehension and fromiter with the generator.

Creating the array from a pre-existing list takes about 1/3 the time.

In [311]: %%timeit alist=[i*2+3 for i in range(10000)]
     ...: x=np.array(alist, int)
     ...: 
1000 loops, best of 3: 1.63 ms per loop

Assigning a list to an existing empty array isn't faster.

In [315]: %%timeit alist=[i*2+3 for i in range(10000)]
     ...: arr = np.empty(10000,int)
     ...: arr[:] = alist
1000 loops, best of 3: 1.65 ms per loop
In [316]: %%timeit alist=[i*2+3 for i in range(10000)]; arr=np.empty(10000,int)
     ...: arr[:] = alist
1000 loops, best of 3: 1.63 ms per loop

There are some numpy functions that take an out parameter. You may save some time by reusing an array that way. np.cross is one function that takes advantage of this (the code is Python and readable).
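
For instance (an illustrative sketch, not one of the timed runs above):

base = np.arange(10)
out = np.empty(10, int)
np.multiply(base, 2, out=out)  # product written into the existing out array
np.add(out, 3, out=out)        # accumulates in place; out is now [3, 5, ..., 21]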

Another 'vectorized' way of creating values from a scalar function:

In [310]: %%timeit f=np.frompyfunc(lambda i: i*2+3,1,1)
     ...: f(range(10000))
     ...: 
100 loops, best of 3: 8.31 ms per loop


1

np.fromiter doesn't do any further allocation. It just creates an array from an iterable. That's the whole essence of that function. It also accepts a count argument which allows fromiter to pre-allocate the output array, instead of resizing it on demand.

Also, you don't need to use np.empty if you want to change all of the items at once.

After all, if you are constructing your new array by doing some mathematical operations on a range of numbers, you can simply do the operations on a numpy array as well.

Here is an example:

In [4]: a = np.arange(10)

In [6]: a**2 + 10
Out[6]: array([10, 11, 14, 19, 26, 35, 46, 59, 74, 91])

8 Comments

np.fromiter does a new allocation, it allocates the array it returns. The OP has made it clear that he wants to set the elements of the same array (allocated once at the start) many times, with contents each time coming from an iterator.
@user4815162342 By allocation I meant that it just creates the new array from the iterator, rather than caching the items in memory and then converting them to a numpy array. It gets an iterable and converts it to a numpy array.
I am not sure that what you are suggesting is accurate. I tried this code: import numpy as np; iterable1 = (x*x for x in range(5)); iterable2 = (x*x + 1.0 for x in range(5)); my_array = np.fromiter(iterable1, np.float64); print(my_array); print(hex(id(my_array))); my_array = np.fromiter(iterable2, np.float64); print(my_array); print(hex(id(my_array))), and it looks like the address of my_array changed.
Oops. Python in a comment loses the alignment.
@onekeystrokeatatime As I said, fromiter() directly creates a numpy array from an iterable; if you don't want this, you should do the operations directly on your array.
0

I would like to give another answer from the perspective of space efficiency, as it may be relevant to others looking up this question.

Python lists have a significant overhead, which can create problems when working with larger datasets and limited RAM. As a dummy example, a simple 10 x 10^6 element list of lists, [[0 for _ in range(10)] for _ in range(10 ** 6)], takes 200 MB of RAM, whereas its theoretical size, assuming single-precision floats, would be about 40 MB, and its size as a numpy float array is 76.5 MB.

Additionally, having both the Python list and the numpy array in memory at the same time, during conversion, can be a limiting factor.

The ideal solution for avoiding a large Python list in memory is fromiter(). However, as of numpy 1.19, fromiter does not allow the iterable's elements to be arrays, so you cannot build a 2D+ matrix with it. (This is relevant to users of TensorFlow 2.5, which at present is incompatible with newer versions of numpy.)

A viable manual solution is to create the final matrix separately and to overwrite its values from the iterator.
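
A sketch of that approach (the rows generator here is a hypothetical stand-in for the real data source):

import numpy as np

def rows():
    # hypothetical source that yields one row at a time
    for i in range(10 ** 6):
        yield [i * j for j in range(10)]

matrix = np.empty((10 ** 6, 10), dtype=np.float32)  # the final matrix, allocated once
for i, row in enumerate(rows()):
    matrix[i] = row  # overwrite in place; the full Python list never exists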

