2

Is there a more efficient way than using numpy.asarray() to generate an array from output in the form of a list?

This appears to copy everything in memory, which seems inefficient for very large arrays.

(Updated) Example:

import numpy as np
a1 = np.array([1,2,3,4,5,6,7,8,9,10]) # pretend this has thousands of elements
a2 = np.array([3,7,8])

results = np.asarray([np.amax(np.where(a1 > element)) for element in a2])
  • Your example doesn't seem to make much sense. Unless element is larger than all elements in a1, it is just the largest element of a1. In any case, the approach will scale very badly for large a1 for this kind of function, so what are you actually doing here? Also, np.frompyfunc doesn't really speed things up. Note that the copying here should be very unimportant compared to the actual work done; trying to optimize things without knowing where your time is spent is usually a bad idea... Commented Dec 14, 2012 at 13:03
  • This was just for demonstration. What I'm trying to do is create arrays of values meeting some condition against a set of values in another array. Your point about optimization is a good one. I was just curious to find a best practice in this instance I suppose. Thanks! Commented Dec 15, 2012 at 22:14

2 Answers

5

I usually use np.fromiter:

results = np.fromiter((np.amax(np.where(a1 > element)) for element in a2), dtype=int, count=len(a2))

You don't need to specify count, but doing so lets NumPy preallocate the output array instead of resizing it as items arrive. Here are some timings I did on https://www.pythonanywhere.com/try-ipython/:

In [8]: %timeit np.asarray([np.amax(np.where(a1 > element)) for element in a2])                                 
1000 loops, best of 3: 161 us per loop

In [10]: %timeit np.frompyfunc(lambda element: np.amax(np.where(a1 > element)),1,1)(a2,out=np.empty_like(a2))   
10000 loops, best of 3: 123 us per loop

In [13]: %timeit np.fromiter((np.amax(np.where(a1 > element)) for element in a2),dtype=int, count=len(a2))
10000 loops, best of 3: 111 us per loop
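
For anyone who wants to reproduce the comparison, here is a minimal self-contained sketch using a1 and a2 from the question. The helper name last_index_greater is hypothetical and not part of the original post. It only checks that the list-based and iterator-based constructions produce the same array; timings will vary with machine and NumPy version, so none are claimed here.

```python
import numpy as np

a1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
a2 = np.array([3, 7, 8])


def last_index_greater(element):
    # Hypothetical helper: largest index i such that a1[i] > element.
    # np.where returns indices, so this is an index, not a value of a1.
    return np.amax(np.where(a1 > element))


# Builds an intermediate Python list, then copies it into an array.
via_list = np.asarray([last_index_greater(e) for e in a2])

# Consumes a generator directly; count lets NumPy preallocate the result.
via_iter = np.fromiter(
    (last_index_greater(e) for e in a2), dtype=int, count=len(a2)
)
```

Both variants still call the Python-level helper once per element of a2, so the per-element work, not the final array construction, is likely to dominate for large a1.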

1

np.vectorize won't work the way you want, because it doesn't respect an out parameter. However, the lower-level np.frompyfunc will:

np.frompyfunc(lambda element: np.amax(np.where(a1 > element)),
              1, 1)(a2, out=np.empty_like(a2))
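
When no out array is passed, a ufunc built with np.frompyfunc returns an object-dtype array. The sketch below is a variant of the call above, not the exact method shown: it assumes an extra copy from an explicit astype conversion is acceptable, and the name last_index_greater is hypothetical.

```python
import numpy as np

a1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
a2 = np.array([3, 7, 8])

# Wrap a Python callable as a ufunc with 1 input and 1 output.
# last_index_greater is a hypothetical name used for illustration.
last_index_greater = np.frompyfunc(
    lambda element: np.amax(np.where(a1 > element)), 1, 1
)

# Without out=, the result has dtype object.
as_objects = last_index_greater(a2)

# Explicit conversion to an integer array (this makes one more copy).
as_ints = as_objects.astype(np.intp)
```

The conversion step trades one additional copy for a result with a concrete integer dtype.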

1 Comment

I hadn't heard of frompyfunc. This looks great. Thanks!
