I'm looking for the fastest way to select the elements of a numpy array that satisfy several criteria. As an example, say I want to select all elements that lie between 0.2 and 0.8 from an array. I normally do something like this:

import numpy as np

the_array = np.random.random(100000)
idx = (the_array > 0.2) * (the_array < 0.8)
selected_elements = the_array[idx]

However, this creates two additional boolean arrays with as many elements as the_array (one for the_array > 0.2 and one for the_array < 0.8). If the array is large, this can consume a lot of memory. Is there any way to get around this? All of the built-in numpy functions (such as logical_and) seem to do the same thing under the hood.

  • Sounds like you want the most memory-efficient way, not the fastest way. Those two are often not the same. Commented Mar 27, 2014 at 11:30
  • Each of those boolean masks is only 1/8 the size of the array, if it is of doubles as in your example, so it normally is not a problem. If memory, not speed, is what you are concerned about, you could sort the array in place and then find the first and last index with calls to searchsorted (see the sketch after these comments). Commented Mar 27, 2014 at 11:55
  • Well, I care about both speed and memory efficiency. To me it seems that the most obvious implementation in a compiled language such as C would be to simply loop through the array, test each element and save the ones that pass the test. This should be both faster and more memory-efficient than the example I posted above, which effectively has to loop through the array three times. What I'm looking for is some way to do that for a numpy array, but maybe that's just not possible. Commented Mar 27, 2014 at 12:35
  • If you want to do it in a single iteration, I think you would have to write your own Cython function for it. Should be fairly easy to do, though. Commented Mar 27, 2014 at 12:45
  • numexpr looks pretty close to what I want. I'll have a look at that, thanks! Commented Mar 27, 2014 at 14:40
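
For reference, here is a rough sketch of the sort-plus-searchsorted idea from the second comment. Note that it reorders the array in place and only works for a simple interval test, not for arbitrary combinations of criteria:

import numpy as np

the_array = np.random.random(100000)
the_array.sort()                                    # in-place, no extra copy
lo = np.searchsorted(the_array, 0.2, side='right')  # first index with value > 0.2
hi = np.searchsorted(the_array, 0.8, side='left')   # first index with value >= 0.8
selected_elements = the_array[lo:hi]                # a view, not a copy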

1 Answer

You could implement the selection as a custom C function. The most basic way to call it from Python is through ctypes.

select.c

/* copy the values of in that lie strictly between lower and upper
   into out, and return how many values were copied */
int select(float lower, float upper, const float* in, float* out, int n)
{
  int ii;
  int outcount = 0;
  float val;
  for (ii=0;ii<n;ii++)
    {
      val = in[ii];
      if ((val>lower) && (val<upper))
        {
          out[outcount] = val;
          outcount++;
        }
    }
  return outcount;
}

which is compiled as:

gcc -shared -fPIC select.c -o lib.so

And on the python side:

select.py

import ctypes as C
from numpy.ctypeslib import as_ctypes
import numpy as np

# open the library in python
lib = C.CDLL("./lib.so")

# explicitly tell ctypes the argument and return types of the function
pfloat = C.POINTER(C.c_float)
lib.select.argtypes = [C.c_float,C.c_float,pfloat,pfloat,C.c_int]
lib.select.restype = C.c_int

size = 1000000

# create numpy arrays
np_input  = np.random.random(size).astype(np.float32)
np_output = np.empty(size, dtype=np.float32)

# expose the array contents to ctypes
ctypes_input = as_ctypes(np_input)
ctypes_output = as_ctypes(np_output)

# call the function and get the number of selected points
outcount = lib.select(0.2,0.8,ctypes_input,ctypes_output,size)

# select those points 
selected = np_output[:outcount]
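
As a quick sanity check (a sketch that assumes the snippet above has already been run), the result can be compared against the plain numpy boolean-mask version; both preserve the original element order:

# compare against the pure-numpy selection
np_selected = np_input[(np_input > 0.2) & (np_input < 0.8)]
assert outcount == np_selected.size
assert np.array_equal(selected, np_selected)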

Don't expect wild speedups from such a vanilla implementation, but on the C side you have the option of adding OpenMP pragmas to get quick-and-dirty parallelism, which may give you significant boosts.

Also, as mentioned in the comments, numexpr may be a faster and neater way to do all of this in just a few lines.
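
A minimal sketch of the numexpr route (assuming the numexpr package is installed): it evaluates the whole expression blockwise in compiled code, so it avoids the full-size intermediate boolean arrays, although it still produces the final mask:

import numexpr as ne
import numpy as np

the_array = np.random.random(100000)
# the expression string is evaluated in cache-sized blocks, multi-threaded
idx = ne.evaluate('(the_array > 0.2) & (the_array < 0.8)')
selected_elements = the_array[idx]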
