
For example, I have an array of lists, l:

l = np.array([[1, 2], [3], [2, 4]], dtype=object)

I want to check which elements of the array contain 2.

A list comprehension does this easily, but not efficiently:

result = [2 in element for element in l]

Can I use NumPy to get the result more efficiently?

Thanks.

  • Could there be more than one occurrence of the search number 2 in any of the lists? Commented Jun 23, 2017 at 7:40
  • Short answer: No. Look at the type of l: array([[1, 2], [3], [2, 4]], dtype=object). Numpy has no way of operating on object arrays the same way it would floats and ints. Commented Jun 23, 2017 at 7:41
  • Also, that is a list comprehension, not a generator. Commented Jun 23, 2017 at 7:42
  • @liuyihe But the sample in the question is only looking for one number, 2. Could you post a more representative sample case? Commented Jun 23, 2017 at 8:16
  • @Divakar Sorry, I didn't express myself clearly. In fact, you have solved my problem, so there is no need to worry about my comment above. What I need to do next is just: for item in items: in_eachlist(l, item), where len(items) == 1999. Thanks again. Commented Jun 23, 2017 at 11:19

1 Answer


Here's one approach -

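def in_eachlist(l, search_num):
    # Flatten all sublists into one 1D array and compare against the search number
    mask = np.concatenate(l) == search_num
    lens = [len(i) for i in l]
    # Start offset of each sublist within the flattened array (must be integers)
    starts = np.concatenate(([0], np.cumsum(lens[:-1]))).astype(int)
    # OR-reduce the mask over each sublist's interval
    return np.logical_or.reduceat(mask, starts)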

Basically, we flatten the input array of lists into a 1D array and compare it against the search number, which gives us a mask. Then we check whether there is any True value within each list's interval using np.logical_or.reduceat, giving us the desired output. (Thanks to @Daniel F for the improvement here: I had used np.add.reduceat earlier and then looked for any sum > 0.)
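To make the intermediate steps visible, here is a minimal walk-through of the same idea. It is only a sketch: plain Python lists stand in for the object array from the question, and the variable names flat, starts and result are just illustrative.

```python
import numpy as np

# Plain Python lists stand in for the object array from the question
lists = [[1, 2], [3], [2, 4]]
search_num = 2

flat = np.concatenate(lists)       # all values in one 1D array
mask = flat == search_num          # True wherever the value matches
lens = [len(x) for x in lists]     # length of each sublist
# Offset at which each sublist starts inside `flat`
starts = np.concatenate(([0], np.cumsum(lens[:-1]))).astype(int)
# OR together the mask values inside each [start, next_start) interval
result = np.logical_or.reduceat(mask, starts)

print(flat)    # [1 2 3 2 4]
print(starts)  # [0 2 3]
print(result)  # [ True False  True]
```

Each entry of starts marks where one sublist begins, so reduceat reduces the slices flat[0:2], flat[2:3] and flat[3:] independently.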

Sample run -

In [41]: l
Out[41]: array([[1, 2], [3], [2, 4]], dtype=object)

In [42]: in_eachlist(l,2)
Out[42]: array([ True, False,  True], dtype=bool)

In [43]: in_eachlist(l,3)
Out[43]: array([False,  True, False], dtype=bool)

In [44]: in_eachlist(l,4)
Out[44]: array([False, False,  True], dtype=bool)
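One caveat worth noting: reduceat does not treat an empty interval as empty. When two consecutive indices are equal it returns the single element at that index, and a trailing empty list would produce an out-of-range index. If empty sublists can occur in your data, a variant along these lines might help. This is a sketch under that assumption, and in_eachlist_safe is a hypothetical name, not part of the approach above.

```python
import numpy as np

def in_eachlist_safe(lists, search_num):
    """Hypothetical variant that tolerates empty sublists."""
    lens = np.array([len(x) for x in lists])
    result = np.zeros(len(lists), dtype=bool)   # empty lists stay False
    nonempty = lens > 0
    if nonempty.any():
        # Only the non-empty sublists take part in the reduction
        flat = np.concatenate([x for x in lists if len(x)])
        mask = flat == search_num
        starts = np.concatenate(([0], np.cumsum(lens[nonempty])[:-1])).astype(int)
        result[nonempty] = np.logical_or.reduceat(mask, starts)
    return result

out = in_eachlist_safe([[1, 2], [], [2, 4], []], 2)
print(out)  # [ True False  True False]
```

The extra bookkeeping costs a little, so it is only worth using when empty sublists are actually possible.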

4 Comments

Why add.reduceat()>0 when you could do logical_or.reduceat()?
@DanielF Thanks, major improvement there! Edited.
Thanks! The operation time is reduced from 2.4s to 400ms.
@liuyihe, depending on the output you want after checking the 1999 values you mentioned, it is possible to make this even faster (thinking np.in1d, although @Divakar might have even fancier tricks in mind.)
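As a rough illustration of what checking many values at once could look like, here is a sketch of two possible outputs. It assumes no sublist is empty, and in_eachlist_many is a hypothetical helper name. It also uses np.isin, the newer counterpart of np.in1d, for the "any of these values" case. Note that the broadcasted mask has shape (number of items, total number of values), so memory use grows with both.

```python
import numpy as np

def in_eachlist_many(lists, items):
    """Hypothetical helper: row i tells which lists contain items[i]."""
    flat = np.concatenate(lists)
    lens = [len(x) for x in lists]
    starts = np.concatenate(([0], np.cumsum(lens[:-1]))).astype(int)
    # Broadcast to an (n_items, n_values) mask, then OR-reduce per list interval
    mask = np.asarray(items)[:, None] == flat[None, :]
    return np.logical_or.reduceat(mask, starts, axis=1)

lists = [[1, 2], [3], [2, 4]]
per_item = in_eachlist_many(lists, [2, 3, 4])
print(per_item)
# [[ True False  True]
#  [False  True False]
#  [False False  True]]

# Alternative output: does each list contain *any* of the given values?
flat = np.concatenate(lists)
starts = np.array([0, 2, 3])
any_item = np.logical_or.reduceat(np.isin(flat, [3, 4]), starts)
print(any_item)  # [False  True  True]
```

Which of the two is useful depends on whether you need a per-value answer or just a combined one.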
