
I have a list of lists of bools, say l = [[False, False], [True, False]], and I need to convert l to a numpy array of boolean arrays. I converted every sublist into a bool array, and then converted the whole list into a numpy array as well. My real list has 121 sublists, but np.any() returns just five results, not the 121 I expected. My code is this:

    >>> result = np.array([ np.array(extracted[aindices[i]:aindices[i + 1]]) for i in range(len(aux_regions)) ])
    >>> np.any(result)
    [false, false, false, false, false]

extracted[aindices[i]:aindices[i + 1]] is the sublist that I convert to a bool array. The list built by the comprehension is then converted to an array as well.

For the first example l, the expected result (assuming the list has been converted) is one value per subarray: [False, True].

What is the problem with using np.any? Or are the data types of the converted list not the right ones?

  • Why do you want an array of arrays instead of either a 2D array, or a list of arrays? An array of arrays tends to lead to confusion—anything you broadcast over it will generally treat each sub-array as just a Python object, not do anything numpily. Commented Aug 27, 2014 at 3:23
  • @abarnet Exactly. Which conversion should I use to get numpy.any() working as expected? Commented Aug 27, 2014 at 3:24

1 Answer


If you have a list of lists of bools, you could skip numpy and use a simple comprehension:

In [1]: l = [[False, False], [True, False]]

In [2]: [any(subl) for subl in l]
Out[2]: [False, True]

If the sublists are all the same length, you can pass the list directly to np.array to get a numpy array of bools:

In [3]: import numpy as np

In [4]: result = np.array(l)

In [5]: result
Out[5]: 
array([[False, False],
       [ True, False]], dtype=bool)

Then you can use the any method on axis 1 to get the result for each row:

In [6]: result.any(axis=1)   # or `np.any(result, axis=1)`
Out[6]: array([False,  True], dtype=bool)
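
As a hedged illustration of why the axis argument matters here, using only the two-row example from above: with no axis, np.any collapses the whole array into a single value; axis=0 reduces down each column; axis=1 reduces across each row, which is the per-sublist result the question asks for. The variable names below are illustrative only.

```python
import numpy as np

l = [[False, False], [True, False]]
result = np.array(l)                  # 2-D bool array, shape (2, 2)

any_overall = np.any(result)          # single value: is anything True at all?
any_per_column = result.any(axis=0)   # one value per column
any_per_row = result.any(axis=1)      # one value per row, i.e. per sublist
```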

If the sublists are not all the same length, then a numpy array might not be the best data structure for this problem.
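
A minimal sketch for the variable-length case, using made-up data (`ragged` is a hypothetical name, not something from the question): the plain comprehension handles sublists of any length, including empty ones, and only the final per-sublist results are turned into a numpy array.

```python
import numpy as np

# Hypothetical ragged input: sublists of different lengths.
ragged = [[False, True, False], [False], [], [True, True]]

# The built-in any() works on each sublist regardless of its length;
# any([]) is False.
per_sublist = [any(subl) for subl in ragged]

# The results themselves are uniform, so they fit in a 1-D bool array.
per_sublist_arr = np.array(per_sublist, dtype=bool)
```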


This part of my answer should be considered a "side bar" to what I wrote above. If the sublists have variable lengths, the list comprehension given above is my recommendation. The following is an alternative that uses an advanced numpy feature. I only suggest it because it looks like you already have the data structures needed to use numpy's reduceat function. It works without having to explicitly form the list of lists.

From reading your code, I infer the following:

  • extracted is a list of bools. You are splitting this up into sublists.
  • aindices is a list of integers. Each consecutive pair of integers in aindices specifies a range in extracted that is a sublist.
  • len(aux_regions) is the number of sublists; I'll call this n. The length of aindices is n+1, and the last value in aindices is the length of extracted.

For example, if the data looks like this:

In [74]: extracted
Out[74]: [False, True, False, False, False, False, True, True, True, True, False, False]

In [75]: aindices
Out[75]: [0, 3, 7, 10, 12]

it means there are four sublists:

In [76]: extracted[0:3]
Out[76]: [False, True, False]

In [77]: extracted[3:7]
Out[77]: [False, False, False, True]

In [78]: extracted[7:10]
Out[78]: [True, True, True]

In [79]: extracted[10:12]
Out[79]: [False, False]

With these data structures, you are set up to use the reduceat feature of numpy. The ufunc in this case is logical_or. You can compute the result with this one line:

In [80]: np.logical_or.reduceat(extracted, aindices[:-1])
Out[80]: array([ True,  True,  True, False], dtype=bool)
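
For reference, here is the same computation as a self-contained sketch that checks reduceat against the list comprehension. One caveat: reduceat does not treat an empty segment as empty. When two consecutive indices are equal, it returns the single element at that index instead of reducing over nothing, so this sketch assumes every sublist has at least one element.

```python
import numpy as np

extracted = [False, True, False, False, False, False,
             True, True, True, True, False, False]
aindices = [0, 3, 7, 10, 12]

# Each start index in aindices[:-1] opens a segment that runs up to the next
# start index; the last segment runs to the end of the array.
via_reduceat = np.logical_or.reduceat(np.array(extracted, dtype=bool),
                                      aindices[:-1])

# Reference result: slice out each sublist explicitly and apply any().
via_comprehension = [any(extracted[start:stop])
                     for start, stop in zip(aindices[:-1], aindices[1:])]
```

If empty sublists can occur in the real data, the comprehension is the safer of the two.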

3 Comments

If the lists of bools are not all the same length, does any fail? My lists have many different lengths.
In that case, a numpy array might not be the right data structure to use.
Basically I will have to compare execution times. For now, your very first approach is simple in both code and data structure management, which makes me very happy, haha. In the code I'm developing, your excellent answer brought execution time down from a minute to 16 seconds (manipulating thousands of data items).
