I have a numpy ndarray as shown below. I want to find the (row, col) positions where the fourth coordinate is not 1, i.e., where ndarray[row][col][-1] != 1.

>>> print(ndarray)
array([[[0, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 1],
        ...,
        [0, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 1]],

       [[0, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 1],
        ...,
        [0, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 1]],

       [[0, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 1],
        ...,
        [0, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 1]],

       ...,

       [[0, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 1],
        ...,
        [0, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 1]],

       [[0, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 1],
        ...,
        [0, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 1]],

       [[0, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 1],
        ...,
        [0, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 1]]], dtype=uint8)

Code I tried and which worked:

row_cols = []

for ir, row in enumerate(ndarray):
  for ic, col in enumerate(row):
    if col[-1] != 1:
      row_cols.append((ir,ic))

But this is an O(N^2) solution and very time-consuming, since the ndarray has shape (800, 1280, 4) and I have to do this for several thousand arrays.

Is there a better way to filter?

1 Answer

NumPy slicing and the np.where function will help you:

import numpy as np

np.random.seed(2020)
array = np.random.randint(0, 3, 36).reshape([4, 3, 3])

where array is:

array([[[0, 0, 2],
        [1, 0, 1],
        [0, 0, 0]],

       [[2, 1, 2],
        [2, 2, 1],
        [0, 0, 0]],

       [[0, 2, 0],
        [1, 1, 1],
        [2, 1, 2]],

       [[1, 1, 2],
        [2, 2, 2],
        [1, 0, 2]]])

Results of your code:

[(0, 0), (0, 2), (1, 0), (1, 2), (2, 0), (2, 2), (3, 0), (3, 1), (3, 2)]

Using slicing and np.where:

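# Take the last value along the final axis; the result has shape (rows, cols)
simple_array = array[..., -1]
# np.where on a 2-D boolean mask returns one index array per axis
ir, ic = np.where(simple_array != 1)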

The ir is:

array([0, 0, 1, 1, 2, 2, 3, 3, 3], dtype=int64)

The ic is:

array([0, 2, 0, 2, 0, 2, 0, 1, 2], dtype=int64)
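
If you need the same list of (row, col) tuples that your loop builds, rather than two separate index arrays, np.argwhere is one option. The sketch below hard-codes the small example array from above so it does not depend on the random seed; the names mask, pairs and row_cols are just illustrative.

```python
import numpy as np

# Small example array copied from the answer above.
array = np.array([[[0, 0, 2], [1, 0, 1], [0, 0, 0]],
                  [[2, 1, 2], [2, 2, 1], [0, 0, 0]],
                  [[0, 2, 0], [1, 1, 1], [2, 1, 2]],
                  [[1, 1, 2], [2, 2, 2], [1, 0, 2]]])

# Boolean mask of shape (4, 3): True where the last value is not 1.
mask = array[..., -1] != 1

# argwhere returns one (row, col) pair per True entry, in row-major order.
pairs = np.argwhere(mask)

# Convert to a list of plain Python tuples, like the loop version produces.
row_cols = [tuple(p) for p in pairs.tolist()]
print(row_cols)
```

For large arrays it is usually cheaper to keep pairs as an array and skip the conversion to tuples unless something downstream really needs a list.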

Performance:

import numpy as np
from time import time

array = np.random.randint(0, 4, 800 * 1280 * 4).reshape([800, 1280, 4])

# Nested Python loops, as in the question
start = time()
row_cols = []
for ir, row in enumerate(array):
    for ic, col in enumerate(row):
        if col[-1] != 1:
            row_cols.append((ir, ic))
print(time() - start)  # 0.6560261249542236

# Vectorized: slice out the last value of each pixel, then np.where
start = time()
simple_array = array[..., -1]
ir, ic = np.where(simple_array != 1)
print(time() - start)  # 0.02800583839416504
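
Since you mention several thousand arrays, the same idea can be applied to a whole batch at once, assuming the arrays all share one shape and a chunk of them fits in memory. This is only a sketch under those assumptions: the batch below is small random data standing in for your real arrays, and batch, hits and first are made-up names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch: 5 small arrays of shape (8, 10, 4) stand in for
# the real (800, 1280, 4) arrays. With real data, np.stack(list_of_arrays)
# would build the same kind of 4-D array.
batch = rng.integers(0, 4, size=(5, 8, 10, 4), dtype=np.uint8)

# One vectorized pass over the batch.
# Each row of hits is (array_index, row, col).
hits = np.argwhere(batch[..., -1] != 1)

# Positions belonging to the first array only, as (row, col) pairs.
first = hits[hits[:, 0] == 0][:, 1:]
print(hits.shape, first.shape)
```

If the full set does not fit in memory, processing it in chunks of a few hundred arrays keeps the same code while bounding memory use.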