47

I've got a strange situation.

I have a 2D Numpy array, x:

x = np.random.random_integers(0,5,(20,8))

And I have two indexers: one with indices for the rows and one with indices for the columns. To index x, I currently have to do the following:

row_indices = [4,2,18,16,7,19,4]
col_indices = [1,2]
x_rows = x[row_indices,:]
x_indexed = x_rows[:,col_indices]

Instead of just:

x_new = x[row_indices,col_indices]

(which fails with an IndexError: the index arrays with shapes (7,) and (2,) cannot be broadcast together)


I'd like to do the indexing in one line using broadcasting, since that would keep the code clean and readable. Also, I don't know much about Python under the hood, but as I understand it, doing it in one line should be faster (and I'll be working with pretty big arrays).


Test Case:

x = np.random.random_integers(0,5,(20,8))

row_indices = [4,2,18,16,7,19,4]
col_indices = [1,2]
x_rows = x[row_indices,:]
x_indexed = x_rows[:,col_indices]

x_doesnt_work = x[row_indices,col_indices]
8
  • Include a sample case? Commented Feb 24, 2016 at 16:37
  • 1
    Nitpick: np.random.randint(0, 6) is preferred to np.random.random_integers(0, 5). Commented Feb 24, 2016 at 16:39
  • And the expected output for that case? Commented Feb 24, 2016 at 16:41
  • What is your expected result? Are you trying to get all elements in columns 1, 2 of the selected rows? Commented Feb 24, 2016 at 16:41
  • 3
    try this: x_new = x[row_indices,:][:,col_indices] Commented Feb 24, 2016 at 16:45

5 Answers

60

Selections or assignments with np.ix_ using indexing or boolean arrays/masks

1. With indexing-arrays

A. Selection

We can use np.ix_ to get a tuple of indexing arrays that are broadcastable against each other, so that together they cover every combination of the given indices. Indexing the input array with that tuple therefore returns the corresponding higher-dimensional block. Hence, to make a selection based on two 1D indexing arrays, it would be -

x_indexed = x[np.ix_(row_indices,col_indices)]
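As a quick check, the sketch below compares the np.ix_ selection with the two-step indexing from the question. It assumes a deterministic 20x8 array built with np.arange (not the question's random data) so that the result is reproducible.

```python
import numpy as np

# Hypothetical stand-in for the question's array: deterministic values 0..159.
x = np.arange(20 * 8).reshape(20, 8)

row_indices = [4, 2, 18, 16, 7, 19, 4]
col_indices = [1, 2]

# Two-step indexing, as in the question.
x_two_step = x[row_indices, :][:, col_indices]

# One-step indexing with np.ix_.
x_indexed = x[np.ix_(row_indices, col_indices)]

print(x_indexed.shape)                        # (7, 2)
print(np.array_equal(x_indexed, x_two_step))  # True
```

Both forms return a new array (a copy), so for plain selection the choice is mostly about readability.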

B. Assignment

We can use the same notation to assign a scalar or a broadcastable array to those indexed positions. Hence, the following works for assignments -

x[np.ix_(row_indices,col_indices)] = value  # value: a scalar or a broadcastable array

2. With masks

We can also use boolean arrays/masks with np.ix_, in the same way as indexing arrays. Again, this works both for selecting a block from the input array and for assigning into it.

A. Selection

Thus, with row_mask and col_mask boolean arrays as the masks for row and column selections respectively, we can use the following for selections -

x[np.ix_(row_mask,col_mask)]

B. Assignment

And the following works for assignments -

x[np.ix_(row_mask,col_mask)] = value  # value: a scalar or a broadcastable array
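To see what happens with masks, the sketch below shows that np.ix_ turns each boolean array into the integer positions of its True entries before building the broadcastable index arrays. The array contents are an assumption (np.arange values) chosen only to make the result easy to verify.

```python
import numpy as np

# Hypothetical 7x8 input with non-negative, deterministic values.
x = np.arange(7 * 8).reshape(7, 8)

row_mask = np.array([0, 1, 1, 0, 0, 1, 0], dtype=bool)
col_mask = np.array([1, 0, 1, 0, 1, 1, 0, 0], dtype=bool)

# np.ix_ converts each mask to the positions of its True entries.
rows, cols = np.ix_(row_mask, col_mask)
print(rows.ravel())  # [1 2 5]
print(cols.ravel())  # [0 2 4 5]

# Selection with masks matches selection with the equivalent integer indices.
block = x[np.ix_(row_mask, col_mask)]
same_block = x[np.ix_(np.flatnonzero(row_mask), np.flatnonzero(col_mask))]
print(np.array_equal(block, same_block))  # True

# Assignment writes into x itself: 3 rows x 4 columns = 12 cells.
x[np.ix_(row_mask, col_mask)] = -1
print((x == -1).sum())  # 12
```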

Sample Runs

1. Using np.ix_ with indexing-arrays

Input array and indexing arrays -

In [221]: x
Out[221]: 
array([[17, 39, 88, 14, 73, 58, 17, 78],
       [88, 92, 46, 67, 44, 81, 17, 67],
       [31, 70, 47, 90, 52, 15, 24, 22],
       [19, 59, 98, 19, 52, 95, 88, 65],
       [85, 76, 56, 72, 43, 79, 53, 37],
       [74, 46, 95, 27, 81, 97, 93, 69],
       [49, 46, 12, 83, 15, 63, 20, 79]])

In [222]: row_indices
Out[222]: [4, 2, 5, 4, 1]

In [223]: col_indices
Out[223]: [1, 2]

Tuple of indexing arrays with np.ix_ -

In [224]: np.ix_(row_indices,col_indices) # Broadcasting of indices
Out[224]: 
(array([[4],
        [2],
        [5],
        [4],
        [1]]), array([[1, 2]]))

Make selections -

In [225]: x[np.ix_(row_indices,col_indices)]
Out[225]: 
array([[76, 56],
       [70, 47],
       [46, 95],
       [76, 56],
       [92, 46]])

As the OP suggested, this is in effect the same as plain broadcasting with a 2D version of row_indices: its elements are laid out along axis=0, which creates a singleton dimension at axis=1 and so allows broadcasting against col_indices. That gives an alternative solution -

In [227]: x[np.asarray(row_indices)[:,None],col_indices]
Out[227]: 
array([[76, 56],
       [70, 47],
       [46, 95],
       [76, 56],
       [92, 46]])
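The shapes explain why this works while the original attempt does not. In the sketch below (again using an assumed np.arange array rather than the question's random data), index arrays of shapes (7,) and (2,) cannot be broadcast together, whereas shapes (7, 1) and (2,) broadcast to (7, 2).

```python
import numpy as np

# Hypothetical deterministic stand-in for the question's 20x8 array.
x = np.arange(20 * 8).reshape(20, 8)
rows = np.asarray([4, 2, 18, 16, 7, 19, 4])
cols = np.asarray([1, 2])

# Shapes (7,) and (2,) cannot be broadcast, so this raises IndexError.
try:
    x[rows, cols]
    failed = False
except IndexError:
    failed = True
print(failed)  # True

# Adding a trailing singleton axis gives shape (7, 1), which broadcasts
# against (2,) to produce a (7, 2) result.
rows_2d = rows[:, None]
print(np.broadcast(rows_2d, cols).shape)  # (7, 2)

result = x[rows_2d, cols]
print(np.array_equal(result, x[np.ix_(rows, cols)]))  # True
```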

As discussed earlier, assignments use the same indexing expression on the left-hand side.

Row, col indexing arrays -

In [36]: row_indices = [1, 4]

In [37]: col_indices = [1, 3]

Make assignments with scalar -

In [38]: x[np.ix_(row_indices,col_indices)] = -1

In [39]: x
Out[39]: 
array([[17, 39, 88, 14, 73, 58, 17, 78],
       [88, -1, 46, -1, 44, 81, 17, 67],
       [31, 70, 47, 90, 52, 15, 24, 22],
       [19, 59, 98, 19, 52, 95, 88, 65],
       [85, -1, 56, -1, 43, 79, 53, 37],
       [74, 46, 95, 27, 81, 97, 93, 69],
       [49, 46, 12, 83, 15, 63, 20, 79]])

Make assignments with a 2D block (broadcastable array) -

In [40]: rand_arr = -np.arange(4).reshape(2,2)

In [41]: x[np.ix_(row_indices,col_indices)] = rand_arr

In [42]: x
Out[42]: 
array([[17, 39, 88, 14, 73, 58, 17, 78],
       [88,  0, 46, -1, 44, 81, 17, 67],
       [31, 70, 47, 90, 52, 15, 24, 22],
       [19, 59, 98, 19, 52, 95, 88, 65],
       [85, -2, 56, -3, 43, 79, 53, 37],
       [74, 46, 95, 27, 81, 97, 93, 69],
       [49, 46, 12, 83, 15, 63, 20, 79]])

2. Using np.ix_ with masks

Input array -

In [19]: x
Out[19]: 
array([[17, 39, 88, 14, 73, 58, 17, 78],
       [88, 92, 46, 67, 44, 81, 17, 67],
       [31, 70, 47, 90, 52, 15, 24, 22],
       [19, 59, 98, 19, 52, 95, 88, 65],
       [85, 76, 56, 72, 43, 79, 53, 37],
       [74, 46, 95, 27, 81, 97, 93, 69],
       [49, 46, 12, 83, 15, 63, 20, 79]])

Input row, col masks -

In [20]: row_mask = np.array([0,1,1,0,0,1,0],dtype=bool)

In [21]: col_mask = np.array([1,0,1,0,1,1,0,0],dtype=bool)

Make selections -

In [22]: x[np.ix_(row_mask,col_mask)]
Out[22]: 
array([[88, 46, 44, 81],
       [31, 47, 52, 15],
       [74, 95, 81, 97]])

Make assignments with scalar -

In [23]: x[np.ix_(row_mask,col_mask)] = -1

In [24]: x
Out[24]: 
array([[17, 39, 88, 14, 73, 58, 17, 78],
       [-1, 92, -1, 67, -1, -1, 17, 67],
       [-1, 70, -1, 90, -1, -1, 24, 22],
       [19, 59, 98, 19, 52, 95, 88, 65],
       [85, 76, 56, 72, 43, 79, 53, 37],
       [-1, 46, -1, 27, -1, -1, 93, 69],
       [49, 46, 12, 83, 15, 63, 20, 79]])

Make assignments with a 2D block (broadcastable array) -

In [25]: rand_arr = -np.arange(12).reshape(3,4)

In [26]: x[np.ix_(row_mask,col_mask)] = rand_arr

In [27]: x
Out[27]: 
array([[ 17,  39,  88,  14,  73,  58,  17,  78],
       [  0,  92,  -1,  67,  -2,  -3,  17,  67],
       [ -4,  70,  -5,  90,  -6,  -7,  24,  22],
       [ 19,  59,  98,  19,  52,  95,  88,  65],
       [ 85,  76,  56,  72,  43,  79,  53,  37],
       [ -8,  46,  -9,  27, -10, -11,  93,  69],
       [ 49,  46,  12,  83,  15,  63,  20,  79]])

1 Comment

Ah, so if I take the transpose of row_indices, it should be the same?
12

What about:

x[row_indices][:,col_indices]

For example,

x = np.random.random_integers(0,5,(5,5))
## array([[4, 3, 2, 5, 0],
##        [0, 3, 1, 4, 2],
##        [4, 2, 0, 0, 3],
##        [4, 5, 5, 5, 0],
##        [1, 1, 5, 0, 2]])

row_indices = [4,2]
col_indices = [1,2]
x[row_indices][:,col_indices]
## array([[1, 5],
##        [2, 0]])

3 Comments

This is fine for fetch, but fails for assignment. The row index makes a copy.
@hpaulj both make a copy because of advanced indexing, right?
@wedran, with one layer of indexing, x[[1,2,3]] = 2, the distinction between view and copy doesn't matter (it is a single x.__setitem__(..., 2) call). It's when you use sequential indexing that you have to pay attention to that issue. Then the __setitem__ modifies the result of the first __getitem__.
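A minimal sketch of the point made in these comments, assuming a small all-zeros array so that any write is easy to spot: the chained form assigns into a temporary copy, while a single indexing expression writes into x.

```python
import numpy as np

# Hypothetical 5x5 array of zeros; any successful write shows up in the sum.
x = np.zeros((5, 5), dtype=int)
row_indices = [4, 2]
col_indices = [1, 2]

# x[row_indices] is a copy (advanced indexing), so this assignment
# modifies the temporary copy and leaves x unchanged.
x[row_indices][:, col_indices] = 1
sum_after_chained = int(x.sum())
print(sum_after_chained)  # 0

# A single indexing expression calls x.__setitem__ directly and updates x.
x[np.ix_(row_indices, col_indices)] = 1
sum_after_ix = int(x.sum())
print(sum_after_ix)  # 4
```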
8
import numpy as np
x = np.random.random_integers(0,5,(4,4))
x
array([[5, 3, 3, 2],
       [4, 3, 0, 0],
       [1, 4, 5, 3],
       [0, 4, 3, 4]])

# This indexes the elements 1,1 and 2,2 and 3,3
indexes = (np.array([1,2,3]),np.array([1,2,3]))
x[indexes]
# returns array([3, 5, 4])

Notice that NumPy applies very different rules depending on the kind of indexes you use. To index several elements at once, pass a tuple of np.ndarray (see the indexing manual).

So you need only to convert your list to np.ndarray and it should work as expected.
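Note that a tuple of equal-length index arrays pairs the indices element by element; it does not select a block. The sketch below, on an assumed 4x4 np.arange array, contrasts the paired selection with the block that np.ix_ returns.

```python
import numpy as np

# Hypothetical deterministic 4x4 array: x[i, j] == 4 * i + j.
x = np.arange(16).reshape(4, 4)

rows = np.array([1, 2, 3])
cols = np.array([1, 2, 3])

# Paired selection: elements (1,1), (2,2) and (3,3).
pairs = x[(rows, cols)]
print(pairs)  # [ 5 10 15]

# Block selection: every combination of the given rows and columns.
block = x[np.ix_(rows, cols)]
print(block.shape)  # (3, 3)

# The paired result is the diagonal of the block.
print(np.array_equal(pairs, np.diag(block)))  # True
```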

2 Comments

when elements of "indexes" tuple are of different size, it doesn't seem to work
@ShihabShahriar What would different sizes indicate? That you have partial indices? It seems like you have a different question, if that's the case please ask a new question.
5

I think you are trying to do one of the following (equivalent) operations:

x_does_work = x[row_indices,:][:,col_indices]
x_does_work = x[:,col_indices][row_indices,:]

This will actually create a subset of x with only the selected rows, then select the columns from that, or vice versa in the second case. The first case can be thought of as

x_does_work = (x[row_indices,:])[:,col_indices]

1 Comment

I like how explicit this is.
4

Your first try would work if you write it with np.newaxis, after converting row_indices to an array:

x_new = x[np.asarray(row_indices)[:, np.newaxis], col_indices]

1 Comment

I think you might need to convert row_indices and col_indices to np.array.
