Slicing n-dimensional numpy array using list of indices

Question

Say I have a 3 dimensional numpy array:

np.random.seed(1145)
A = np.random.random((5,5,5))

and I have two lists of indices corresponding to the 2nd and 3rd dimensions:

second = [1,2]
third = [3,4]

and I want to select the elements in the numpy array corresponding to

A[:][second][third]

so the shape of the sliced array would be (5,2,2) and

A[:][second][third].flatten()

would be equivalent to:

for i in range(5):
    for j in second:
        for k in third:
            print(A[i][j][k])

0.556091074129
0.622016249651
0.622530505868
0.914954716368
0.729005532319
0.253214472335
0.892869371179
0.98279375528
0.814240066639
0.986060321906
0.829987410941
0.776715489939
0.404772469431
0.204696635072
0.190891168574
0.869554447412
0.364076117846
0.04760811817
0.440210532601
0.981601369658

Is there a way to slice a numpy array in this way? So far, when I try A[:][second][third], I get "IndexError: index 3 is out of bounds for axis 0 with size 2", because the [:] for the first dimension seems to be ignored.

Warren Weckesser · Accepted Answer · 2014-08-09 22:28:43Z

9

Numpy uses multiple indexing, so instead of A[1][2][3], you can--and should--use A[1,2,3].
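A minimal sketch of why the chained form fails, assuming a small integer array in place of the random one so the values are easy to follow: A[:] returns a view of the whole array, so every following [...] indexes axis 0 again.

```python
import numpy as np

a = np.arange(125).reshape(5, 5, 5)
second = [1, 2]

# a[:] is a view of the whole array, so the shape is unchanged.
whole = a[:]
print(whole.shape)    # (5, 5, 5)

# The next [second] therefore indexes axis 0, not axis 1.
step = a[:][second]
print(step.shape)     # (2, 5, 5)

# For scalar indices the chained and tuple forms return the same element,
# but the tuple form does it in a single indexing operation.
print(a[1][2][3] == a[1, 2, 3])   # True
```

A further [third] on the (2, 5, 5) result then asks for index 3 along an axis of size 2, which is the IndexError reported in the question.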

You might then think you could do A[:, second, third], but the numpy indices are broadcast, and broadcasting second and third (two one-dimensional sequences) ends up being the numpy equivalent of zip, so the result has shape (5, 2).
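The zip-like pairing can be checked with a short sketch (the names paired and expected are only illustrative):

```python
import numpy as np

a = np.arange(125).reshape(5, 5, 5)
second = [1, 2]
third = [3, 4]

paired = a[:, second, third]
print(paired.shape)   # (5, 2)

# Each column corresponds to one (j, k) pair from zip(second, third).
expected = np.stack([a[:, j, k] for j, k in zip(second, third)], axis=1)
print(np.array_equal(paired, expected))   # True
```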

What you really want is to index with, in effect, the outer product of second and third. You can do this with broadcasting by making one of them, say second, into a two-dimensional array with shape (2,1). Then the shape that results from broadcasting second and third together is (2,2).

For example:

In [8]: import numpy as np

In [9]: a = np.arange(125).reshape(5,5,5)

In [10]: second = [1,2]

In [11]: third = [3,4]

In [12]: s = a[:, np.array(second).reshape(-1,1), third]

In [13]: s.shape
Out[13]: (5, 2, 2)
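As a sanity check, the broadcast shape and the selected elements can be verified against the nested loops from the question. This is only a sketch; rows, cols and manual are illustrative names, and [:, None] is used as an equivalent of reshape(-1,1).

```python
import numpy as np

a = np.arange(125).reshape(5, 5, 5)
second = [1, 2]
third = [3, 4]

rows = np.array(second)[:, None]   # shape (2, 1), same as reshape(-1, 1)
cols = np.array(third)             # shape (2,)

# Broadcasting (2, 1) against (2,) gives the (2, 2) grid of index pairs.
print(np.broadcast(rows, cols).shape)   # (2, 2)

s = a[:, rows, cols]
manual = [a[i, j, k] for i in range(5) for j in second for k in third]
print(s.shape)                              # (5, 2, 2)
print(np.array_equal(s.ravel(), manual))    # True
```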

Note that, in this specific example, the values in second and third are sequential. If that is typical, you can simply use slices:

In [14]: s2 = a[:, 1:3, 3:5]

In [15]: s2.shape
Out[15]: (5, 2, 2)

In [16]: np.all(s == s2)
Out[16]: True

There are a couple of very important differences between those two methods.

The first method would also work with indices that are not equivalent to slices. For example, it would work if second = [0, 2, 3]. (Sometimes you'll see this style of indexing referred to as "fancy indexing".)
In the first method (using broadcasting and "fancy indexing"), the data is a copy of the original array. In the second method (using only slices), the array s2 is a view into the same block of memory used by a. An in-place change in one will change them both.
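Both differences can be observed directly. The sketch below assumes np.shares_memory is available (it is in reasonably recent NumPy releases); the names fancy, sliced and picked are illustrative.

```python
import numpy as np

a = np.arange(125).reshape(5, 5, 5)

fancy = a[:, np.array([1, 2]).reshape(-1, 1), [3, 4]]   # copy
sliced = a[:, 1:3, 3:5]                                  # view

print(np.shares_memory(a, fancy))    # False
print(np.shares_memory(a, sliced))   # True

# Writing through the view changes the original; writing to the copy does not.
sliced[0, 0, 0] = -1
fancy[0, 0, 1] = -2
print(a[0, 1, 3])   # -1
print(a[0, 1, 4])   # 9

# Non-contiguous indices only work with the fancy-indexing form.
picked = a[:, np.array([0, 2, 3]).reshape(-1, 1), [3, 4]]
print(picked.shape)   # (5, 3, 2)
```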

edited Aug 9, 2014 at 22:28

answered Aug 9, 2014 at 21:45


2 Comments

Attack68 Over a year ago

Doing s2 = a[:, [1,2], [3,4]] doesn't work as you state, but instead of doing the outer product (which might become more complicated if you have more dimensions to think through) is there a reason why not to use a sequence such as: s2 = a[:, [1,2], :], s2 = s2[:, :, [3,4]]? (I know this is old thread)
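For what it's worth, the two-step selection asked about here does appear to give the same result as the outer-product form. The sketch below compares it against np.ix_; note that each fancy-indexing step makes a copy, so the two-step form allocates an intermediate array.

```python
import numpy as np

a = np.arange(125).reshape(5, 5, 5)
second = [1, 2]
third = [3, 4]

# Select along axis 1 first, then along axis 2 of the intermediate result.
two_step = a[:, second, :][:, :, third]
one_step = a[np.ix_(range(a.shape[0]), second, third)]

print(two_step.shape)                       # (5, 2, 2)
print(np.array_equal(two_step, one_step))   # True
```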

Darren Christopher Over a year ago

Hi, sorry for bringing up an old thread. Just wondering whether, in addition to code style, there is any performance difference between A[1][2][3] and A[1, 2, 3]? Thanks.

DSM · Accepted Answer · 2014-08-09 21:25:51Z

5

One way would be to use np.ix_:

>>> out = A[np.ix_(range(A.shape[0]),second, third)]
>>> out.shape
(5, 2, 2)
>>> manual = [A[i,j,k] for i in range(5) for j in second for k in third]
>>> (out.ravel() == manual).all()
True

Downside is that you have to specify the missing coordinate ranges explicitly, but you could wrap that into a function.
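One possible shape for such a wrapper is sketched below. The function name take_outer and its dict-based signature are hypothetical, not part of NumPy; any axis not mentioned is filled in with its full range before calling np.ix_.

```python
import numpy as np

def take_outer(arr, indices_by_axis):
    """Select the outer product of index lists; unspecified axes are kept whole.

    `indices_by_axis` maps an axis number to a sequence of indices
    (hypothetical helper, for illustration only).
    """
    full = [
        indices_by_axis.get(axis, range(size))
        for axis, size in enumerate(arr.shape)
    ]
    return arr[np.ix_(*full)]

a = np.arange(125).reshape(5, 5, 5)
out = take_outer(a, {1: [1, 2], 2: [3, 4]})
print(out.shape)   # (5, 2, 2)
```

Using a dict keyed by axis keeps the call readable when only a few of many dimensions are restricted.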

answered Aug 9, 2014 at 21:25


1 Comment

Clemson Over a year ago

This is the answer I've been looking for! Thank you. Definitely think this should be accepted answer as it is the most generally applicable.

tobias_k · Accepted Answer · 2014-08-09 21:41:55Z

2

I think there are three problems with your approach:

Both second and third should be slices
Since the 'to' index is exclusive, they should go from 1 to 3 and from 3 to 5
Instead of A[:][second][third], you should use A[:,second,third]

Try this:

>>> np.random.seed(1145)
>>> A = np.random.random((5,5,5))
>>> second = slice(1,3)
>>> third = slice(3,5)
>>> A[:,second,third].shape
(5, 2, 2)
>>> A[:,second,third].flatten()
array([ 0.43285482,  0.80820122,  0.64878266,  0.62689481,  0.01298507,
        0.42112921,  0.23104051,  0.34601169,  0.24838564,  0.66162209,
        0.96115751,  0.07338851,  0.33109539,  0.55168356,  0.33925748,
        0.2353348 ,  0.91254398,  0.44692211,  0.60975602,  0.64610556])

answered Aug 9, 2014 at 21:41


1 Comment

tobias_k Over a year ago

On closer inspection, I think I misunderstood the question: second and third are not supposed to be ranges, but you want exactly those indices -- a bit misleading, using consecutive indices, though. I'll still leave this as an answer here, for completeness.
