
Say I have a 3 dimensional numpy array:

np.random.seed(1145)
A = np.random.random((5,5,5))

and I have two lists of indices corresponding to the 2nd and 3rd dimensions:

second = [1,2]
third = [3,4]

and I want to select the elements in the numpy array corresponding to

A[:][second][third]

so the shape of the sliced array would be (5,2,2) and

A[:][second][third].flatten()

would be equivalent to:

In [226]:

for i in range(5):
    for j in second:
        for k in third:
            print(A[i][j][k])

0.556091074129
0.622016249651
0.622530505868
0.914954716368
0.729005532319
0.253214472335
0.892869371179
0.98279375528
0.814240066639
0.986060321906
0.829987410941
0.776715489939
0.404772469431
0.204696635072
0.190891168574
0.869554447412
0.364076117846
0.04760811817
0.440210532601
0.981601369658

Is there a way to slice a numpy array in this way? So far, when I try A[:][second][third], I get IndexError: index 3 is out of bounds for axis 0 with size 2, because the [:] for the first dimension seems to be ignored.


3 Answers


NumPy supports multidimensional indexing, so instead of A[1][2][3] you can, and should, use A[1,2,3].
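To see why the chained form from the question fails, here is a minimal sketch. It assumes an integer array a built with np.arange rather than the random A from the question. It shows that a[:] selects the whole array, so every further bracket indexes axis 0 again:

```python
import numpy as np

a = np.arange(125).reshape(5, 5, 5)
second = [1, 2]
third = [3, 4]

# a[:] selects everything, so it has the same shape as a.
print(a[:].shape)                        # (5, 5, 5)

# The next bracket therefore indexes axis 0 again, not axis 1.
step = a[:][second]
print(step.shape)                        # (2, 5, 5)
print(np.array_equal(step, a[second]))   # True

# A third bracket indexes axis 0 of the (2, 5, 5) result, hence the error.
try:
    step[third]
except IndexError as exc:
    print("IndexError:", exc)

# A single bracket with commas addresses each axis in turn.
print(a[1, 2, 3] == a[1][2][3])          # True
```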

You might then think you could do A[:, second, third], but the index arrays are broadcast against each other, and broadcasting second and third (two one-dimensional sequences of the same length) pairs them up elementwise, the numpy equivalent of zip, so the result has shape (5, 2).
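The zip-like pairing can be checked directly. This is a small sketch, again assuming an integer array a built with np.arange; the names paired and expected are only illustrative:

```python
import numpy as np

a = np.arange(125).reshape(5, 5, 5)
second = [1, 2]
third = [3, 4]

# Both index lists are broadcast together, giving one index pair per position.
paired = a[:, second, third]
print(paired.shape)                      # (5, 2)

# Column n of the result pairs second[n] with third[n], like zip().
expected = np.stack([a[:, j, k] for j, k in zip(second, third)], axis=1)
print(np.array_equal(paired, expected))  # True
```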

What you really want is to index with, in effect, the outer product of second and third. You can do this with broadcasting by making one of them, say second, into a two-dimensional array with shape (2,1). Then the shape that results from broadcasting second and third together is (2,2).

For example:

In [8]: import numpy as np

In [9]: a = np.arange(125).reshape(5,5,5)

In [10]: second = [1,2]

In [11]: third = [3,4]

In [12]: s = a[:, np.array(second).reshape(-1,1), third]

In [13]: s.shape
Out[13]: (5, 2, 2)

Note that, in this specific example, the values in second and third are consecutive. If that is typical, you can simply use slices:

In [14]: s2 = a[:, 1:3, 3:5]

In [15]: s2.shape
Out[15]: (5, 2, 2)

In [16]: np.all(s == s2)
Out[16]: True

There are a couple of very important differences between those two methods.

  • The first method would also work with indices that are not equivalent to slices. For example, it would work if second = [0, 2, 3]. (Sometimes you'll see this style of indexing referred to as "fancy indexing".)
  • In the first method (using broadcasting and "fancy indexing"), the result is a copy of the selected data. In the second method (using only slices), the array s2 is a view into the same block of memory used by a. An in-place change in one will change them both.
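
The copy-versus-view difference can be verified with np.shares_memory. This is a minimal sketch, assuming the same integer array a as in the example above:

```python
import numpy as np

a = np.arange(125).reshape(5, 5, 5)
second = [1, 2]
third = [3, 4]

s = a[:, np.array(second).reshape(-1, 1), third]   # fancy indexing: a copy
s2 = a[:, 1:3, 3:5]                                # slices only: a view

print(np.shares_memory(a, s))    # False
print(np.shares_memory(a, s2))   # True

# Writing through the view changes a; writing to the copy does not.
s2[0, 0, 0] = -1    # same memory as a[0, 1, 3]
s[0, 0, 1] = -2     # separate memory; a[0, 1, 4] is untouched
print(a[0, 1, 3])   # -1
print(a[0, 1, 4])   # 9
```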

2 Comments

Doing s2 = a[:, [1,2], [3,4]] doesn't work, as you state. But instead of building the outer product (which might become more complicated if you have more dimensions to think through), is there a reason not to index in two steps: s2 = a[:, [1,2], :], then s2 = s2[:, :, [3,4]]? (I know this is an old thread.)
Hi, sorry for bringing up an old thread. Just wondering: apart from code style, is there any performance difference between A[1][2][3] and A[1, 2, 3]? Thanks.

One way would be to use np.ix_:

>>> out = A[np.ix_(range(A.shape[0]),second, third)]
>>> out.shape
(5, 2, 2)
>>> manual = [A[i,j,k] for i in range(5) for j in second for k in third]
>>> (out.ravel() == manual).all()
True

The downside is that you have to specify the missing coordinate ranges explicitly, but you could wrap that into a function.
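
One possible wrapper is sketched below. The name select_outer and the convention of passing None to keep a whole axis are assumptions made for illustration, not part of NumPy:

```python
import numpy as np

def select_outer(arr, *indices):
    """Hypothetical helper: index arr with the outer product of the given
    index sequences. Pass None for an axis to keep all of it."""
    full = [np.arange(n) if idx is None else idx
            for idx, n in zip(indices, arr.shape)]
    return arr[np.ix_(*full)]

A = np.arange(125).reshape(5, 5, 5)
out = select_outer(A, None, [1, 2], [3, 4])
print(out.shape)                                # (5, 2, 2)

# Non-consecutive indices work as well.
print(select_outer(A, None, [0, 2, 3], [4]).shape)   # (5, 3, 1)
```

Because it builds on np.ix_, the helper returns a copy, just like the broadcasting approach.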

1 Comment

This is the answer I've been looking for! Thank you. I definitely think this should be the accepted answer, as it is the most generally applicable.

I think there are three problems with your approach:

  1. Both second and third should be slices
  2. Since the 'to' index is exclusive, they should go from 1 to 3 and from 3 to 5
  3. Instead of A[:][second][third], you should use A[:,second,third]

Try this:

>>> np.random.seed(1145)
>>> A = np.random.random((5,5,5))
>>> second = slice(1,3)
>>> third = slice(3,5)
>>> A[:,second,third].shape
(5, 2, 2)
>>> A[:,second,third].flatten()
array([ 0.43285482,  0.80820122,  0.64878266,  0.62689481,  0.01298507,
        0.42112921,  0.23104051,  0.34601169,  0.24838564,  0.66162209,
        0.96115751,  0.07338851,  0.33109539,  0.55168356,  0.33925748,
        0.2353348 ,  0.91254398,  0.44692211,  0.60975602,  0.64610556])

1 Comment

On closer inspection, I think I misunderstood the question: second and third are not supposed to be ranges, but you want exactly those indices -- a bit misleading, using consecutive indices, though. I'll still leave this as an answer here, for completeness.
