Partition and sorting of a numpy array

Question

>> arr = [10, 11, 4, 3, 5, 7, 9, 2, 13]
>> np.partition(np.array(arr), -3)

array([ 9,  5,  4,  3,  2,  7, 10, 11, 13])

>> np.sort(np.partition(np.array(arr), -3)[-4:])

array([ 7, 10, 11, 13])

>> np.argpartition(np.array(arr), -3)

array([6, 4, 2, 3, 7, 5, 0, 1, 8], dtype=int64)

>> np.sort(np.argpartition(np.array(arr), -3)[-4:])

array([0, 1, 5, 8], dtype=int64)

What is going on in this code? I have gone through the documentation but could not follow what happens to the actual numbers.

Have you tested the code piece by piece, starting with the innermost expression (which is evaluated first)? Try to understand the individual steps first. Better yet, add that analysis to your question, displaying the result at each step.
– hpaulj, Commented Nov 24, 2021 at 4:50
It would be better to use np.sort(np.partition(np.array(arr), -3)[-3:]), since the partition doesn't promise what the -4 term will be. Note that the elements of arr at indices [0, 1, 5, 8] are [10, 11, 7, 13]: the same numbers as the earlier sort/partition, but in a different order. I don't think this last sort does anything useful.
– hpaulj, Commented Nov 24, 2021 at 9:07

Valdi_Bo · Accepted Answer · 2021-11-24 07:58:35Z

Naming a plain Python list arr is misleading: at this point it is just a list, and the arrays are only created later.

To see what is going on, it helps to split the code into steps and save each intermediate result in its own variable. This is how I have rewritten your code.

So let's start from:

lst = [10, 11, 4, 3, 5, 7, 9, 2, 13]

The second step is to create an array from this list:

arr1 = np.array(lst)

I named this array (and the ones that follow) "arr" followed by a consecutive number.

The third step is to partition arr1, placing the "threshold" element at the third position from the end:

arr2 = np.partition(arr1, -3)

The result is:

array([ 9,  5,  4,  3,  2,  7, 10, 11, 13])

Details:

The "threshold" element (10) is located at the third position from the end, which is where it would be in a fully sorted array.
All preceding elements are smaller than the threshold.
All following elements are greater than or equal to the threshold.
Nothing is guaranteed about the order of the elements on either side of the "threshold" element.
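
The properties listed above can be checked directly. The following is a minimal sketch, not part of the original code; the names pos and threshold are introduced here only for illustration. It avoids relying on the exact order of the output, which NumPy does not guarantee.

```python
import numpy as np

lst = [10, 11, 4, 3, 5, 7, 9, 2, 13]
arr1 = np.array(lst)
arr2 = np.partition(arr1, -3)

# Position of the threshold counted from the start (hypothetical helper name).
pos = len(arr1) - 3
threshold = arr2[pos]

# The threshold is the value a full sort would put at that position.
assert threshold == np.sort(arr1)[pos]
# Everything before it is not larger, everything after it is not smaller.
assert np.all(arr2[:pos] <= threshold)
assert np.all(arr2[pos + 1:] >= threshold)

print(threshold)  # 10
```

Only these three properties are promised; the arrangement inside the two groups may differ from the output shown here.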

Then you take the last 4 elements of arr2:

arr3 = arr2[-4:]

No surprise, the result is:

array([ 7, 10, 11, 13])

The next step is to sort them:

arr4 = np.sort(arr3)

This time nothing changes: arr3 happens to be in ascending order already, so arr4 has the same content as arr3.
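
Note that the fourth element from the end (7) is not the fourth-largest value of arr1 (that would be 9): partitioning at -3 only fixes the third position from the end. As a minimal sketch, assuming the goal is the n largest values in ascending order, slice exactly as many elements as the partition index covers. The names top3 and top4 are illustrative only.

```python
import numpy as np

arr1 = np.array([10, 11, 4, 3, 5, 7, 9, 2, 13])

# Partitioning at -3 guarantees only that the last 3 slots hold the 3 largest values.
top3 = np.sort(np.partition(arr1, -3)[-3:])
print(top3.tolist())  # [10, 11, 13]

# For the 4 largest values, partition at -4 and slice the last 4.
top4 = np.sort(np.partition(arr1, -4)[-4:])
print(top4.tolist())  # [9, 10, 11, 13]
```

Here the final np.sort is useful, because the elements after the threshold come back in no particular order.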

That completes the experiments with partition. The second part experiments with argpartition:

arr5 = np.argpartition(arr1, -3)

The result is:

array([6, 4, 2, 3, 7, 5, 0, 1, 8], dtype=int64)

It is an array of indices to arr1.

Details:

The third element from the end (0) is the index of the "threshold" element in arr1 (its value is 10).
All preceding elements are indices of smaller elements (in arr1).
All following elements are indices of elements greater than or equal to the threshold (in arr1).
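
The relationship between argpartition and partition can be sketched as follows. This is an illustrative check rather than part of the original code, and the names pos and partitioned are assumptions made for the example.

```python
import numpy as np

arr1 = np.array([10, 11, 4, 3, 5, 7, 9, 2, 13])
arr5 = np.argpartition(arr1, -3)

# arr5 is a reordering of all valid indices into arr1.
assert sorted(arr5.tolist()) == list(range(len(arr1)))

# Indexing arr1 with arr5 yields a partitioned arrangement of the values.
partitioned = arr1[arr5]
pos = len(arr1) - 3
assert np.all(partitioned[:pos] <= partitioned[pos])
assert np.all(partitioned[pos + 1:] >= partitioned[pos])

print(arr5[pos])         # 0, the index of the threshold in arr1
print(partitioned[pos])  # 10, the threshold value itself
```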

Then you take the last 4 elements of arr5:

arr6 = arr5[-4:]

getting:

array([5, 0, 1, 8], dtype=int64)

And the last step is to sort them:

arr7 = np.sort(arr6)

getting (no surprise):

array([0, 1, 5, 8], dtype=int64)
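
Keep in mind that sorting an array of indices orders them by position, not by the values they point at: arr1 at indices [0, 1, 5, 8] is [10, 11, 7, 13]. If the aim is the indices of the largest values in ascending order of value, one possible approach (a sketch, assuming the three largest are wanted; top3_idx and ordered_idx are illustrative names) is to reorder the indices with argsort:

```python
import numpy as np

arr1 = np.array([10, 11, 4, 3, 5, 7, 9, 2, 13])

# Indices of the 3 largest values, in no particular order.
top3_idx = np.argpartition(arr1, -3)[-3:]

# Reorder those indices by the values they refer to.
ordered_idx = top3_idx[np.argsort(arr1[top3_idx])]

print(ordered_idx.tolist())        # [0, 1, 8]
print(arr1[ordered_idx].tolist())  # [10, 11, 13]
```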

That's all.
