
I realize there are already quite a few "how to sort a NumPy array" questions here, but I could not find one that covers this specific case.

I have an array similar to this:

array([[1, 0, 1],
       [0, 0, 1],
       [1, 1, 1],
       [1, 1, 0]])

I want to sort the rows lexicographically (by the first column, then the second, and so on), keeping the order of the elements within each row unchanged. So I expect the following output:

array([[0, 0, 1],
       [1, 0, 1],
       [1, 1, 0],
       [1, 1, 1]])
  • a[np.lexsort(a.T[::-1])] Commented Mar 11, 2019 at 18:39
  • Correct! Thank you! Commented Mar 11, 2019 at 18:43
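
The one-liner from the comment above can be unpacked as follows. This is a minimal sketch, assuming `a` is the array from the question; the names `keys`, `order`, and `sorted_rows` are introduced here only for illustration. The detail that matters is that `np.lexsort` treats the last key as the primary one, which is why the transposed array is reversed.

```python
import numpy as np

a = np.array([[1, 0, 1],
              [0, 0, 1],
              [1, 1, 1],
              [1, 1, 0]])

# np.lexsort sorts by the LAST key first, so the columns are passed in
# reverse order: column 0 ends up as the most significant key.
keys = a.T[::-1]
order = np.lexsort(keys)      # row indices in sorted order
sorted_rows = a[order]        # rows reordered, contents of each row untouched

print(order)
print(sorted_rows)
```

Because `lexsort` compares the actual column values, this form should also work when the entries are not limited to 0 and 1.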

1 Answer


You can use dot and argsort:

a[a.dot(2**np.arange(a.shape[1])[::-1]).argsort()]
# array([[0, 0, 1],
#        [1, 0, 1],
#        [1, 1, 0],
#        [1, 1, 1]])

The idea is to convert the rows into integers.

a.dot(2**np.arange(a.shape[1])[::-1])
# array([5, 1, 7, 6])
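
To make the conversion step concrete, here is a small sketch that splits the expression into named pieces. The names `weights` and `row_keys` are not from the answer; they are introduced here for illustration, and the array is assumed to contain only 0s and 1s.

```python
import numpy as np

a = np.array([[1, 0, 1],
              [0, 0, 1],
              [1, 1, 1],
              [1, 1, 0]])

n_cols = a.shape[1]

# Powers of two in descending order, so the first column is the most
# significant bit: for 3 columns this is [4, 2, 1].
weights = 2 ** np.arange(n_cols)[::-1]

# Each row is read as a binary number, e.g. [1, 0, 1] -> 1*4 + 0*2 + 1*1 = 5.
row_keys = a.dot(weights)

print(weights)
print(row_keys)
```

Comparing these integers gives the same order as comparing the rows element by element from left to right, which is what makes the key usable for sorting.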

Then, find the sorted indices and use that to reorder a:

a.dot(2**np.arange(a.shape[1])[::-1]).argsort()
# array([1, 0, 3, 2])

In my tests on a 4000-row version of the array, this is slightly faster than lexsort:

a = a.repeat(1000, axis=0)

%timeit a[np.lexsort(a.T[::-1])]
%timeit a[a.dot(2**np.arange(a.shape[1])[::-1]).argsort()]

230 µs ± 18.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
192 µs ± 4.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Verify correctness:

np.array_equal(a[a.dot(2**np.arange(a.shape[1])[::-1]).argsort()], 
               a[np.lexsort(a.T[::-1])])
# True
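
One caveat worth spelling out: the dot-product key relies on every entry being 0 or 1, and on the weighted sum fitting into a 64-bit integer. The sketch below is a hypothetical helper (`sort_rows` is not part of NumPy or of the answer) that uses the integer key when both conditions hold and otherwise falls back to `np.lexsort`. The limit of 63 columns is an assumption based on signed 64-bit keys.

```python
import numpy as np

def sort_rows(a):
    """Hypothetical helper: return the rows of a 2-D array in lexicographic order."""
    a = np.asarray(a)
    n_cols = a.shape[1]

    # The integer-key trick is only valid for 0/1 entries, and the largest
    # possible key (2**n_cols - 1) must fit into a signed 64-bit integer.
    is_binary = np.isin(a, (0, 1)).all()
    if is_binary and n_cols <= 63:
        weights = 2 ** np.arange(n_cols, dtype=np.int64)[::-1]
        order = a.astype(np.int64).dot(weights).argsort(kind="stable")
    else:
        # General fallback: compares the column values directly.
        order = np.lexsort(a.T[::-1])
    return a[order]

print(sort_rows([[1, 0, 1], [0, 0, 1], [1, 1, 1], [1, 1, 0]]))
print(sort_rows([[2, 0], [1, 5], [1, 3]]))
```

Whether the extra check pays off depends on the data; for small arrays the fallback alone is probably simpler.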

7 Comments

np.packbits is probably a bit faster than doing it manually.
@PaulPanzer Thanks, I initially considered a[np.packbits(a, axis=1).ravel().argsort()]; however, packbits does not return the same result as the dot version, so I cannot confirm that it is correct. Any idea why the results differ? Is it the dtype?
@PaulPanzer Okay, reading the documentation, I see packbits pads zeros at the end, not the start (weird). So this means that the relative ordering should remain the same. Let me update.
Ok, I think to make it general (meaning up to 64 columns) you'll probably need something like raw = np.packbits(a, axis=1); out = np.zeros((a.shape[0], 1), 'u8'); N = raw.shape[1]; idx = np.s_[:, 7:7-N if N<8 else None:-1] if sys.byteorder=='little' else np.s_[:, :N]; out.view('u1')[idx] = raw and then use out with argsort. Not sure this will still be faster than yours. Feel free to use this. I won't post it myself. No point in needlessly annoying a prospective mod, is there ;-)) Speaking of which, how is the campaign going?
@PaulPanzer Let me spend a good amount of time digesting that before I decide to use it :) Also, I consider your comments far from annoying, quite the polar opposite in fact—please continue to enlighten us with your wealth of NumPy knowledge, lord knows we need more of it in this tag. Re:elections, there's a lot of strong candidates this time. I don't think I'll win but I didn't expect to win from the beginning anyway. The support has been overwhelming and I'm happy to have come this far. The reason I decided to run this time is because I decided to become the change I wanted to see. :D
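
The packbits idea discussed in the comments above can be sketched for the simple case of at most 8 columns, where each row packs into a single byte. This is an illustration under that assumption, not the general version described in the thread; the `uint8` dtype and the names `packed` and `order` are choices made here.

```python
import numpy as np

a = np.array([[1, 0, 1],
              [0, 0, 1],
              [1, 1, 1],
              [1, 1, 0]], dtype=np.uint8)

# packbits pads each row with zeros at the END of the byte, so [1, 0, 1]
# becomes 0b10100000 = 160 rather than 5. The values differ from the dot
# version, but their relative order is the same.
packed = np.packbits(a, axis=1)       # shape (4, 1) for up to 8 columns
order = packed.ravel().argsort()

print(packed.ravel())
print(a[order])
```

For more than 8 columns, `packed` has several byte columns per row and `ravel()` no longer yields one key per row, so this short form does not apply as written.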