83

is there a builtin function of Python that does on python.array what argsort() does on a numpy.array?

0

4 Answers 4

113

There is no built-in function, but it's easy to assemble one out of the terrific tools Python makes available:

def argsort(seq):
    # http://stackoverflow.com/questions/3071415/efficient-method-to-calculate-the-rank-vector-of-a-list-in-python
    return sorted(range(len(seq)), key=seq.__getitem__)

x = [5,2,1,10]

print(argsort(x))
# [2, 1, 0, 3]

It works on Python array.arrays the same way:

import array
x = array.array('d', [5, 2, 1, 10])
print(argsort(x))
# [2, 1, 0, 3]
Sign up to request clarification or add additional context in comments.

5 Comments

Instead of using the (theoretically private) getitem, you can also use operator.itemgetter / operator.attrgetter docs.python.org/library/operator.html
If operator.itemgetter could be used as a drop-in replacement for __getitem__, I think I'd agreed with you Ender, but as far as I can see, operator.itemgetter would also require wrapping it in a lambda expression. I'd rather avoid the extra lambda if I could.
@Ender: itemgetter is no use here: x.__getitem__(i) returns x[i], whereas itemgetter(x)(i) will return i[x].
In my opinion, key=lambda i: seq[i] might be easier to understand.
agreed with comment above (key=lambda i: seq[i]) might be easier to read- but still great!
86

I timed the suggestions above and here are my results.

import timeit
import random
import numpy as np

def f(seq):
    # http://stackoverflow.com/questions/3382352/equivalent-of-numpy-argsort-in-basic-python/3383106#3383106
    #non-lambda version by Tony Veijalainen
    return [i for (v, i) in sorted((v, i) for (i, v) in enumerate(seq))]

def g(seq):
    # http://stackoverflow.com/questions/3382352/equivalent-of-numpy-argsort-in-basic-python/3383106#3383106
    #lambda version by Tony Veijalainen
    return [x for x,y in sorted(enumerate(seq), key = lambda x: x[1])]


def h(seq):
    #http://stackoverflow.com/questions/3382352/equivalent-of-numpy-argsort-in-basic-python/3382369#3382369
    #by unutbu
    return sorted(range(len(seq)), key=seq.__getitem__)


seq = list(range(10000))
random.shuffle(seq)

n_trials = 100
for cmd in [
        'f(seq)', 'g(seq)', 'h(seq)', 'np.argsort(seq)',
        'np.argsort(seq).tolist()'
        ]:
    t = timeit.Timer(cmd, globals={**globals(), **locals()})
    print('time for {:d}x {:}: {:.6f}'.format(n_trials, cmd, t.timeit(n_trials)))

output

time for 100x f(seq): 0.323915
time for 100x g(seq): 0.235183
time for 100x h(seq): 0.132787
time for 100x np.argsort(seq): 0.091086
time for 100x np.argsort(seq).tolist(): 0.104226

A problem size dependent analysis is given here.

3 Comments

Interesting - probably the average is more important than the 'best' of 3(?)
The average is affected by outliers. You do not want the results be polluted by other programs running or hardware cache misses happenstances.
For future readers, %timeit is reporting the best average from 3 averages of 100 loops each.
9

My alternative with enumerate:

def argsort(seq):
    return [x for x,y in sorted(enumerate(seq), key = lambda x: x[1])]

seq=[5,2,1,10]
print(argsort(seq))
# Output:
# [2, 1, 0, 3]

Better though to use answer from https://stackoverflow.com/users/9990/marcelo-cantos answer to thread python sort without lambda expressions

[i for (v, i) in sorted((v, i) for (i, v) in enumerate(seq))]

Comments

5

Found this question, but needed argsort for a list of objects based on an object property.

Extending unutbu's answer, this would be:

sorted(range(len(seq)), key = lambda x: seq[x].sort_property)

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.