I have the data of the form:

#---------------------
# Data
#---------------------
p   q   r
2   8   14
2   9   22
1   5   19
2   7   19
3   11  13
2   7   20
1   4   15
3   12  17
1   4   14
1   5   20
2   7   17
3   10  13
3   11  20
3   11  14
1   6   18
3   12  16
2   9   21
3   10  19
2   8   13
1   6   22
1   4   13
2   8   15
3   12  15
3   10  16
2   9   16
1   5   16
1   6   21

Now I need to sort this data using NumPy in the following manner:

  1. Ascending order for column p.
  2. Ascending order for column q.
  3. Descending order for column r.

I used the following code, but it does not sort correctly:

import numpy as np

data = open('data.dat', "r")
line = data.readline()
while line.startswith('#'):
    line = data.readline()
data_header = line.split("\t")
data_header[-1] = data_header[-1].strip()

_data_ = np.genfromtxt(data, comments='#', delimiter='\t', names = data_header, dtype = None, unpack = True).transpose()            # Read space-separated values in engine data file.
sorted_index =  np.lexsort((_data_['r'][::-1], _data_['q'], _data_['p']))
_data_ =  _data_[sorted_index]
print (_data_)

Output:

1   4   15
1   4   14
1   4   13
1   5   19
1   5   20
1   5   16
1   6   21
1   6   22
1   6   18
2   7   20
2   7   19
2   7   17
2   8   13
2   8   15
2   8   14
2   9   22
2   9   21
2   9   16
3   10  13
3   10  16
3   10  19
3   11  14
3   11  13
3   11  20
3   12  16
3   12  15
3   12  17

What could possibly be wrong with this sorting method?

2 Answers

1

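Actually, you were almost there! _data_['r'][::-1] reverses the order of the elements in the r column, so its values no longer line up with the rows of p and q. Negate the column instead (a minus sign in place of the [::-1]) and it works: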

sorted_index =  np.lexsort((-_data_['r'], _data_['q'], _data_['p']))
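To make the difference concrete, here is a minimal sketch on five of the rows from the question. The plain arrays p, q and r stand in for the structured-array columns, and the names wrong and right are only illustrative:

```python
import numpy as np

# Five rows taken from the question's data; each index is one row (p, q, r).
p = np.array([2, 1, 2, 1, 2])
q = np.array([8, 5, 8, 5, 8])
r = np.array([14, 19, 13, 20, 15])

# r[::-1] moves the values to other rows, so the third key no longer
# describes the row it is compared for.
wrong = np.lexsort((r[::-1], q, p))

# -r keeps every value in its own row and only flips the comparison order.
right = np.lexsort((-r, q, p))

rows_wrong = list(zip(p[wrong].tolist(), q[wrong].tolist(), r[wrong].tolist()))
rows_right = list(zip(p[right].tolist(), q[right].tolist(), r[right].tolist()))

print(rows_wrong)
print(rows_right)
```

With the negated key, r is descending inside every (p, q) group; with the reversed key, the order inside a group depends on values that belong to other rows.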

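More generally, the minus sign is a bit of a hack, but it works as long as the columns you want in descending order hold signed numeric values (it cannot be applied to strings, and negating unsigned integers wraps around). The same idea for any list of keys and directions: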

# 2D array will be sorted first by p, then by q (if p is the same), then by r
sortkeys = ['p','q','r']
# 1 is ascending/forward sort, -1 is descending/reverse sort
sortdirection = [1,1,-1]
# np.lexsort treats the LAST key as the primary one, so both lists are reversed
ind = np.lexsort(tuple([(_data_[skey])*sdir for skey,sdir in
                                     zip(sortkeys[::-1],sortdirection[::-1])]))
_data_ = _data_[ind]
for i in _data_:
    print(i)

Output:

(1, 4, 15)
(1, 4, 14)
(1, 4, 13)
(1, 5, 20)
(1, 5, 19)
(1, 5, 16)
(1, 6, 22)
(1, 6, 21)
(1, 6, 18)
(2, 7, 20)
(2, 7, 19)
(2, 7, 17)
(2, 8, 15)
(2, 8, 14)
(2, 8, 13)
(2, 9, 22)
(2, 9, 21)
(2, 9, 16)
(3, 10, 19)
(3, 10, 16)
(3, 10, 13)
(3, 11, 20)
(3, 11, 14)
(3, 11, 13)
(3, 12, 17)
(3, 12, 16)
(3, 12, 15)
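If a descending column is not numeric, the minus sign cannot be applied directly. One possible workaround, sketched here under the assumption that replacing values by their integer ranks is acceptable, is to negate the ranks returned by np.unique instead. The arrays names and tags are made-up illustration data, not part of the question:

```python
import numpy as np

# Hypothetical string columns: sort names ascending, tags descending.
names = np.array(["b", "a", "b", "a"])
tags = np.array(["x", "y", "z", "w"])

# return_inverse gives, for each element, its index in the sorted unique
# values, which serves as an integer rank that can be negated.
_, tag_rank = np.unique(tags, return_inverse=True)

# The last key (names) is the primary key; -tag_rank sorts tags descending.
order = np.lexsort((-tag_rank, names))
result = list(zip(names[order].tolist(), tags[order].tolist()))
print(result)
```

The ranks preserve the ordering of the original values, so negating them reverses that ordering without requiring arithmetic on the values themselves.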

0

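The problem is that _data_['r'][::-1] reverses the values of the r column on their own, so they no longer correspond to the rows of p and q, and np.lexsort ends up sorting on mismatched keys. A workaround is to sort in two steps, although it is less elegant: first order the rows by r descending, then sort the result by p and q. Because np.lexsort is a stable sort, rows with equal p and q keep their descending r order.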

pre_sort_index = np.lexsort((_data_['r'],), axis=0)[::-1]
sorted_index =  np.lexsort((_data_[pre_sort_index]['q'], _data_[pre_sort_index]['p']))
_data_ =  _data_[pre_sort_index][sorted_index]
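The same two-step idea can be sketched on plain arrays, composing the two index arrays so the data is indexed only once at the end. The arrays and the names by_r_desc and order are illustrative; the sketch relies on np.lexsort being documented as a stable sort:

```python
import numpy as np

# Five rows taken from the question's data; each index is one row (p, q, r).
p = np.array([2, 1, 2, 1, 2])
q = np.array([8, 5, 8, 5, 8])
r = np.array([14, 19, 13, 20, 15])

# Step 1: row indices ordered by r, largest first.
by_r_desc = np.argsort(r, kind="stable")[::-1]

# Step 2: stable sort of the pre-ordered rows by p, then q. Rows that tie
# on (p, q) keep the descending-r order from step 1.
order = by_r_desc[np.lexsort((q[by_r_desc], p[by_r_desc]))]

rows = list(zip(p[order].tolist(), q[order].tolist(), r[order].tolist()))
print(rows)
```

This avoids negating r, at the cost of a second sort pass.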
