I'm a beginner in unix shell scripting. I'm trying to sort a csv file based on two columns.

My file looks like below:

sh-4.4$ cat test.csv                                                             
603,02,0123456,1111,201806131115                                        
603,20,0123456,1111,201806131115                                                 
603,02,9876542,2222,201806131215                                                
603,20,9876542,2222,201806131215                                                 
603,02,0123456,1111,201806131117                                                 
603,20,0123456,1111,201806131117  

I want to group by the 3rd column and the 2nd column should also be ordered as shown below:

603,20,0123456,1111,201806131115
603,02,0123456,1111,201806131115
603,20,0123456,1111,201806131117
603,02,0123456,1111,201806131117
603,20,9876542,2222,201806131215
603,02,9876542,2222,201806131215

I tried sort -t',' -k3 -k2 test.csv. This groups the rows by column 3, but it does not order column 2 the way I want. Its output is below.

603,02,0123456,1111,201806131115                                             
603,20,0123456,1111,201806131115              
603,02,0123456,1111,201806131117                 
603,20,0123456,1111,201806131117                 
603,02,9876542,2222,201806131215                 
603,20,9876542,2222,201806131215

I also tried sort -t',' -k3 -rk2 test.csv. This orders column 2 as I want, but column 3 is no longer sorted as I expected. Its output is below.

603,20,9876542,2222,201806131215                                                                                                          
603,02,9876542,2222,201806131215                                                                                                          
603,20,0123456,1111,201806131117                                                                                                          
603,02,0123456,1111,201806131117                                                                                                          
603,20,0123456,1111,201806131115                                                                                                          
603,02,0123456,1111,201806131115

Any help is much appreciated. Suggestions for sorting with awk are also welcome.

2 Answers

2

Restrict each sort key to a single field:

$ sort -t, -k3,3 -k2,2 file

This should do it.

Note, however, that the output you want doesn't match the spec you describe. With this command you'll get

603,02,0123456,1111,201806131115
603,02,0123456,1111,201806131117
603,20,0123456,1111,201806131115
603,20,0123456,1111,201806131117
603,02,9876542,2222,201806131215
603,20,9876542,2222,201806131215

This is grouped by the third field and, within each group, sorted by the second field.

Perhaps this is what you wanted?

$ sort -t, -k3 -k2,2r file

603,20,0123456,1111,201806131115
603,02,0123456,1111,201806131115
603,20,0123456,1111,201806131117
603,02,0123456,1111,201806131117
603,20,9876542,2222,201806131215
603,02,9876542,2222,201806131215

Note that -k3 means the key starts at the 3rd field and runs to the end of the line, which seems to be what you want, judging by the order of the last fields. The second key, -k2,2r, then orders rows whose first keys are equal by the 2nd field in reverse.
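To make the difference between a bounded key (-k3,3) and an open-ended key (-k3) concrete, here is a small sketch that runs both commands on the sample data from the question. The temporary file and the variable names are only for illustration, and LC_ALL=C is an assumption added so that the comparison is byte-wise regardless of the local locale.

```shell
# Recreate the sample data from the question in a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
603,02,0123456,1111,201806131115
603,20,0123456,1111,201806131115
603,02,9876542,2222,201806131215
603,20,9876542,2222,201806131215
603,02,0123456,1111,201806131117
603,20,0123456,1111,201806131117
EOF

# Bounded keys: key 1 is field 3 only, key 2 is field 2 only.
# Rows are grouped by field 3, then ordered by field 2 inside each group.
bounded=$(LC_ALL=C sort -t, -k3,3 -k2,2 "$tmp")

# Open-ended first key: it runs from field 3 to the end of the line, so the
# timestamp in field 5 is compared before field 2 (reversed) is consulted.
open_ended=$(LC_ALL=C sort -t, -k3 -k2,2r "$tmp")

printf 'bounded (-k3,3 -k2,2):\n%s\n\n' "$bounded"
printf 'open-ended (-k3 -k2,2r):\n%s\n' "$open_ended"

rm -f "$tmp"
```

With the bounded keys, both 02 rows of a group come before the 20 rows. With the open-ended key, rows are first separated by timestamp, and the reversed second field only decides the order within each timestamp.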

NB: If your numeric fields are not zero-padded, you may want to add the -n option to request numerical rather than lexical ordering. Here it makes no difference.
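As a rough illustration of the lexical versus numerical point, the sketch below sorts a few made-up rows whose second field is not zero-padded. The data is hypothetical and not taken from the question, and LC_ALL=C is again an assumption to keep the lexical comparison byte-wise.

```shell
# Hypothetical rows: the second field is NOT zero-padded.
data='a,10
a,9
a,100'

# Lexical comparison of field 2: "10" < "100" < "9".
lexical=$(printf '%s\n' "$data" | LC_ALL=C sort -t, -k2,2)

# Numerical comparison of field 2 (n modifier on the key): 9 < 10 < 100.
numeric=$(printf '%s\n' "$data" | LC_ALL=C sort -t, -k2,2n)

printf 'lexical:\n%s\n\n' "$lexical"
printf 'numeric:\n%s\n' "$numeric"
```

Attaching n to the key (-k2,2n) limits numerical comparison to that field, whereas a global -n applies to every key that has no modifiers of its own.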

3 Comments

Sorry my spec was confusing. I basically want my output as shown above.
Hi, thanks for this. It worked fine when I tried it in an online Unix terminal for testing. Unfortunately, it is not working in my own Unix terminal, which is on SunOS. I suspect the command is failing because of SunOS!
When I run the command, I only get this output: :(

603,02,0123456,1111,201806131115
603,20,0123456,1111,201806131115
603,02,0123456,1111,201806131117
603,20,0123456,1111,201806131117
603,02,9876542,2222,201806131215
603,20,9876542,2222,201806131215

1

sort works on CSV and plain-text files alike and prints its output to the console.

-t '|' says the columns are delimited by '|'. -k1,1 -k2,2 says to sort the data by column 1 and then by column 2. (A bare -k1 would make the key run from column 1 to the end of the line, so a following -k2 would almost never be consulted.)

$ sort -t '|' -k1,1 -k2,2 <INPUT_FILE>

To store the result in an output file, use the following command:

$ sort -t '|' -k1,1 -k2,2 <INPUT_FILE> -o <OUTPUTFILE>

If you want to sort while keeping the header line in place, use the following command:

(head -n1 INPUT_FILE && sort <(tail -n+2 INPUT_FILE)) > OUTPUT_FILE

head -n1 INPUT_FILE prints only the first line of your file, i.e. the header.

tail -n+2 INPUT_FILE outputs the file from the second line up to EOF. The <(...) wrapper is process substitution, which needs a shell such as bash.
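For shells without process substitution, the same header-preserving idea can be sketched with a plain pipe. The file contents and the '|' delimiter below are made up for illustration, and the single key -k1,1 is just one possible choice.

```shell
# Hypothetical pipe-delimited file with a header line.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
id|name
3|carol
1|alice
2|bob
EOF

# Emit the header untouched, then sort only the remaining lines by column 1.
result=$(head -n 1 "$tmp"; tail -n +2 "$tmp" | LC_ALL=C sort -t '|' -k1,1)

printf '%s\n' "$result"

rm -f "$tmp"
```

Because the header never passes through sort, it stays on the first line regardless of how its text would compare with the data rows.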
