I am trying to write a Python script to manipulate an Excel spreadsheet.
Suppose I have the following sample data:
Gene chrom strand TSS TES Name
NM_145215 chr5 + 135485168 135488045 Abhd11
NM_1190437 chr5 + 135485021 135488045 Abhd11
NM_1205181 chr14 + 54873803 54888844 Abhd4
NM_134076 chr14 + 54878906 54888844 Abhd4
NM_9594 chr2 + 31615464 31659747 Abl1
NM_1112703 chr2 + 31544075 31659747 Abl1
NM_207624 chr11 + 105829258 105851278 Abl1
NM_9598 chr11 + 105836521 105851278 Ace2
NM_1130513 chrX + 160577273 160626350 Ace2
NM_27286 chrX + 160578411 160626350 Ace2
For rows that share the same name (column 6), I want to retrieve the whole row with the smallest TSS. For example, for the first two rows (name Abhd11), I want to save the second row in my result, since its TSS 135485021 < 135485168. The same should apply to every set of rows with the same name.
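To make the goal concrete, here is a minimal sketch of the selection I have in mind. It assumes the sheet has been exported as whitespace-separated text with the header shown above; the names SAMPLE and rows_with_min_tss are only illustrative, and rows are grouped by Name alone, ignoring the chromosome.

```python
# Sample rows copied from above, assumed to be whitespace-separated text
# (for example, exported from the spreadsheet); the real layout may differ.
SAMPLE = """\
Gene chrom strand TSS TES Name
NM_145215 chr5 + 135485168 135488045 Abhd11
NM_1190437 chr5 + 135485021 135488045 Abhd11
NM_1205181 chr14 + 54873803 54888844 Abhd4
NM_134076 chr14 + 54878906 54888844 Abhd4
NM_9594 chr2 + 31615464 31659747 Abl1
NM_1112703 chr2 + 31544075 31659747 Abl1
NM_207624 chr11 + 105829258 105851278 Abl1
NM_9598 chr11 + 105836521 105851278 Ace2
NM_1130513 chrX + 160577273 160626350 Ace2
NM_27286 chrX + 160578411 160626350 Ace2
"""


def rows_with_min_tss(text):
    """Return (header, rows), keeping one row per Name: the one with the smallest TSS."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    tss_idx = header.index("TSS")
    name_idx = header.index("Name")

    best = {}  # Name -> row with the smallest TSS seen so far
    for line in lines[1:]:
        row = line.split()
        name = row[name_idx]
        # Compare TSS as integers, not strings, so the ordering is numeric.
        if name not in best or int(row[tss_idx]) < int(best[name][tss_idx]):
            best[name] = row
    return header, list(best.values())


header, result = rows_with_min_tss(SAMPLE)
print("\t".join(header))
for row in result:
    print("\t".join(row))
```

If pandas is available, the same selection can probably be written as df.loc[df.groupby("Name")["TSS"].idxmin()] after loading the sheet into a DataFrame, but I have not tried that on the real file.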
Any ideas and comments are appreciated.