I have a file with strings like the following:
NM_???? chr12 - 10 110 10 110 3 10,50,100, 20,60,110,
I am interested in the last two columns, the first being a comma-separated list of exon starts and the last being a comma-separated list of exon ends.
That said, I have done the following:
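with open(infile, 'r') as fp:
    for line in fp:
        tokens = line.split()
        # Drop the trailing comma, then convert each coordinate to int
        exonstarts = [int(x) for x in tokens[8][:-1].split(',')]
        exonends = [int(x) for x in tokens[9][:-1].split(',')]
        zipped = list(zip(exonstarts, exonends))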
Now I have a list that looks like this:
[(10, 20), (50, 60), (100, 110)]
I have another problem: I have a string that I want these pieces of. For example, I would want chr_string[10:20] + chr_string[50:60] + chr_string[100:110]. Is there an easy way to express this?
Do you want [10:20] or [10:21]? The stop index of a slice is non-inclusive, so chr_string[10:20] covers positions 10 through 19.
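One possible way to do this, as a rough sketch: slice the string once per (start, end) pair and join the pieces. The helper name join_slices and its inclusive_end flag are made up for illustration, and the chr_string below is just a stand-in for the real chromosome sequence. Whether the exon ends in the file are meant to be inclusive is something only the file format can tell you, so the flag is there to cover both cases.

```python
def join_slices(sequence, intervals, inclusive_end=False):
    """Concatenate sequence[start:stop] for each (start, stop) pair.

    If inclusive_end is True, the character at index `stop` is kept too,
    which means slicing with stop + 1.
    """
    offset = 1 if inclusive_end else 0
    return ''.join(sequence[start:stop + offset] for start, stop in intervals)


# Stand-in data (an assumption): a 120-character string instead of a real chromosome.
chr_string = ''.join(chr(ord('a') + i % 26) for i in range(120))
zipped = [(10, 20), (50, 60), (100, 110)]

spliced = join_slices(chr_string, zipped)

# Same result as writing the slices out by hand.
assert spliced == chr_string[10:20] + chr_string[50:60] + chr_string[100:110]
assert len(spliced) == 30  # three half-open slices of 10 characters each
```

Using ''.join over a generator avoids building up the result with repeated + concatenation, which matters if there are many exons or the pieces are long.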