I have the following lines in a file. Here is an example of one line:

NM_???? chr12 - 10 110 10 110 3 10,50,100, 20,60,110,

I have the following code to get the info out:

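with open(infile, 'r') as fp:
    for line in fp:
        tokens = line.split()
        # [:-1] drops the trailing comma; int() turns the strings into numbers
        exonstarts = [int(x) for x in tokens[8][:-1].split(',')]
        exonends = [int(x) for x in tokens[9][:-1].split(',')]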

This gives me lists like these:

exonstarts = [10,50,100]
exonends = [20,60,110]

This line has 3 exons (although other lines in the file may have more or fewer than 3, so this must work for any number of exons), and they span:

 10-20
 50-60
 100-110

So for each number in the start list there is one in the end list. The first exon starts at exonstarts[0] and ends at exonends[0], the second starts at exonstarts[1] and ends at exonends[1], and so on.

How do I write the rest of this code so it pairs up the elements as such?


Update:

From this:

tokens = line.split()
exonstarts = [int(x) for x in tokens[8][:-1].split(',')]
exonends = [int(x) for x in tokens[9][:-1].split(',')]
zipped = list(zip(exonstarts, exonends))

I have another problem: I have a string that I want these pieces of. For example, I would want chr_string[10:20]+chr_string[50:60]+chr_string[100:110]. Is there an easy way to express this for any number of exons?

  • Sorry for the noob programming question. I just really need some help for the time crunch I have found myself in Commented Apr 28, 2012 at 0:02
  • No need to apologize Patrick, that's what the site is here for :) Commented Apr 28, 2012 at 0:05
  • @PatrickCampbell: In general, it's preferred that you open a new question for followups like that. Commented Apr 28, 2012 at 0:36
  • About your second question: try running that code. It should work. Commented Apr 28, 2012 at 0:36
  • Oh, I know it will work, but considering I will have a variable list of numbers and length, I cannot simply write that. I need some kind of variable or something to include in them Commented Apr 28, 2012 at 0:38

3 Answers


The zip built-in is what you're looking for (in Python 3 it returns an iterator, so wrap it in list() to see the pairs):

>>> exonstarts = [10,50,100]
>>> exonends = [20,60,110]
>>> list(zip(exonstarts,exonends))
[(10, 20), (50, 60), (100, 110)]
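
As a minimal sketch tying this back to the question's input: the helper name parse_exons is hypothetical, and it assumes the whitespace-separated layout of the sample line, with exon starts in the ninth field and exon ends in the tenth, each carrying a trailing comma.

```python
def parse_exons(line):
    """Return a list of (start, end) integer pairs for one annotation line."""
    tokens = line.split()
    # Skip the empty field left by the trailing comma, then convert to int.
    starts = [int(s) for s in tokens[8].split(',') if s]
    ends = [int(e) for e in tokens[9].split(',') if e]
    # zip pairs element i of starts with element i of ends,
    # so any number of exons works.
    return list(zip(starts, ends))


line = "NM_???? chr12 - 10 110 10 110 3 10,50,100, 20,60,110,"
pairs = parse_exons(line)
print(pairs)  # [(10, 20), (50, 60), (100, 110)]
```

Converting to int here matters for the follow-up question, since slice bounds must be integers rather than strings.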


I believe you want the zip function.

In [1]: exonstarts = [10,50,100]

In [2]: exonends = [20,60,110]

In [3]: zip(exonstarts, exonends)
Out[3]: [(10, 20), (50, 60), (100, 110)]

4 Comments

Is the zip function built in? Because when I try zip(exonstarts, exonends) all I get back is <zip object at 0x0290A8C8>
The zip function is builtin unless you overwrote it. It was added in version 2.0
In python3, I believe it returns an iterator. Try list(zip(exonstarts, exonends))
zip() returns an iterator object in Python 3. Try list(zip(exonstarts, exonends)) to see the contents.
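
The point raised in these comments can be illustrated with a short sketch (Python 3 assumed; the variable names first_pass and second_pass are only for illustration). Because zip yields its pairs lazily, the iterator is exhausted after one pass, which is a reason to store the result of list() if the pairs are needed more than once.

```python
exonstarts = [10, 50, 100]
exonends = [20, 60, 110]

pairs = zip(exonstarts, exonends)  # lazy iterator in Python 3
first_pass = list(pairs)           # consumes the iterator
second_pass = list(pairs)          # nothing left to yield

print(first_pass)   # [(10, 20), (50, 60), (100, 110)]
print(second_pass)  # []
```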

You can get these pairs using zip():

>>> for t in zip(exonstarts, exonends):
...     print('%d-%d' % t)
... 
10-20
50-60
100-110

To get a list by slicing chr_string (which I have fabricated) using these pairs:

>>> [chr_string[start:end] for start,end in zip(exonstarts, exonends)]
['0506070809', '2526272829', '5051525354']

To join these together:

>>> ''.join(chr_string[start:end] for start,end in zip(exonstarts, exonends))
'050607080925262728295051525354'
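
For a self-contained version of the above, here is a sketch under stated assumptions: chr_string below is a made-up stand-in (two-digit counters '00', '01', ...) chosen because it reproduces the output shown, the function name spliced is hypothetical, and the coordinates are assumed to be 0-based and half-open, like Python slices.

```python
# Hypothetical stand-in sequence: '000102...59', 120 characters long.
chr_string = ''.join('%02d' % i for i in range(60))


def spliced(sequence, starts, ends):
    """Concatenate sequence[start:end] for each paired start and end."""
    return ''.join(sequence[s:e] for s, e in zip(starts, ends))


result = spliced(chr_string, [10, 50, 100], [20, 60, 110])
print(result)  # 050607080925262728295051525354
```

The same call works unchanged for lines with more or fewer exons, since zip simply yields as many pairs as the lists contain.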

2 Comments

So, another problem: I have a string that I want these pieces of. For example, I would want chr_string[10:20]+chr_string[50:60]+chr_string[100:110]. Is there an easy way to express this?
@PatrickCampbell: It would help if you explained the actual problem that you are trying to solve. See also: meta.stackexchange.com/questions/66377/what-is-the-xy-problem
