I'm trying to generate 200 random DNA sequences. I have a function that produces a single sequence, but I can't figure out how to produce 200 of them. Here's what I have so far:

from random import random

def randABCD(n, probA, probT, probC, probG):
    # where probA + probT + probC + probG == 1
    # n = number of characters in string
    # probX = probability of character X
    # cA, cT, cC are cumulative thresholds; 'G' takes whatever remains
    cA = probA
    cT = cA + probT
    cC = cT + probC
    def choose():
        r = random()
        if r < cA:
            return 'A'
        elif r < cT:
            return 'T'
        elif r < cC:
            return 'C'
        else:
            return 'G'
    return ''.join([choose() for i in xrange(n)])
  • You want to do a thing 200 times. You have a function that does the thing once. Do you know what control flow tool to use to do a thing a certain number of times? Commented Apr 5, 2015 at 2:39

1 Answer

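This code reuses your function to generate 200 sequences of length 10. The list comprehension in the last line calls the function 200 times and collects the results in a list. Note that I replaced from random import random with import random and changed the call random() to random.random(). I also changed xrange to range because I tested the code with Python 3.x.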

Let me know if you had something different in mind for the output.

import random

def randABCD(n, probA, probT, probC, probG):
    # where probA + probT + probC + probG == 1
    # n = number of characters in string
    # probX = probability of character X
    # cA, cT, cC are cumulative thresholds; 'G' takes whatever remains
    cA = probA
    cT = cA + probT
    cC = cT + probC
    def choose():
        r = random.random()
        if r < cA:
            return 'A'
        elif r < cT:
            return 'T'
        elif r < cC:
            return 'C'
        else:
            return 'G'
    return ''.join([choose() for i in range(n)])

print(randABCD(10, .25, .25, .25, .25))

print([randABCD(10, .25, .25, .25, .25) for i in range(200)])

Output

from first print call AACCGTCTCT

from second print call ['CCAGTTCGGA', 'ACGGGAAAGT', 'CGTGGTAAGT', 'AACGATTGAG', 'GAGGATATGC', 'AGTGCCTTGT', 'TGACTTGCAC', 'GAAGGAGGCA', 'TCCGGTAGTT', 'TCTGCCGTCG', 'TACATAAGTC', 'GCTGGTTAAC', 'CACCCAGGCC', 'CAAGAGCCAA', 'GCTATTCGAT', 'GTGCTCATCT', 'AAAGCAATAC', 'GTATGGAAAC', 'GTATTGGTAA', 'TGTAATCTTA', 'TCGAATACAT', 'TCCTCAATGG', 'TGTAACGGCA', 'TAGTCACTGT', 'CAAAGCTCAT', 'GTTGAAAGTC', 'CTATCATGAG', 'CAAGCACTAT', 'CTGGGCTGCC', 'CATGTCCAGG', 'ACGTGTGATC', 'AATATGCAAC', 'ACTGATGGAT', 'TATCGCGCGA', 'GTAGACCCAA', 'CAGGATGCAT', 'TACGGCAGAG', 'TATTTTATCA', 'GGTAATCACA', 'TAAACGTATG', 'CTTCCACGCG', 'GGCTCCAAAA', 'CAAGAATAAC', 'TCACGGTCTT', 'AGCGCGTCGA', 'TCACTATCAT', 'TCTGATGTCA', 'AGAAGGTCGT', 'TTAGCGTCTC', 'TGAGATGCGA', 'ATACCCATGC', 'ACCGCTCGAG', 'CCATCAGGCC', 'AACCTTCCCG', 'TCACTCGGGT', 'CGAGACCGGA', 'GCAAGATGAT', 'TGCAATGAGG', 'CCAGATTGGT', 'GGCGATGACA', 'TAGTATGGTT', 'GCAGGTCTCG', 'GGTTTTAACC', 'ACAACCAACT', 'CTGTTCAGTT', 'TTGGAGAGTA', 'AGTCGATCTG', 'TAATGGCAGG', 'CGTCCTTTAA', 'GCGCAACTTC', 'CGGTAGAATG', 'TCTAGCTTGC', 'GCGAAAGCGC', 'GACCCCCGGC', 'GCAATAGTCT', 'ATTGACTCCT', 'ACTAACGCTT', 'TCATAGAAGC', 'GTAGCTGCGT', 'ACAATCTCCT', 'TCGACTCTCT', 'GGCAACAGCA', 'TATTGTAGAC', 'GAGGTCAACG', 'ATGCCAGGGA', 'CTCTCTTTCT', 'CTGCGTGATA', 'GCAAGAATAC', 'AATGCATGAC', 'AGTTCAGGCA', 'GAGATTCCCC', 'CCGCCGACCA', 'ACGACGTGCA', 'TGAACGCCAA', 'GTCGGCTATT', 'TGCTTATCAA', 'AGGAGGCACG', 'TCTACTGCGA', 'GCCTTGACAT', 'GCCTCTCCCC', 'ACACCGACTG', 'ATTTAATCAT', 'GTGCAACGTC', 'GTGTGGCTAA', 'TGGCGATTAA', 'GTATGTCTCC', 'ACTTATGGGC', 'GCTACGTTTT', 'ATCCTCACGT', 'GCCGGCTACA', 'CCCGTGAAGA', 'CATGACCACT', 'GAACCTGATG', 'ACGAGTGTCA', 'ATGTTGGTTT', 'CTTGGAATGA', 'CTTTCCTCAC', 'GATGCTCTTT', 'TAATTCTAAT', 'TCTGGCAAAG', 'CCAGGCCGCG', 'TCATCGCACA', 'CAGCAAGATT', 'GCTTAGGAGG', 'ATATTGTGCG', 'AATATGACGG', 'TCTAGTCCCT', 'GTAAACCGGA', 'TTTAGCGTAC', 'CGAATAGAAC', 'TTAGAATCGG', 'CGCGCGCCTC', 'AATGTTAAGG', 'ACTCGACGCA', 'GGTTGCTTAC', 'TCTGGTGCTC', 'TAATTAGGTA', 'GCCTTAGAAG', 'GTTTATACGC', 'AGCGTCCATA', 'GTATTGTCGA', 
'CACCTCAGAA', 'CCCACCTCCG', 'AGACGCTAGA', 'CAAAGCCAGA', 'TCAAATTCAT', 'GGCCATTTGT', 'CAACATGGTA', 'CAAGTGTAAG', 'CGCCGTAACC', 'GGGCGGTAAT', 'GTCCAACCAC', 'CGAAGCGCAG', 'GGCGTCGGAG', 'CTTGTCGCGG', 'GCCCTTCTGC', 'TGCAGCCAAC', 'TGATTTGTTC', 'TGAATTCAGT', 'TAGTCCTGCT', 'GCCCTATGGG', 'CCAGGCTGTT', 'TCCTCAAAAC', 'CTACGGGCAT', 'GCCAACCGAG', 'CAATGGAACT', 'CCTTATCCTC', 'TAAAAGGCTA', 'CGCGTGACAC', 'TGGCGAGCGT', 'CCCCGAGCAT', 'CAGCATTCAA', 'TGTACTGTCC', 'TCCTTGGTTA', 'TCAAAGATGT', 'CGCATACTCA', 'GAATCTATTT', 'ATCATAAGGT', 'ACGCTCTCGC', 'GTCCTCTTAA', 'ATCCGAACCT', 'TGGACTTCCG', 'TCAGATGATA', 'ACTTCATGCG', 'TCCTATACAA', 'ATGGTCTTTA', 'CTAATTCGGT', 'TGCACCACAT', 'GTGACCGTCT', 'ACGTCAGTCA', 'TCTTCCACCT', 'CCGAATACGC', 'ACTATGTCGT', 'TACTATTCCC', 'GGGGGACGCA', 'TGCAGGTTCT', 'GGCTCTGGGG', 'TGGAGCGCTC', 'CGAGATCTTA', 'GGCTTGGCAT']
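
For comparison, here is a minimal alternative sketch, assuming Python 3.6 or later, where the standard library's random.choices does the weighted selection that the cumulative thresholds implement by hand. The names random_dna and random_dna_batch and the seed parameter are hypothetical, introduced only for illustration; they are not part of the original question or answer.

```python
import random

def random_dna(length, weights=(0.25, 0.25, 0.25, 0.25), rng=random):
    """Return one random sequence; weights follow the order A, T, C, G."""
    # random.choices draws `length` letters with replacement, using the weights.
    return ''.join(rng.choices('ATCG', weights=weights, k=length))

def random_dna_batch(count, length, weights=(0.25, 0.25, 0.25, 0.25), seed=None):
    """Return a list of `count` random sequences, each `length` characters long."""
    # A dedicated Random instance makes a run repeatable when a seed is given.
    rng = random.Random(seed)
    return [random_dna(length, weights, rng) for _ in range(count)]

sequences = random_dna_batch(200, 10, seed=42)
print(len(sequences))  # 200
```

Passing a seed is optional; it only helps when you need the same 200 sequences again, for example while debugging.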
