I have a text file (b.txt) like the following:

SOURCE   8 EXPRESSION_SYSTEM_PLASMID: PLEHP20                                   
KEYWDS   2 METHODS, IRON-SULPHUR CLUSTER, METALLOPROTEIN                        
EXPDTA    X-RAY DIFFRACTION                                                     
AUTHOR    E.PARISINI,F.CAPOZZI,P.LUBINI,V.LAMZIN,C.LUCHINAT,                    
REVDAT   4   24-FEB-09 1CKU    1       VERSN                                    
JRNL        PMID   10531472                                                     
REMARK   1 REFERENCE 1         
ATOM      1  N   SER A   1      -8.686  33.363  10.216  1.00 33.39           N  
ANISOU    1  N   SER A   1     2884   4416   5388   1179  -1154   1000       N  
HETATM 1565  O   HOH B 350       7.855  16.938  27.107  1.00 34.27           O  
ANISOU 1565  O   HOH B 350     3399   5455   4168    135   -563    -23       O  

I have Python code (see below) that prints only the lines that start with "ATOM" or "HETATM".

pdb_text = open("b.txt","r")
# read contents of the file to string
data = ""
data = pdb_text.read()
for line in data.split('\n'):
    if line.startswith("ATOM") or line.startswith("HETATM"):
        print(line)

The code works well and produces the following output:

ATOM      1  N   SER A   1      -8.686  33.363  10.216  1.00 33.39           N  
HETATM 1565  O   HOH B 350       7.855  16.938  27.107  1.00 34.27           O  

However, I want to export the output to a new file. How should I change the last line of the code to do this?

3 Answers

Try this:

pdb_text = open("b.txt", "r")
out = open("out.txt", "w")
# read the whole file into a string
data = pdb_text.read()
for line in data.split('\n'):
    if line.startswith("ATOM") or line.startswith("HETATM"):
        # split('\n') strips the line endings, so add one back
        out.write(line + '\n')
out.close()
pdb_text.close()

This is basically the same Python code, but a little more Pythonic:

with open("b.txt","r") as pdb_text:
    with open("out.txt", "w") as out:
        for line in pdb_text:
            if line.startswith("ATOM") or line.startswith("HETATM"):
                out.write(line)
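
The loop above can also be wrapped in a small reusable function. The following is only a sketch: the name `filter_pdb_records` and the constant `RECORD_PREFIXES` are made up for illustration and are not part of the original answer. It relies on the fact that `str.startswith` accepts a tuple of prefixes.

```python
# Record names to keep; str.startswith accepts a tuple of prefixes.
RECORD_PREFIXES = ("ATOM", "HETATM")


def filter_pdb_records(src_path, dst_path, prefixes=RECORD_PREFIXES):
    """Copy lines starting with any of `prefixes` from src_path to dst_path.

    Returns the number of lines written. (Hypothetical helper for
    illustration only.)
    """
    written = 0
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:
            if line.startswith(prefixes):
                # Lines obtained by iterating over a file keep their
                # trailing "\n", so no newline is added here.
                dst.write(line)
                written += 1
    return written
```

Calling `filter_pdb_records("b.txt", "out.txt")` should then produce the same out.txt as the nested `with` version.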

Otherwise, you could just use your shell's redirection capabilities like this:

python your_program.py > out.txt

On macOS, Linux, or another UNIX flavor, you could even get the same result without Python, using this one-liner:

grep -E "^(ATOM|HETATM)" b.txt > out.txt

You could try this (similar to a previous answer, but "a"ppending to the file rather than "w"riting):

pdb_text = open("b.txt", "r")
# open a separate output file in append mode; appending to b.txt
# itself would modify the input file
out = open("out.txt", "a")
# read the whole file into a string
data = pdb_text.read()
for line in data.split('\n'):
    if line.startswith("ATOM") or line.startswith("HETATM"):
        out.write(line + '\n')
out.close()
pdb_text.close()

2 Comments

Thank you for your response. The code now generates a file. In the output, I expected two lines (one that begins with "ATOM" and one that begins with "HETATM"); however, the code prints the entire output on a single line. What can I do to resolve this issue?
You can change out.write(line) to out.write(line + "\n") so that each record ends with a line break. Hope it helps :)
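
To illustrate why the output ended up on a single line, here is a small self-contained sketch. The sample string is shortened from the records in the question, and the variable names are chosen only for this example. It shows that `split("\n")` drops the separators, so they have to be added back when writing.

```python
text = "ATOM      1  N\nHETATM 1565  O\n"

# split("\n") removes the separators, so the pieces carry no line ending.
pieces = [p for p in text.split("\n") if p.startswith(("ATOM", "HETATM"))]

# Equivalent to calling out.write(line) for each piece.
joined_without_newline = "".join(pieces)

# Equivalent to calling out.write(line + "\n") for each piece.
joined_with_newline = "".join(p + "\n" for p in pieces)

print(repr(joined_without_newline))  # 'ATOM      1  NHETATM 1565  O'
print(repr(joined_with_newline))     # 'ATOM      1  N\nHETATM 1565  O\n'
```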

A more concise and Pythonic answer would be to use context managers.

This writes the selected lines to a new file called out.txt:

# read the whole input file into a string
with open("b.txt", "r") as pdb_text:
    data = pdb_text.read()

# write only the ATOM and HETATM records, one per line
with open("out.txt", "w") as output:
    for line in data.split("\n"):
        if line.startswith("ATOM") or line.startswith("HETATM"):
            output.write(line + "\n")
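
Since the question asks how to change the last line (the `print` call), another option is to keep `print` and pass the output file through its `file` argument. This is a minimal sketch, not something taken from the answers here; the function name `export_records` is hypothetical.

```python
def export_records(src_path, dst_path):
    """Write ATOM/HETATM lines from src_path to dst_path using print().

    Hypothetical helper for illustration only.
    """
    with open(src_path, "r") as pdb_text, open(dst_path, "w") as out:
        for line in pdb_text.read().split("\n"):
            if line.startswith("ATOM") or line.startswith("HETATM"):
                # print() appends "\n" by default, which restores the
                # line ending that split("\n") removed.
                print(line, file=out)
```

With this approach the only change to the original loop is `print(line)` becoming `print(line, file=out)`; for example, `export_records("b.txt", "out.txt")`.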
