There is a file format called .xyz that helps visualize molecular bonds. The format requires a specific pattern:

The first line must contain the number of atoms, which in my case is 30. After that comes the data, one atom per line: the first column is the name of the atom (in my case they are all carbon), the second column is the x coordinate, the third is the y coordinate, and the last is the z coordinate (all 0 in my case). So something like this:

30
C x1 y1 z1
C x2 y2 z2
...

I generated my data in C++ into a text file like this:

C       2.99996     7.31001e-05     0
C       2.93478     0.623697        0
C       2.74092     1.22011     0
C       2.42702     1.76343     0
C       2.0079      2.22961     0
C       1.50006     2.59812     0
C       0.927076        2.8532      0
C       0.313848        2.98349     0
C       -0.313623       2.9837      0
C       -0.927229       2.85319     0
C       -1.5003     2.5981      0
C       -2.00732        2.22951     0
C       -2.42686        1.76331     0
C       -2.74119        1.22029     0
C       -2.93437        0.623802        0
C       -2.99992        -5.5509e-05     0
C       -2.93416        -0.623574       0
C       -2.7409     -1.22022        0
C       -2.42726        -1.7634     0
C       -2.00723        -2.22941        0
C       -1.49985        -2.59809        0
C       -0.92683        -2.85314        0
C       -0.313899       -2.98358        0
C       0.31363     -2.98356        0
C       0.927096        -2.85308        0
C       1.50005     -2.59792        0
C       2.00734     -2.22953        0
C       2.4273      -1.76339        0
C       2.74031     -1.22035        0
C       2.93441     -0.623647       0

So now I want to write this data into a .xyz file. I saw online that people do this with Python, in which I have almost no experience. So I looked around and came up with this script:

#!/usr/bin/env python
text_file = open("output.txt","r")
lines = text_file.readlines()
myfile = open("output.xyz","w")
for line in lines:
     atom, x, y, z = line.split()
     myfile.write("%s\t %d\t %d\t %d\t" %(atom,x,y,z))
myfile.close()
text_file.close()

However when I run this, it gives the following error: "%d format: a number is required, not str."

This doesn't make sense to me since, as you can see in the .txt file, the values are all numbers apart from the first column. I tried changing my d's into s's, but then the program I load this data into gave an error.

To summarize: I have a data file in .txt format and I want to convert it into the .xyz format described above, but I am running into problems.

Thanks in advance.

2 Answers

A string can represent a number as well, but the two have different types: '1' (a string) and 1 (an integer) are not the same thing. line.split() always returns strings, even when they look like numbers, so use %s instead:

myfile.write("%s\t %s\t %s\t %s\n" % (atom, x, y, z))

If you want them to be floats, convert them during the parsing stage:

x, y, z = map(float, (x, y, z))
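To see the difference in practice, here is a small sketch using one line of the data from the question. Note that even after converting to float, %d would truncate each value to an integer, so %f (or %s) is the better fit for coordinates.

```python
line = "C       2.93478     0.623697        0"
atom, x, y, z = line.split()
print(type(x))  # split() always returns strings, even for numeric-looking text

try:
    "%d" % x  # %d needs a number, so a str raises TypeError
except TypeError as err:
    print("TypeError:", err)

# Convert the three coordinate strings to floats.
x, y, z = map(float, (x, y, z))

# %d would now run, but it drops the fractional part.
print("%d" % x)

# %f keeps the decimals.
print("%s %f %f %f" % (atom, x, y, z))
```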

Also, %-formatting is the older style in Python; str.format is generally preferred in new code:

myfile.write("{}\t {}\t {}\t {}\n".format(atom, x, y, z))
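Putting these pieces together, a minimal sketch of the whole conversion could look like this. The function name txt_to_xyz and the default comment text are made up for illustration. It also assumes the usual .xyz layout, with the atom count on the first line and a free-form comment line second, and it separates fields with single spaces instead of tabs.

```python
def txt_to_xyz(in_path, out_path, comment="converted from text file"):
    """Convert whitespace-separated 'symbol x y z' lines into an .xyz file.

    Hypothetical helper for illustration; returns the number of atoms written.
    """
    atoms = []
    with open(in_path) as text_file:
        for line in text_file:
            if not line.strip():
                continue  # skip blank lines
            atom, x, y, z = line.split()
            # split() yields strings, so convert the coordinates to floats
            atoms.append((atom, float(x), float(y), float(z)))

    with open(out_path, "w") as xyz_file:
        # Header: atom count, then a comment line (assumed .xyz layout)
        xyz_file.write("{}\n{}\n".format(len(atoms), comment))
        for atom, x, y, z in atoms:
            xyz_file.write("{} {:.6f} {:.6f} {:.6f}\n".format(atom, x, y, z))
    return len(atoms)
```

For the files in the question, it would be called as txt_to_xyz("output.txt", "output.xyz").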

5 Comments

Thank you, but if I wrote x, y, z = map(float, (x, y, z)) inside my for loop, it wouldn't be correct, right? And if I do it outside of the for loop, it wouldn't work either, because the variables aren't defined at that point. Where do you suggest I put that line of code?
Yeah, you should do it in your for loop. Right after line.split().
Many thanks, I don't get any errors in the code now, but somehow it still doesn't work. Tomorrow morning I'll try to write some code that reads the .xyz file to check whether what I'm writing to the file is correct. If it is, then maybe I'm doing something else wrong, because for some reason I am still getting errors. I'll look into it once I know whether I am writing the data in the correct format. Thanks again for the valuable suggestions.
No worries, what errors are you getting? If that's a new question please ask it as one. You can accept my answer if it's answered your original question.
Thanks again, I'll ask another question
1

Maybe the problem you faced was caused by the "\t" (tab) characters in the other answer.

The .xyz format uses only spaces to separate the fields on a line, as stated here. A single space would be enough, but for an easily readable file, formatted the way other programs do when saving .xyz files, it's better to use the tips from https://pyformat.info/

My working code (in Python 3) for generating .xyz files, using molecule and atom objects from the CSD Python API library, is the following; you can adapt it to your case:

with open("file.xyz", 'w') as xyz_file:
    xyz_file.write("%d\n%s\n" % (len(molecule.atoms), title))
    for atom in molecule.atoms:
        xyz_file.write("{:4} {:11.6f} {:11.6f} {:11.6f}\n".format(
            atom.atomic_symbol, atom.coordinates.x, atom.coordinates.y, atom.coordinates.z))

The first two lines are the number of atoms and the title of the xyz file. The remaining lines are the atoms (atomic symbol and 3D position).

The atomic symbol gets 4 characters, aligned to the left: {:4}

Then this is repeated 3 times: {:11.6f}

Each repetition is one space followed by a coordinate that occupies 11 characters, aligned to the right, 6 of them after the decimal point. That fits numbers from -999.999999 to 9999.999999, which is usually sufficient. Numbers outside this interval only break the column alignment; the mandatory space between fields is kept, so the xyz file still works in those cases.

The result is like this:

18
WERPOY
N       0.655321    3.658330   14.594159
C       1.174111    4.551873   13.561374
C       0.703656    3.889113   15.926147
S       1.455530    5.313258   16.574524
C       1.127601    5.061435   18.321297
N       0.146377    2.914698   16.639984
C      -0.288067    2.014580   15.736297
C       0.014111    2.441298   14.475693
N      -0.266880    1.787085   13.260652
O      -0.831165    0.699580   13.319885
O       0.056329    2.322290   12.209641
H       0.439235    4.780025   12.970561
H       1.821597    4.148825   13.137629
H       1.519448    5.312600   13.775525
H       1.522843    5.786000   18.590124
H       0.171557    5.069325   18.423056
H       1.477689    4.168550   18.574936
H      -0.665073    1.282125   16.053727
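
The format string can also be tried on its own, without the CSD Python API. Below is a minimal sketch; the helper name format_xyz_line is made up for illustration and is not part of any library. It reproduces the first atom line of the output above and shows that an out-of-range value widens its column but still leaves the fields separated by spaces.

```python
def format_xyz_line(symbol, x, y, z):
    """Format one atom: left-aligned symbol, three right-aligned coordinates."""
    return "{:4} {:11.6f} {:11.6f} {:11.6f}".format(symbol, x, y, z)


# Values taken from the first atom of the sample output.
print(format_xyz_line("N", 0.655321, 3.658330, 14.594159))

# A coordinate too wide for 11 characters shifts the columns,
# but the literal spaces in the format string still separate the fields.
wide = format_xyz_line("C", -12345.678, 0.0, 0.0)
print(wide)
print(wide.split())  # still four whitespace-separated fields
```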

2 Comments

If you're using the CSD-Python-API you can just use the MolecularWriter Class to export out the xyz file BTW :D
@AlexAMC Cool! I searched the documentation and didn't find it at the time. Thanks!
