
I have two folders: one contains thousands of images and the other contains the corresponding .xml files. Each XML file has the same base name as its image (e.g. 2007.xml and 2007.jpg). I would like to add the image name (2007.jpg) to its corresponding XML file (2007.xml). The .xml file format is:

<?xml version='1.0' encoding='ASCII'?>
<annotation>
  <size>
    <width>1820</width>
    <height>940</height>
  </size>
  <object>
    <name>Car</name>
    <bndbox>
      <xmin>74.0</xmin>
      <ymin>509.0</ymin>
      <xmax>236.0</xmax>
      <ymax>609.0</ymax>
    </bndbox>
  </object>
</annotation>

I want to add a new <filename> subelement as the first child of <annotation>:

<?xml version='1.0' encoding='ASCII'?>
<annotation>
  <filename>2007.jpg</filename>
  <size>
    <width>1820</width>
    <height>940</height>
  </size>
  <object>
    <name>Car</name>
    <bndbox>
      <xmin>74.0</xmin>
      <ymin>509.0</ymin>
      <xmax>236.0</xmax>
      <ymax>609.0</ymax>
    </bndbox>
  </object>
</annotation>

I am trying this way:

import xml.etree.ElementTree as ET
import os
doc = ET.parse('00390.xml')
root = doc.getroot()
s = '/image/00390.jpg'
filename = (os.path.basename(s))
userElement = ET.Element("annotation")
newSub = ET.SubElement(userElement, "filename")
newSub.set(filename, '')
root.insert(0, newSub)
tree = ET.ElementTree(root)
tree.write(open('3.xml', 'w'), encoding = 'UTF-8')

The output I get is <filename 00390.jpg=""/>, although it should be <filename>00390.jpg</filename>. I think the issue is the way I am using newSub.set().

  • When I tried to add it, it was always added last (just before </annotation>). Commented Mar 12, 2020 at 15:37
  • You should be able to use insert(), which takes an index: docs.python.org/3/library/…. Similar question: stackoverflow.com/q/25824920/407651 Commented Mar 12, 2020 at 15:45
  • mzjn, I have added my code to the question. Please let me know what I am doing wrong. Commented Mar 12, 2020 at 16:15
  • @Sanjay feel free to accept whichever answer solved the problem. Commented Mar 13, 2020 at 18:53

2 Answers

Updated answer for your new problem

import xml.etree.ElementTree as ET
import os
doc = ET.parse('00390.xml')
root = doc.getroot()
s = '/image/00390.jpg'
filename = (os.path.basename(s))
userElement = ET.Element("annotation")
newSub = ET.SubElement(userElement, "filename")
newSub.set(filename, '')  # <----- (*)
root.insert(0, newSub)
tree = ET.ElementTree(root)
tree.write(open('3.xml', 'w'), encoding = 'UTF-8')

This produces

<filename 00390.jpg=""/>

Instead of

<filename>00390.jpg</filename>

This is because on the marked line (*), set() creates an attribute named 00390.jpg with an empty value instead of setting the text of the subelement.

To fix this, replace newSub.set(filename, '') with:

newSub.text = filename  # assigns the element's text
root.insert(0, newSub)
# Produces <filename>00390.jpg</filename>

See an example here
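
For completeness, here is a minimal sketch of the same fix wrapped in a small function, plus a loop over a folder of annotation files as described in the question. The names add_filename and annotate_folder, the ".jpg" default, and the in-memory sample are my own assumptions for illustration, not something from the question.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def add_filename(root, image_name):
    """Insert <filename>image_name</filename> as the first child of root."""
    elem = ET.Element("filename")
    elem.text = image_name   # element text, not an attribute
    root.insert(0, elem)     # index 0 places it ahead of <size>
    return root


def annotate_folder(xml_dir, image_ext=".jpg"):
    """Hypothetical helper: rewrite every .xml file in xml_dir in place."""
    for xml_path in sorted(Path(xml_dir).glob("*.xml")):
        tree = ET.parse(str(xml_path))
        # The image shares the XML file's base name, e.g. 2007.xml -> 2007.jpg
        add_filename(tree.getroot(), xml_path.stem + image_ext)
        tree.write(str(xml_path), encoding="ASCII", xml_declaration=True)


# Quick check on an in-memory document (assumed sample, trimmed from the question)
sample = "<annotation><size><width>1820</width><height>940</height></size></annotation>"
root = add_filename(ET.fromstring(sample), "2007.jpg")
result = ET.tostring(root, encoding="unicode")
print(result)
```

Since annotate_folder overwrites the files, it is probably worth trying it on a copy of the folder first.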


As @mzjn mentioned, try the Element.insert method. It lets you specify the index at which the new element is inserted.

For example, to insert before the second element:

import xml.etree.ElementTree as ET

# Your tree
root = ET.fromstring('''
<element>
    <att1></att1>
    <att3></att3>
</element>
''')

# Create a new element
new = ET.Element('att2')
root.insert(1, new)  # <----------- Insert operation
print(ET.tostring(root))

# Output (whitespace tidied for readability)
"""
<element>
    <att1/>
    <att2/>  # newly inserted
    <att3/>
</element>
"""

Edit:

The ElementTree.write method defaults to us-ascii encoding and therefore expects a file opened in binary mode. From the documentation:

The output is either a string (str) or binary (bytes). This is controlled by the encoding argument. If encoding is "unicode", the output is a string; otherwise, it’s binary. Note that this may conflict with the type of file if it’s an open file object; make sure you do not try to write a string to a binary stream and vice versa.

So either open the file for writing in binary mode:

tree.write(open('person.xml', 'wb'))

or open the file for writing in text mode and give "unicode" as encoding:

tree.write(open('person.xml', 'w'), encoding='unicode')
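
A small sketch of both options, using with blocks so the files get closed, and writing into a temporary directory. The file names and the one-line sample tree are assumptions for illustration. The third write passes xml_declaration=True, which, as far as I know, is how you get write() to emit the <?xml ...?> line for ASCII output.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

tree = ET.ElementTree(
    ET.fromstring("<annotation><filename>2007.jpg</filename></annotation>")
)

with tempfile.TemporaryDirectory() as tmp:
    binary_path = os.path.join(tmp, "binary.xml")
    text_path = os.path.join(tmp, "text.xml")
    declared_path = os.path.join(tmp, "declared.xml")

    # Binary mode: write() encodes to bytes (us-ascii by default)
    with open(binary_path, "wb") as f:
        tree.write(f)

    # Text mode: encoding="unicode" makes write() emit str
    with open(text_path, "w") as f:
        tree.write(f, encoding="unicode")

    # Passing a path lets write() open the file itself;
    # xml_declaration=True adds the <?xml ...?> line
    tree.write(declared_path, encoding="ASCII", xml_declaration=True)

    with open(binary_path, "rb") as f:
        binary_output = f.read()
    with open(text_path) as f:
        text_output = f.read()
    with open(declared_path, "rb") as f:
        declared_output = f.read()

print(declared_output.decode("ascii"))
```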

9 Comments

I have added my code. Please let me know what I am doing wrong.
Try printing print(ET.tostring(root)) before the two lines tree = ET.ElementTree(root) and tree.write("33.xml"). What is the output? Is it modified?
The "unicode" option works. It added <filename/> after <annotation>, but I need <?xml version='1.0' encoding='ASCII'?> <annotation> <filename>2007.jpg</filename>
I want to add the image name (2007.jpg) to its corresponding file, like <filename>2007.jpg</filename>
I may be wrong; if I am, do correct me! Since the XML and image files have the same name, you can use the os module to get the name of the XML file and then use that as the value in the element.
