1

I have a code that in principle is to open the file content and wrap it with an additional import tag:

with open('oferta-empik.xml', 'r+', encoding='utf-8') as f:
  xml = '<import>' + f.read() + '</import>'
  print(xml)
  f.write(xml)
  f.close()

Unfortunately, after saving half the code is unchanged, and then the xml code already wrapped in the import is inserted into the file.

In total, the file duplicates the xml code where the first original is unchanged and then the same is appended to the end of the file wrapped with the import tag

ORIGINAL CODE:

<offers>
  <offer>
    <leadtime-to-ship>1</leadtime-to-ship>
    <product-id-type>EAN</product-id-type>
    <state>11</state>
    <quantity>0</quantity>
    <price>146</price>
    <sku>B01.001.1.10</sku>
  </offer>
</offer>

AFTER CODE:

<offers>
  <offer>
    <leadtime-to-ship>1</leadtime-to-ship>
    <product-id-type>EAN</product-id-type>
    <state>11</state>
    <quantity>0</quantity>
    <price>146</price>
    <sku>B01.001.1.10</sku>
  </offer>
</offer>
<import><offers>
  <offer>
    <leadtime-to-ship>1</leadtime-to-ship>
    <product-id-type>EAN</product-id-type>
    <state>11</state>
    <quantity>0</quantity>
    <price>146</price>
    <sku>B01.001.1.10</sku>
  </offer>
</offer></import>

2 Answers 2

2

the issue is that you're appending the new text (the new XML) to the end of the file. You're reading the entire file, and then write the modified XML at the end of that file.

There are two solutions:

  • Recommended: open the file for reading. Read the XML. Close it, and then open it for writing and write the entire thing (override the initial content).
  • Not Recommended: After you read, seek to the beginning of the file (with f.seek(0)) and write the new content. This solution is not recommended because if, at some point, the new content is shorter than the original content, the result will be inconsistent / messed-up.
Sign up to request clarification or add additional context in comments.

3 Comments

Not recommended in general: Treating XML as text files. This answer works for text files but it is not a particularly good suggestion for XML.
True (very much). But, for some simple cases, a 'messy' solution is a reasonable approach.
It already falls apart when the source XML file contains an XML declaration (or is in a different encoding than the one used in the open() call). It's never a reasonable approach to treat XML as text.
0

I have a code that in principle is to open the file content and wrap it with an additional import tag

Your current approach is wrong. Don't open XML files as text files, don't treat XML as text. Always use a parser.

This is a lot better:

import xml.etree.ElementTree as ET

# 1: load current document and top level element
old_tree = ET.parse('oferta-empik.xml')
old_root = old_tree.getroot()

# 2: create <import> element to serve as new top level
new_root = ET.Element('import')

# 3: insert current document root ("wrap it in <import>")
new_root.insert(0, old_root)

# 4 make new ElementTree and write it to file 
new_tree = ET.ElementTree(new_root)
with open('output.xml', 'wb') as f:
    new_tree.write(f, encoding='utf8')

Compressed:

new_root = ET.Element('import')
new_root.insert(0, ET.parse('oferta-empik.xml').getroot())
with open('output.xml', 'wb') as f:
    ET.ElementTree(new_root).write(f, encoding='utf8')

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.