I've noticed that Python's ElementTree module changes the XML data in the following simple example:

import xml.etree.ElementTree as ET
tree = ET.parse("./input.xml")
tree.write("./output.xml")

I wouldn't expect the file to change, since I only read it and wrote it back without any modification. However, the result tells a different story, especially in the namespace prefixes (default namespace --> ns0, d3p1 --> ns1, i --> ns2):

input.xml:

<?xml version="1.0" encoding="utf-8"?>
<ServerData xmlns:i="http://www.a.org" xmlns="http://schemas.xxx/2004/07/Server.Facades.ImportExport">
<CreationDate>0001-01-01T00:00:00</CreationDate>
<Processes>
    <Processes xmlns:d3p1="http://schemas.datacontract.org/2004/07/Management.Interfaces">
        <d3p1:ProtectedProcess>
            <d3p1:Description>/Applications/Safari.app/Contents/MacOS/Safari</d3p1:Description>
            <d3p1:DiscoveredMachine i:nil="true" />
            <d3p1:Id>0</d3p1:Id>
            <d3p1:Name>/applications/safari.app/contents/macos/safari</d3p1:Name>
            <d3p1:Path>/Applications/Safari.app/Contents/MacOS/Safari</d3p1:Path>
            <d3p1:ProcessHashes xmlns:d5p1="http://schemas.datacontract.org/2004/07/Management.Interfaces.WildFire" />
            <d3p1:Status>1</d3p1:Status>
            <d3p1:Type>Protected</d3p1:Type>
        </d3p1:ProtectedProcess>
    </Processes>
</Processes>
</ServerData>

and output.xml:

<ns0:ServerData xmlns:ns0="http://schemas.xxx/2004/07/Server.Facades.ImportExport" xmlns:ns1="http://schemas.datacontract.org/2004/07/Management.Interfaces" xmlns:ns2="http://www.a.org">
<ns0:CreationDate>0001-01-01T00:00:00</ns0:CreationDate>
<ns0:Processes>
    <ns0:Processes>
        <ns1:ProtectedProcess>
            <ns1:Description>/Applications/Safari.app/Contents/MacOS/Safari</ns1:Description>
            <ns1:DiscoveredMachine ns2:nil="true" />
            <ns1:Id>0</ns1:Id>
            <ns1:Name>/applications/safari.app/contents/macos/safari</ns1:Name>
            <ns1:Path>/Applications/Safari.app/Contents/MacOS/Safari</ns1:Path>
            <ns1:ProcessHashes />
            <ns1:Status>1</ns1:Status>
            <ns1:Type>Protected</ns1:Type>
        </ns1:ProtectedProcess>
    </ns0:Processes>
</ns0:Processes>
</ns0:ServerData>

1 Answer

You need to register each namespace URI in your XML, together with its prefix, using the ET.register_namespace function before writing the XML with ElementTree (registering before parsing, as below, works too, since the registry is only consulted when serializing). Example:

import xml.etree.ElementTree as ET

ET.register_namespace('','http://schemas.xxx/2004/07/Server.Facades.ImportExport')
ET.register_namespace('i','http://www.a.org')
ET.register_namespace('d3p1','http://schemas.datacontract.org/2004/07/Management.Interfaces')

tree = ET.parse("./input.xml")
tree.write("./output.xml")

Without this, ElementTree generates its own prefixes (ns0, ns1, ...) for the corresponding namespaces, which is what happens in your case.
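
To see the effect without touching any files, here is a minimal sketch that serializes the same tree before and after registration. The SAMPLE string is a cut-down, made-up version of your input (the Item element is hypothetical); only the two namespace URIs are taken from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for input.xml; the namespace URIs
# are the ones from the question, the Item element is illustrative.
SAMPLE = (
    '<ServerData xmlns:i="http://www.a.org" '
    'xmlns="http://schemas.xxx/2004/07/Server.Facades.ImportExport">'
    '<Item i:nil="true" />'
    '</ServerData>'
)

root = ET.fromstring(SAMPLE)

# Nothing registered yet: ElementTree invents ns0, ns1, ... prefixes.
before = ET.tostring(root, encoding="unicode")

# The registry is global and is consulted at serialization time,
# so registering after parsing still affects the output.
ET.register_namespace("", "http://schemas.xxx/2004/07/Server.Facades.ImportExport")
ET.register_namespace("i", "http://www.a.org")

after = ET.tostring(root, encoding="unicode")

print(before)
print(after)
```

The first line printed should use the generated ns0/ns1 prefixes, while the second should keep the default namespace and the i prefix. Note that namespace declarations that are never used by a tag or attribute (such as d5p1 in your input) are still dropped on output.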

This is described in the documentation:

xml.etree.ElementTree.register_namespace(prefix, uri)

Registers a namespace prefix. The registry is global, and any existing mapping for either the given prefix or the namespace URI will be removed. prefix is a namespace prefix. uri is a namespace uri. Tags and attributes in this namespace will be serialized with the given prefix, if at all possible.

(Emphasis mine)
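
If you would rather not hard-code the prefixes, one possible approach is to collect the declarations from the document itself with ET.iterparse and the "start-ns" event, then register each pair. This is only a sketch: register_declared_namespaces is a hypothetical helper name, the SAMPLE document and its urn:example URIs are made up, and it assumes that no prefix in the file looks like ns0, ns1, ... (register_namespace rejects those with a ValueError) and that each prefix maps to a single URI throughout the file.

```python
import io
import xml.etree.ElementTree as ET


def register_declared_namespaces(source):
    """Register every prefix/URI pair declared in `source`.

    `source` is a filename or a binary file object. Returns the list
    of (prefix, uri) pairs found; the default namespace has prefix ''.
    Hypothetical helper, not part of ElementTree.
    """
    pairs = [ns for _event, ns in ET.iterparse(source, events=("start-ns",))]
    for prefix, uri in pairs:
        # Global registry: a later pair replaces any earlier mapping
        # for the same prefix or the same URI.
        ET.register_namespace(prefix, uri)
    return pairs


# Made-up sample document standing in for a real file on disk.
SAMPLE = (
    b'<a xmlns="urn:example:default" xmlns:d3p1="urn:example:d3p1">'
    b'<d3p1:b />'
    b'</a>'
)

pairs = register_declared_namespaces(io.BytesIO(SAMPLE))
out = ET.tostring(ET.fromstring(SAMPLE), encoding="unicode")
print(out)
```

With a real file you would call register_declared_namespaces("./input.xml") first and then parse and write as before. This reads the file twice, which is usually acceptable for small documents.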
