I'm a total beginner at coding. I study IT and have a school project in which I must convert a .txt file into an XML file. I have managed to create a tree and subelements, but I must also put an XML namespace in the code, because the finished XML file has to be opened in a program that shows the information as a table (and a bit more). Without the schema from the XML namespace, the program won't open anything. Can someone help me with how to put an .xsd in my code?

This is the schema: http://www.pufbih.ba/images/stories/epp_docs/PaketniUvozObrazaca_V1_0.xsd

Example of the XML file I must create: http://www.pufbih.ba/images/stories/epp_docs/4200575050089_1022.xml

And in the first row I have the schema that I must include: "urn:PaketniUvozObrazaca_V1_0.xsd"

This is the code I have created so far:

import xml.etree.ElementTree as xml

def GenerateXML(GIP1022):
    root=xml.Element("PaketniUvozObrazaca")
    p1=xml.Element("PodaciOPoslodavcu")
    root.append(p1)

    jib=xml.SubElement(p1,"JIBPoslodavca")
    jib.text="4254160150005"
    pos=xml.SubElement(p1,"NazivPoslodavca")
    pos.text="MOJATVRTKA d.o.o. ORAŠJE"
    zah=xml.SubElement(p1,"BrojZahtjeva")
    zah.text="8"
    datz=xml.SubElement(p1,"DatumPodnosenja")
    datz.text="2021-01-01"

    tree=xml.ElementTree(root)
    with open(GIP1022,"wb") as files:
        tree.write(files)

if __name__=="__main__":
    GenerateXML("primjer.xml")

3 Answers

The official documentation is not super explicit as to how one works with namespaces in ElementTree, but the core of it is that ElementTree takes a very fundamental(ist) approach: instead of manipulating namespace prefixes / aliases, ElementTree uses Clark's notation.

So e.g.

<bar xmlns="foo">

or

<x:bar xmlns:x="foo">

(the element bar in the foo namespace) would be written

{foo}bar

>>> from xml.etree.ElementTree import Element, SubElement, QName, tostring, register_namespace
>>> tostring(Element('{foo}bar'), encoding='unicode')
'<ns0:bar xmlns:ns0="foo" />'

Alternatively (and sometimes more conveniently for authoring and manipulating), you can use QName objects, which take either a Clark's notation tag name or a namespace and a tag name separately:

>>> tostring(Element(QName('foo', 'bar')), encoding='unicode')
'<ns0:bar xmlns:ns0="foo" />'

So while ElementTree doesn't have a namespace object per se, you can create namespaced elements like this, probably via a helper partially applying QName:

>>> from functools import partial
>>> ns = partial(QName, 'urn:PaketniUvozObrazaca_V1_0.xsd')
>>> root = Element(ns("PaketniUvozObrazaca"))
>>> SubElement(root, ns("PodaciOPoslodavcu"))
<Element <QName '{urn:PaketniUvozObrazaca_V1_0.xsd}PodaciOPoslodavcu'> at 0x7f502481bdb0>
>>> tostring(root, encoding='unicode')
'<ns0:PaketniUvozObrazaca xmlns:ns0="urn:PaketniUvozObrazaca_V1_0.xsd"><ns0:PodaciOPoslodavcu /></ns0:PaketniUvozObrazaca>'

Now there are a few important considerations here:

First, as you can see, the prefix chosen when serialising is arbitrary. This is in keeping with ElementTree's fundamentalist approach to XML (the prefix should not matter), but the module has since grown a global register_namespace function which allows registering specific prefixes:

>>> register_namespace('xxx', 'urn:PaketniUvozObrazaca_V1_0.xsd')
>>> tostring(root, encoding='unicode')
'<xxx:PaketniUvozObrazaca xmlns:xxx="urn:PaketniUvozObrazaca_V1_0.xsd"><xxx:PodaciOPoslodavcu /></xxx:PaketniUvozObrazaca>'
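
To illustrate why the prefix should not matter to a namespace-aware consumer, here is a minimal sketch that parses three spellings of the same document and compares the resulting Clark-notation tags. The tags helper is a hypothetical name, not part of ElementTree, and the sketch says nothing about whether the program that must open your file is equally tolerant.

```python
import xml.etree.ElementTree as ET

NS = "urn:PaketniUvozObrazaca_V1_0.xsd"

# The same document written with a default namespace and with two arbitrary prefixes.
documents = [
    f'<PaketniUvozObrazaca xmlns="{NS}"><PodaciOPoslodavcu /></PaketniUvozObrazaca>',
    f'<ns0:PaketniUvozObrazaca xmlns:ns0="{NS}"><ns0:PodaciOPoslodavcu /></ns0:PaketniUvozObrazaca>',
    f'<xxx:PaketniUvozObrazaca xmlns:xxx="{NS}"><xxx:PodaciOPoslodavcu /></xxx:PaketniUvozObrazaca>',
]


def tags(xml_text):
    """Return the Clark-notation tag of every element, in document order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


parsed = [tags(document) for document in documents]

# All three spellings produce exactly the same list of qualified tags.
assert parsed[0] == parsed[1] == parsed[2]
print(parsed[0])
```

After parsing, the prefix is gone entirely: only the namespace URI and the local name remain in each tag.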

You can also pass a single default_namespace to (some) serialization functions to specify the, well, default namespace (tostring accepts it since Python 3.8):

>>> tostring(root, encoding='unicode', default_namespace='urn:PaketniUvozObrazaca_V1_0.xsd')
'<PaketniUvozObrazaca xmlns="urn:PaketniUvozObrazaca_V1_0.xsd"><PodaciOPoslodavcu /></PaketniUvozObrazaca>'
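
As a rough sketch of the same option on ElementTree.write(), the following builds a fully qualified tree and writes it with a default namespace, UTF-8 encoding and an XML declaration. The ns helper and the in-memory buffer are illustrative choices for this sketch; the element names and values are taken from the question.

```python
import io
from functools import partial
from xml.etree.ElementTree import Element, ElementTree, QName, SubElement

NS = "urn:PaketniUvozObrazaca_V1_0.xsd"
ns = partial(QName, NS)  # illustrative helper: ns("Tag") builds QName(NS, "Tag")

# Every element is qualified, which the default_namespace option requires.
root = Element(ns("PaketniUvozObrazaca"))
employer = SubElement(root, ns("PodaciOPoslodavcu"))
SubElement(employer, ns("NazivPoslodavca")).text = "MOJATVRTKA d.o.o. ORAŠJE"
SubElement(employer, ns("BrojZahtjeva")).text = "8"

# An in-memory buffer stands in for a file opened with mode "wb".
buffer = io.BytesIO()
ElementTree(root).write(
    buffer,
    encoding="utf-8",
    xml_declaration=True,
    default_namespace=NS,
)

document = buffer.getvalue().decode("utf-8")
print(document)
```

Writing with encoding="utf-8" keeps the Š as a literal character instead of a numeric character reference, which may or may not matter to the consuming program.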

A second, possibly larger, issue is that ElementTree does not support validation.

The Python standard library does not provide support for any validating parser or tree builder, whether DTD, RELAX NG, XML Schema, anything. Not by default, and not optionally.

lxml is probably the main alternative supporting validation (of multiple types of schema). Its core API follows ElementTree but extends it in multiple ways and directions (including much more precise namespace prefix support, and prefix round-tripping). But even then the validation is (AFAIK) mostly explicit, at least when generating / serializing documents.
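
For what explicit validation with lxml might look like, here is a minimal sketch. It assumes the third-party lxml package is installed, and it uses a tiny made-up schema (urn:example with Root and Count elements) rather than the real PaketniUvozObrazaca schema; the validation_errors helper is likewise a hypothetical name.

```python
import io

try:
    from lxml import etree  # third-party package: pip install lxml
except ImportError:  # lets the sketch load even where lxml is missing
    etree = None

# Tiny stand-in schema: an assumption for this sketch, NOT the real schema.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example"
           elementFormDefault="qualified">
  <xs:element name="Root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Count" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

VALID = b'<Root xmlns="urn:example"><Count>3</Count></Root>'
NO_NAMESPACE = b"<Root><Count>3</Count></Root>"


def validation_errors(xml_bytes):
    """Return schema error messages; the list is empty for a valid document."""
    schema = etree.XMLSchema(etree.parse(io.BytesIO(XSD)))
    document = etree.fromstring(xml_bytes)
    if schema.validate(document):
        return []
    return [entry.message for entry in schema.error_log]


if etree is not None:
    print(validation_errors(VALID))         # expected to be an empty list
    print(validation_errors(NO_NAMESPACE))  # expected to report the unqualified Root
```

The second document has the right element names but no namespace, so it should fail validation. That is the same kind of mismatch that can make a consuming program reject a file.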

1 Comment

Thanks for the quick response. In truth, this answer is a little bit hard (advanced) for me, but I will do my best to comprehend it. xD

What you want is to add a default namespace declaration (xmlns="urn:PaketniUvozObrazaca_V1_0.xsd") to the root element. I have edited the code in the question to show you how this can be done.

import xml.etree.ElementTree as ET

def GenerateXML(GIP1022): 
    # Create the PaketniUvozObrazaca root element in the urn:PaketniUvozObrazaca_V1_0.xsd namespace 
    root = ET.Element("{urn:PaketniUvozObrazaca_V1_0.xsd}PaketniUvozObrazaca")

    # Add subelements
    p1 = ET.Element("PodaciOPoslodavcu")
    root.append(p1)

    jib = ET.SubElement(p1,"JIBPoslodavca")
    jib.text = "4254160150005"
    pos = ET.SubElement(p1,"NazivPoslodavca")
    pos.text = "MOJATVRTKA d.o.o. ORAŠJE"
    zah = ET.SubElement(p1,"BrojZahtjeva")
    zah.text = "8"
    datz = ET.SubElement(p1,"DatumPodnosenja")
    datz.text = "2021-01-01"

    # Make urn:PaketniUvozObrazaca_V1_0.xsd the default namespace (no prefix)
    ET.register_namespace("", "urn:PaketniUvozObrazaca_V1_0.xsd")

    # Prettify output (requires Python 3.9 or later)
    ET.indent(root)

    tree = ET.ElementTree(root)

    with open(GIP1022,"wb") as files:
        tree.write(files)

if __name__=="__main__":
    GenerateXML("primjer.xml")

Contents of primjer.xml:

<PaketniUvozObrazaca xmlns="urn:PaketniUvozObrazaca_V1_0.xsd">
  <PodaciOPoslodavcu>
    <JIBPoslodavca>4254160150005</JIBPoslodavca>
    <NazivPoslodavca>MOJATVRTKA d.o.o. ORA&#352;JE</NazivPoslodavca>
    <BrojZahtjeva>8</BrojZahtjeva>
    <DatumPodnosenja>2021-01-01</DatumPodnosenja>
  </PodaciOPoslodavcu>
</PaketniUvozObrazaca>

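Note that only the root element is explicitly bound to a namespace in the code. The subelements are added with unqualified names. Because they are serialized without a prefix inside the root's default namespace declaration, any parser reading primjer.xml places them in that namespace as well. The end result is an XML document (primjer.xml) where all elements belong to the same default namespace.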
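
A small round-trip sketch can make this visible. It assumes the same namespace URI as above and uses only two of the elements; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

NS = "urn:PaketniUvozObrazaca_V1_0.xsd"
ET.register_namespace("", NS)  # note: this is a global, process-wide setting

root = ET.Element(f"{{{NS}}}PaketniUvozObrazaca")
child = ET.SubElement(root, "PodaciOPoslodavcu")  # unqualified on purpose

serialized = ET.tostring(root, encoding="unicode")
reparsed = ET.fromstring(serialized)

in_memory_tag = child.tag       # the tag as it was created in code
reparsed_tag = reparsed[0].tag  # the tag as a parser sees it in the output

print(serialized)
print(in_memory_tag)
print(reparsed_tag)

# A namespace-qualified search only matches after the round trip.
qualified_name = f"{{{NS}}}PodaciOPoslodavcu"
assert root.find(qualified_name) is None
assert reparsed.find(qualified_name) is not None
```

The in-memory tree and the written document therefore differ slightly: searching the original tree needs the unqualified name, while searching the parsed file needs the qualified one.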

The above is not the only way to create an element in a namespace. For example, instead of the {namespace-uri}name notation, the QName class can be used. See https://stackoverflow.com/a/58678592/407651.

1 Comment

BRILLIANT, thank you a lot!!! This solved my problem!!! The software that I must use to read the XML file read it and created an empty table. Now I must write the rest of my XML file to see if all elements will appear in the table in the software.

The tree.write() method takes a default_namespace argument.

What happens if you change that line to the following?

tree.write(files, default_namespace="urn:PaketniUvozObrazaca_V1_0.xsd")
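
For reference, a minimal sketch of how default_namespace behaves with unqualified versus qualified element names. The serialize helper and the in-memory buffer are illustrative, not part of the question's code.

```python
import io
import xml.etree.ElementTree as ET

NS = "urn:PaketniUvozObrazaca_V1_0.xsd"


def serialize(root):
    """Write the tree to memory with NS as the default namespace."""
    buffer = io.BytesIO()
    ET.ElementTree(root).write(buffer, default_namespace=NS)
    return buffer.getvalue().decode("ascii")


# Elements created with plain names, as in the question.
unqualified = ET.Element("PaketniUvozObrazaca")
ET.SubElement(unqualified, "PodaciOPoslodavcu")
try:
    serialize(unqualified)
    error = None
except ValueError as exc:
    error = str(exc)
print(error)

# The same elements created in Clark's notation.
qualified = ET.Element(f"{{{NS}}}PaketniUvozObrazaca")
ET.SubElement(qualified, f"{{{NS}}}PodaciOPoslodavcu")
output = serialize(qualified)
print(output)
```

In other words, default_namespace only controls how qualified names are written; it does not put unqualified elements into the namespace for you.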

3 Comments

Sorry for my late response, and thank you for the help. If I understood you correctly, this is the kind of error I get: imgur.com/a/LK0gMrr
A search of SO (or indeed Google) for "ValueError: cannot use non-qualified names with default_namespace option" produces this answer. @Andomar isn't doing exactly the same thing as you, but does it work to put xml.register_namespace("", "urn:PaketniUvozObrazaca_V1_0.xsd") just before your root=xml.Element("PaketniUvozObrazaca")?
I tried that way and got no errors. But it didn't solve my problem, because it does not put the schema in my XML; afterwards my XML looks like this ( imgur.com/a/cWhSSya ). You can see there is no link in the XML that I wrote, and it must be the same as in the document on the left so that my software can read the XML I create. I'm now trying to add the xsd with the lxml package, but of course this is a pain in the ass too....
