I'm writing a Python script to update Visual Studio project files. They look like this:

<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="4.0" DefaultTargets="Build" 
      xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
      ...

The following code reads and then writes the file:

import xml.etree.ElementTree as ET

tree = ET.parse(projectFile)
root = tree.getroot()
tree.write(projectFile,
           xml_declaration = True,
           encoding = 'utf-8',
           method = 'xml',
           default_namespace = "http://schemas.microsoft.com/developer/msbuild/2003")

Python throws an error at the last line, saying:

ValueError: cannot use non-qualified names with default_namespace option

This is surprising, since I'm just reading and writing with no editing in between. Visual Studio refuses to load XML files without a default namespace, so omitting it is not an option.

Why does this error occur? Suggestions or alternatives welcome.

3 Answers

This is a duplicate of "Saving XML files using ElementTree".

The solution is to define your default namespace BEFORE parsing the project file.

ET.register_namespace('',"http://schemas.microsoft.com/developer/msbuild/2003")

Then write out your file as

tree.write(projectFile,
           xml_declaration = True,
           encoding = 'utf-8',
           method = 'xml')

You have now round-tripped your file successfully and avoided the creation of ns0 prefixes on every tag.
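As a minimal end-to-end sketch of the round trip described above, the following parses an in-memory copy of the project file from the question and writes it to a buffer instead of to disk. The `roundtrip` helper, the `SAMPLE` document, and the use of `io.BytesIO` are illustrative assumptions, not part of the original answer.

```python
import io
import xml.etree.ElementTree as ET

MSBUILD_NS = "http://schemas.microsoft.com/developer/msbuild/2003"

# Sample document modelled on the project file in the question (assumption).
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="4.0" DefaultTargets="Build"
      xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
  </PropertyGroup>
</Project>"""


def roundtrip(xml_bytes):
    """Parse xml_bytes and serialize it again with the MSBuild URI as default namespace."""
    # Map the empty prefix to the MSBuild namespace so no ns0: prefixes are written.
    ET.register_namespace('', MSBUILD_NS)
    root = ET.fromstring(xml_bytes)
    buf = io.BytesIO()
    # Same write() arguments as in the answer, minus default_namespace.
    ET.ElementTree(root).write(buf,
                               xml_declaration=True,
                               encoding='utf-8',
                               method='xml')
    return buf.getvalue().decode('utf-8')


result = roundtrip(SAMPLE)
print(result)
```

With a real project, `ET.parse(projectFile)` and `tree.write(projectFile, ...)` would take the place of the in-memory buffer.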

1 Comment

This approach works. However, find('mytag', ns) and findall('mytag', ns) methods fail (they return an empty list of elements). It seems that they require a non-empty namespace name. Which is fine, unless you want to write() the XML file with an empty namespace prefix for elements in the default namespace. (Using Python 2.7.)
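One possible way around the lookup problem described in this comment is to keep the empty prefix for writing only and pass a separate mapping with a non-empty prefix to `find()` and `findall()`. This is a sketch under assumptions: the `msb` prefix is an arbitrary name chosen here, and the `Configuration` element is sample content added for illustration.

```python
import xml.etree.ElementTree as ET

MSBUILD_NS = "http://schemas.microsoft.com/developer/msbuild/2003"

# Empty prefix: affects serialization only.
ET.register_namespace('', MSBUILD_NS)

# Sample document; the Configuration child is illustrative (assumption).
root = ET.fromstring(
    b'<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">'
    b'<PropertyGroup><Configuration>Debug</Configuration></PropertyGroup>'
    b'</Project>'
)

# Non-empty prefix used only for searching; 'msb' is an arbitrary choice.
ns = {'msb': MSBUILD_NS}

groups = root.findall('msb:PropertyGroup', ns)
config = root.find('msb:PropertyGroup/msb:Configuration', ns)

# Equivalent lookup without a mapping, spelling out the {uri}tag form.
same_group = root.find('{%s}PropertyGroup' % MSBUILD_NS)

print(len(groups), config.text)
```

The search prefix never appears in the written file, because serialization uses the registered prefixes and not the mapping passed to `find()`.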

I think lxml does a better job of handling namespaces. It aims for an ElementTree-like interface but uses libxml2 underneath.

>>> import lxml.etree
>>> doc = lxml.etree.fromstring(b"""<?xml version="1.0" encoding="utf-8"?>
... <Project ToolsVersion="4.0" DefaultTargets="Build" 
...       xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
...   <PropertyGroup>
...   </PropertyGroup>
... </Project>""")

>>> print(lxml.etree.tostring(doc, xml_declaration=True, encoding='utf-8', method='xml', pretty_print=True).decode('utf-8'))
<?xml version='1.0' encoding='utf-8'?>
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" ToolsVersion="4.0" DefaultTargets="Build">
  <PropertyGroup>
  </PropertyGroup>
</Project>

3 Comments

+1 Though it looks like lxml.etree does not come with Python for Windows, so I'll accept WombatPM's answer
@WombatPM's answer is great too, but lxml is available for Windows. You can use pip, easy_install and (I think) ActiveState's PyPM, or just grab it from PyPI (pypi.python.org/pypi/lxml/3.2.3).
I've used lxml as well. Both are good.

This was the closest answer I could find to my problem. However, putting

ET.register_namespace('',"http://schemas.microsoft.com/developer/msbuild/2003")

just before parsing my file did not work for me.

You need to find the specific namespace that the XML file you are loading uses. To do that, I printed the tag of an element in the parsed tree, which gave me the namespace (in curly braces) followed by the tag name. Copy that namespace into:

ET.register_namespace('',"XXXXX YOUR NAMESPACEXXXXXX")

before you start parsing your file. That should remove all the namespace prefixes when you write.
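The lookup described above can be sketched as follows. The `namespace_of` helper and the `SAMPLE` document are hypothetical names introduced for illustration. ElementTree stores a qualified tag as `{uri}localname`, so the URI can be read from the text between the braces.

```python
import io
import xml.etree.ElementTree as ET

# Sample document standing in for the file being loaded (assumption).
SAMPLE = b"""<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup />
</Project>"""


def namespace_of(element):
    """Return the namespace URI embedded in element.tag, or '' if there is none."""
    tag = element.tag
    if tag.startswith('{'):
        return tag[1:].split('}', 1)[0]
    return ''


root = ET.fromstring(SAMPLE)
print(root.tag)  # {http://schemas.microsoft.com/developer/msbuild/2003}Project

detected = namespace_of(root)

# In this sketch the URI is registered after parsing, because it is read from
# the parsed root; the registry is consulted when the tree is written.
ET.register_namespace('', detected)

buf = io.BytesIO()
ET.ElementTree(root).write(buf, xml_declaration=True, encoding='utf-8', method='xml')
output = buf.getvalue().decode('utf-8')
print(output)
```

If the URI is already known from a first inspection, it can simply be hard-coded in the `register_namespace` call as the answer suggests.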
