
I'm trying to use Python 2.7's ElementTree library to parse an XML file, replace specific element attributes with test data, and save the result as a unique XML file.

My idea for a solution was to (1) source new data from a CSV file by reading the file into a string, (2) slice the string at certain delimiter marks, (3) append the pieces to a list, and then (4) use ElementTree to update/delete/replace the attribute with a specific value from the list.

I've looked in the ElementTree documentation and saw the clear() and remove() functions, but I don't know the syntax needed to use them properly.

An example of the XML to modify is below - attributes with XXXXX are to be replaced/updated:

<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="XXXXX">
        <Pty ID="XXXXX" R="1"/>
    </RptSide>
</TrdCaptRpt>

The intended result will be, for example:

<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="12345">
        <Pty ID="ABCDE" R="1"/>
    </RptSide>
</TrdCaptRpt>

How do I use the etree commands to update the base XML with an item from the list?

1 Answer

For this kind of work, I always recommend BeautifulSoup because its API is really easy to learn:

from BeautifulSoup import BeautifulStoneSoup as Soup

xml = """
<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="XXXXX">
        <Pty ID="XXXXX" R="1"/>
    </RptSide>
</TrdCaptRpt>
"""

soup = Soup(xml)
rpt_side = soup.trdcaptrpt.rptside
rpt_side['txt1'] = 'Updated'
rpt_side.pty['id'] = 'Updated'

print soup

Example output (note that BeautifulStoneSoup lowercases the tag and attribute names):

<trdcaptrpt rptid="10000001" transtyp="0">
<rptside side="1" txt1="Updated">
<pty id="Updated" r="1">
</pty></rptside>
</trdcaptrpt>

Edit: With xml.etree.ElementTree you could use the following script:

from xml.etree import ElementTree as etree

xml = """
<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="XXXXX">
        <Pty ID="XXXXX" R="1"/>
    </RptSide>
</TrdCaptRpt>
"""

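root = etree.fromstring(xml)
# find() returns the first direct child with the given tag
rpt_side = root.find('RptSide')
# set() adds the attribute or overwrites its current value
rpt_side.set('Txt1', 'Updated')
pty = rpt_side.find('Pty')
pty.set('ID', 'Updated')
print etree.tostring(root)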

Example output:

<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="Updated">
        <Pty ID="Updated" R="1" />
    </RptSide>
</TrdCaptRpt>
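
To apply the same idea to an XML file on disk, with the new values taken from a list as the question describes, something like the following may work. It is only a minimal sketch: the helper name update_attributes, the file names base.xml and test_case_1.xml, and the order of the values in the list are assumptions made for illustration, and the demo writes the sample XML to a temporary directory so that it can run on its own.

```python
import os
import tempfile
from xml.etree import ElementTree as etree

SAMPLE = """<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="XXXXX">
        <Pty ID="XXXXX" R="1"/>
    </RptSide>
</TrdCaptRpt>
"""


def update_attributes(src_path, dst_path, values):
    """Hypothetical helper: read src_path, overwrite Txt1 and ID, write dst_path.

    Assumes values[0] is the new Txt1 and values[1] is the new Pty ID.
    """
    tree = etree.parse(src_path)      # parse from a file instead of a string
    root = tree.getroot()
    rpt_side = root.find('RptSide')
    rpt_side.set('Txt1', values[0])   # set() overwrites the existing attribute
    rpt_side.find('Pty').set('ID', values[1])
    tree.write(dst_path)              # save the modified tree as a new file


# Demo: create the base file in a temporary directory, then update it.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'base.xml')         # assumed input file name
dst = os.path.join(workdir, 'test_case_1.xml')  # assumed output file name
with open(src, 'w') as handle:
    handle.write(SAMPLE)

new_values = ['12345', 'ABCDE']  # e.g. values previously read from a CSV row
update_attributes(src, dst, new_values)

# Re-read the written file to check the result.
result = etree.parse(dst).getroot()
print(result.find('RptSide').get('Txt1'))
print(result.find('RptSide/Pty').get('ID'))
```

Because set() replaces the value in place, clear() and remove() are not needed for this task: clear() resets an element (dropping all of its attributes and children), and remove() detaches a child element from its parent.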

3 Comments

Thanks very much. Once I finally got BS to install properly, your recommendation worked. I am interested in other ways: is there a method using the standard etree commands?
@NickH I've updated my answer with an example using ElementTree
With a minor modification, this solution has worked perfectly for my needs: I amended the set commands to use indexes from a defined list and replaced fromstring with an etree.parse call pointing to my XML file. Many thanks for your advice!
