I have the following XML file that I would like to manipulate:

<html>
  <A>
    <B>
      <C>
        <D>
          <TYPE>
            <NUMBER>7297</NUMBER>
            <DATA />
          </TYPE>
          <TYPE>
            <NUMBER>7721</NUMBER>
            <DATA>A=1,B=2,C=3,</DATA>
          </TYPE>
        </D>
      </C>
    </B>
  </A>
</html>

I want to change the text inside the <DATA> element that is a sibling of <NUMBER>7721</NUMBER>, i.e. the one in the same <TYPE> block. How do I do that? If I use find() or findtext(), I only ever get the first match.
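To illustrate the problem described above, here is a minimal sketch using the standard library's xml.etree.ElementTree (an assumption; the question does not say which parser is in use). The document is trimmed to the <D> subtree, and the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the document from the question (only the <D> subtree).
xml_text = """<D>
  <TYPE><NUMBER>7297</NUMBER><DATA /></TYPE>
  <TYPE><NUMBER>7721</NUMBER><DATA>A=1,B=2,C=3,</DATA></TYPE>
</D>"""

root = ET.fromstring(xml_text)

# find() stops at the first element that matches the path.
first_number = root.find("TYPE/NUMBER").text

# findall() returns every match, in document order.
all_numbers = [n.text for n in root.findall("TYPE/NUMBER")]

print(first_number)
print(all_numbers)
```

So find() alone cannot reach the second <TYPE>; the match has to be narrowed by the value of <NUMBER>.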

  • "I want to change the text inside the that lies under the 7721." What does that mean? You seem to have left out a word. Commented Jul 26, 2012 at 19:53

1 Answer

XPath is great for this kind of thing. The expression //TYPE[NUMBER='7721' and DATA] selects every TYPE element that has at least one NUMBER child whose text is '7721' and at least one DATA child:

from lxml import etree

xmlstr = """<html>
  <A>
    <B>
      <C>
        <D>
          <TYPE>
            <NUMBER>7297</NUMBER>
            <DATA />
          </TYPE>
          <TYPE>
            <NUMBER>7721</NUMBER>
            <DATA>A=1,B=2,C=3,</DATA>
          </TYPE>
        </D>
      </C>
    </B>
  </A>
</html>"""

html_element = etree.fromstring(xmlstr)

# find all the TYPE nodes that have NUMBER=7721 and DATA nodes
type_nodes = html_element.xpath("//TYPE[NUMBER='7721' and DATA]")

# the for loop is probably superfluous, but who knows, there might be more than one!
for t in type_nodes:
    d = t.find('DATA')
    # example: append spamandeggs to the end of the data text
    if d.text is None:
        d.text = 'spamandeggs'
    else:
        d.text += 'spamandeggs'
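# encoding="unicode" returns a str rather than bytes, so this prints cleanly on Python 3
print(etree.tostring(html_element, encoding="unicode"))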

Outputs:

<html>
  <A>
    <B>
      <C>
        <D>
          <TYPE>
            <NUMBER>7297</NUMBER>
            <DATA/>
          </TYPE>
          <TYPE>
            <NUMBER>7721</NUMBER>
            <DATA>A=1,B=2,C=3,spamandeggs</DATA>
          </TYPE>
        </D>
      </C>
    </B>
  </A>
</html>
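
If lxml is not available, roughly the same thing can be done with the standard library, which supports a limited subset of XPath. This is a sketch of an alternative, not part of the original answer: it uses xml.etree.ElementTree rather than lxml, selects the DATA element directly instead of going through its TYPE parent, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

xml_text = """<html>
  <A>
    <B>
      <C>
        <D>
          <TYPE>
            <NUMBER>7297</NUMBER>
            <DATA />
          </TYPE>
          <TYPE>
            <NUMBER>7721</NUMBER>
            <DATA>A=1,B=2,C=3,</DATA>
          </TYPE>
        </D>
      </C>
    </B>
  </A>
</html>"""

root = ET.fromstring(xml_text)

# [NUMBER='7721'] keeps only TYPE elements with a NUMBER child whose text is 7721;
# the trailing /DATA then steps down to that TYPE's DATA children.
for data in root.findall(".//TYPE[NUMBER='7721']/DATA"):
    # text is None for an empty element, so fall back to an empty string
    data.text = (data.text or "") + "spamandeggs"

result = ET.tostring(root, encoding="unicode")
print(result)
```

ElementTree's path language has no `and` operator inside a predicate, which is why the DATA test is expressed as a path step here instead.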

2 Comments

Thanks! But I only want to change the second DATA element, <DATA>A=1,B=2,C=3,</DATA>, so I'm thinking of pinpointing it via <NUMBER>7721</NUMBER> rather than by whether <DATA> is empty.
@user1546610 Then you just have to change the XPath! I've edited my answer.
