I have been modifying an .xml file using Python with the lxml library and ElementTree. This is the current result:

<component xmlns:xsi="http://www.w3.orgr">
  <memoryMaps>
    <memoryMap>
      <name>name</name>
      <description>description</description>
      <peripheral>
        <name>periph</name>
        <description>description</description>
        <baseAddress>0x0</baseAddress>
        <range>0x8</range>
        <width>32</width>
        <registers>
          <register>
            <name>reg1</name>
            <displayName>1</displayName>
            ....
          </register>                           
          <register>
            <name>reg2</name>
            <displayName>1</displayName>
              .................
           </register>
           <register>
            <name>reg3</name>
            <displayName>1</displayName>
             ..................
           </register>
       </registers>      
      </peripheral>
    </memoryMap>
  </memoryMaps>
</component>

What I want now is for every 'register' to have 'name' and 'displayName' with the same text (by copying the text of name into displayName), like this:

<registers>
      <register>
        <name>reg1</name>
        <displayName>reg1</displayName>
        ....
      </register>                           
      <register>
        <name>reg2</name>
        <displayName>reg2</displayName>
          .................
       </register>
       <register>
        <name>reg3</name>
        <displayName>reg3</displayName>
         ..................
       </register>
   </registers>   

After parsing my file, I tried code like this:

for register in root.findall('.//register'):
    tempo = register.find('.//name').text
    for EL in root.iter('displayName'):
        EL.text = tempo

This seems to set the display name correctly only in the last register; the other registers end up with the wrong display name. Is there a problem with my loop?

Please advise. Thank you!

2 Answers

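from lxml import etree

root = etree.parse(r'<your file.xml>')

# Select every <name> that has a <displayName> sibling after it
for name in root.xpath('//name[./following-sibling::displayName]'):
    # getnext() returns the immediately following sibling,
    # which is <displayName> in this document
    name.getnext().text = name.text

print( etree.tostring(root, pretty_print=True).decode('utf-8') )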

Prints:

<component xmlns:xsi="http://www.w3.orgr">
  <memoryMaps>
    <memoryMap>
      <name>name</name>
      <description>description</description>
      <peripheral>
        <name>periph</name>
        <description>description</description>
        <baseAddress>0x0</baseAddress>
        <range>0x8</range>
        <width>32</width>
        <registers>
          <register>
            <name>reg1</name>
            <displayName>reg1</displayName>
            ....
          </register>                           
          <register>
            <name>reg2</name>
            <displayName>reg2</displayName>
              .................
           </register>
           <register>
            <name>reg3</name>
            <displayName>reg3</displayName>
             ..................
           </register>
       </registers>      
      </peripheral>
    </memoryMap>
  </memoryMaps>
</component>
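
For reference, the loop in the question fails because the inner `root.iter('displayName')` walks every displayName in the whole document on each pass, so all of them end up holding the last register's name. A minimal sketch of a fix that keeps the question's loop structure, assuming each register has at most one direct name and one direct displayName child. The sample data is shortened for illustration, and the standard-library ElementTree is used here; the same calls should also work on an lxml tree:

```python
import xml.etree.ElementTree as ET

# Shortened sample data (illustrative only; the real file has more fields).
xml = """<registers>
  <register><name>reg1</name><displayName>1</displayName></register>
  <register><name>reg2</name><displayName>1</displayName></register>
  <register><name>reg3</name><displayName>1</displayName></register>
</registers>"""

root = ET.fromstring(xml)

for register in root.findall('.//register'):
    # Search relative to the current register, not from the document root
    name = register.find('name')
    display = register.find('displayName')
    if name is not None and display is not None:
        display.text = name.text

result = [d.text for d in root.iter('displayName')]
print(result)  # ['reg1', 'reg2', 'reg3']
```

The key change is that both lookups start from `register`, so each displayName is paired only with the name in the same register.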

2 Comments

Could you give me more references or a link about this method, for example what the opposite of 'following-sibling' is, to reference the preceding element? This seems interesting!
@Imen It's XPath. There are many tutorials online, but I find this one useful (because I already know CSS): devhints.io/xpath
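
On the question in the comment above: the XPath counterpart of `following-sibling` is the `preceding-sibling` axis, and lxml's counterpart of `getnext()` is `getprevious()`. A small sketch, assuming lxml is installed and using made-up sample data in which one register has a description between name and displayName:

```python
from lxml import etree

# Illustrative sample data; the first register has an extra element in between.
xml = b"""<registers>
  <register><name>reg1</name><description>d</description><displayName>1</displayName></register>
  <register><name>reg2</name><displayName>1</displayName></register>
</registers>"""

root = etree.fromstring(xml)

# Select every displayName that has a name sibling somewhere before it
for display in root.xpath('//displayName[preceding-sibling::name]'):
    # [1] on a reverse axis picks the nearest preceding name sibling
    name = display.xpath('preceding-sibling::name[1]')[0]
    display.text = name.text

result = [d.text for d in root.iter('displayName')]
print(result)
```

Selecting the sibling through the axis rather than through `getprevious()` or `getnext()` does not depend on the two elements being directly adjacent.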

I recommend a simple library, simplified_scrapy.

from simplified_scrapy import SimplifiedDoc, utils
# xml = utils.getFileContent('your xml path')
xml = '''
        <registers>
          <register>
            <name>reg1</name>
            <displayName>1</displayName>
            ....
          </register>                           
          <register>
            <name>reg2</name>
            <displayName>1</displayName>
              .................
           </register>
           <register>
            <name>reg3</name>
            <displayName>1</displayName>
             ..................
           </register>
       </registers>
'''
doc = SimplifiedDoc(xml)  # create doc
registers = doc.selects('register')

for r in registers:
    r.displayName.setContent(r.name.html)

# Or, starting from each name and writing into its next sibling
names = doc.selects('register>name')

for n in names:
    n.next.setContent(n.html)

    # Or
    # n.getNext('displayName').setContent(n.html)

print(doc.html)

Result:

    <registers>
      <register>
        <name>reg1</name>
        <displayName>reg1</displayName>
        ....
      </register>                           
      <register>
        <name>reg2</name>
        <displayName>reg2</displayName>
          .................
       </register>
       <register>
        <name>reg3</name>
        <displayName>reg3</displayName>
         ..................
       </register>
   </registers>

Here are more examples. This lib is easy to use.
