I'm using Python to parse an XML file, but I have a problem. I'm getting the values as a dictionary, but when two or more entries share the same value, they are not repeated. I'm sure there is a way to solve this, but I'm new to Python and to parsing XML.

Here is an example of XML:

<Root>
<Child1>
</Child1>
<Child2>
    <Data DId = "1">
        <Group ID = "">
            <Sport Name="Cricket" Team="6" />
            <Sport Name="Football" Team="6" />
            <Sport Name="Hockey" Team="5" />
        </Group>
    </Data>
    <Data DId = "2">
        <Group ID = "">
            <Sport Name="Rugby" Team="6" />
            <Sport Name="Baseball" Team="10" />
            <Sport Name="Swimming" Team="6" />
        </Group>
    </Data>
</Child2>
</Root>

I want to get the attribute values of each Sport tag, grouped by Data. The code I have tried is:

import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
dict1 = {}
for i in root.iter('Sport'):
    dict1[i.attrib['Name']] = [j.text for j in i]
    dict1[i.attrib['Team']] = [k.text for k in i]

print(dict1)

But I am not able to get the Team value for each sport.

1 Answer

Try the simplified_scrapy library.

from simplified_scrapy import SimplifiedDoc, utils
xml = '''
<Root>
<Child1>
</Child1>
<Child2>
    <Data DId = "1">
        <Group ID = "">
            <Sport Name="Cricket" Team="6" />
            <Sport Name="Football" Team="6" />
            <Sport Name="Hockey" Team="5" />
        </Group>
    </Data>
    <Data DId = "2">
        <Group ID = "">
            <Sport Name="Rugby" Team="6" />
            <Sport Name="Baseball" Team="10" />
            <Sport Name="Swimming" Team="6" />
        </Group>
    </Data>
</Child2>
</Root>
'''
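# Alternatively, load the XML from a file:
# xml = utils.getFileContent('test.xml')
dict1 = {}
doc = SimplifiedDoc(xml)
datas = doc.selects('Data')
for i in datas:
    # Map each sport's Name to its Team within this Data element
    dic = {}
    for j in i.selects('Sport'):
        dic[j['Name']] = j['Team']
    # Key the per-Data mapping by the DId attribute
    dict1[i['DId']] = dic
print(dict1)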

Result:

{'1': {'Cricket': '6', 'Football': '6', 'Hockey': '5'}, '2': {'Rugby': '6', 'Baseball': '10', 'Swimming': '6'}}
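
For comparison, here is a minimal sketch of the same grouping using only the standard library's xml.etree.ElementTree, which is what the question started with. In the question's code, each Sport element has no child elements, so iterating over it yields nothing, and keying the dictionary by Team makes sports with the same team count overwrite each other. Reading the values from the attributes and nesting them under each Data element's DId avoids both issues. The function name sports_by_data is only illustrative, and the XML is inlined here so the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

XML = """<Root>
<Child1>
</Child1>
<Child2>
    <Data DId = "1">
        <Group ID = "">
            <Sport Name="Cricket" Team="6" />
            <Sport Name="Football" Team="6" />
            <Sport Name="Hockey" Team="5" />
        </Group>
    </Data>
    <Data DId = "2">
        <Group ID = "">
            <Sport Name="Rugby" Team="6" />
            <Sport Name="Baseball" Team="10" />
            <Sport Name="Swimming" Team="6" />
        </Group>
    </Data>
</Child2>
</Root>"""


def sports_by_data(xml_text):
    """Return {DId: {sport Name: Team}} for every Data element."""
    root = ET.fromstring(xml_text)
    result = {}
    for data in root.iter('Data'):
        # Name and Team are attributes of Sport, so read them with .get()
        result[data.get('DId')] = {
            sport.get('Name'): sport.get('Team')
            for sport in data.iter('Sport')
        }
    return result


result = sports_by_data(XML)
print(result)
```

To read from a file instead, ET.parse('test.xml').getroot() can replace ET.fromstring(xml_text).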

3 Comments

Thanks, @the_train. Is there a way to get these values separated by the "DId" value? For example, I want to get the count of tags present inside each "Group" tag.
@ArenMayank I changed the answer. Do you think it's what you want?
Perfect, Thanks for the suggestion @the_train.
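
As a rough sketch of the count asked about in the comments, assuming the goal is the number of Sport elements inside the Group of each Data element, the standard library's ElementTree can do this with findall and a relative path. The variable name counts is only illustrative, and the XML is inlined so the snippet is self-contained.

```python
import xml.etree.ElementTree as ET

XML = """<Root>
<Child1>
</Child1>
<Child2>
    <Data DId = "1">
        <Group ID = "">
            <Sport Name="Cricket" Team="6" />
            <Sport Name="Football" Team="6" />
            <Sport Name="Hockey" Team="5" />
        </Group>
    </Data>
    <Data DId = "2">
        <Group ID = "">
            <Sport Name="Rugby" Team="6" />
            <Sport Name="Baseball" Team="10" />
            <Sport Name="Swimming" Team="6" />
        </Group>
    </Data>
</Child2>
</Root>"""

root = ET.fromstring(XML)

# For each Data element, count the Sport elements directly under its Group
counts = {
    data.get('DId'): len(data.findall('./Group/Sport'))
    for data in root.iter('Data')
}
print(counts)
```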
