I am parsing an XML file, which works, and I want to store the results in an array for later use. The file is opened, read and parsed, and I pick out 3 values from each entry (channel, start and title). As shown in the code below, start holds a date and time, which I split into its parts and store in date. As the code loops through each XML entry, I would like to store the channel, start and title in a multidimensional array. I have done this in BrightScript but can't understand the array or list structure of Python. Once all entries are in the array or list, I will need to go through it and pull out all titles and dates that share the same date. Can somebody guide me through this?

from datetime import datetime
from xml.dom import minidom

# xmldoc is assumed to hold the path to the XML file before this call
xmldoc = minidom.parse(xmldoc)

def getNodeText(node):
    # Concatenate the text nodes directly under the given node
    nodelist = node.childNodes
    result = []
    for node in nodelist:
        if node.nodeType == node.TEXT_NODE:
            result.append(node.data)
    return ''.join(result)

title = xmldoc.getElementsByTagName("title")[0]
#print("Node Name : %s" % title.nodeName)
#print("Node Value : %s \n" % getNodeText(title))
programmes = xmldoc.getElementsByTagName("programme")

for programme in programmes:
    cid = programme.getAttribute("channel")
    starts = programme.getAttribute("start")
    # start is assumed to look like YYYYMMDDhhmmss followed by an offset;
    # each two-digit field needs a two-character slice
    cutdate = starts[0:14]
    year = int(cutdate[0:4])
    month = int(cutdate[4:6])
    day = int(cutdate[6:8])
    hour = int(cutdate[8:10])
    minute = int(cutdate[10:12])
    sec = int(cutdate[12:14])
    date = datetime(year, month, day, hour, minute, sec)
    title = programme.getElementsByTagName("title")[0]
    print("id:%s, title:%s, starts:%s" %
          (cid, getNodeText(title), starts))
    print(date)
  • Can you share the input XML file with us? Commented Feb 14, 2015 at 4:28

1 Answer

Python normally refers to arrays as lists, and it looks like what you want is a list of lists (there's an array module and the whole numpy extension with its own arrays, but it doesn't look like you want those :-).
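If the list-of-lists idea is unfamiliar, here is a minimal sketch of how one behaves. The channel ids, titles and dates are made-up sample values, not taken from your file, and the names rows, first_title and so on are just for illustration:

```python
from datetime import datetime

# Each inner list is one row: [channel, title, start]. Sample values only.
rows = [
    ["ch1", "News", datetime(2015, 2, 14, 4, 30, 0)],
    ["ch2", "Movie", datetime(2015, 2, 14, 6, 0, 0)],
]

# Rows are appended one at a time, much like pushing onto an array.
rows.append(["ch1", "Weather", datetime(2015, 2, 15, 5, 0, 0)])

# rows[i] is a whole row; rows[i][j] is one field of that row.
first_title = rows[0][1]
last_channel = rows[-1][0]

# A list comprehension pulls one column out of every row.
titles = [row[1] for row in rows]

print(first_title, last_channel, titles)
```

The inner lists do not need a declared size or type, so a row can mix strings and datetime objects freely.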

So start the desired list as empty:

results = []

and where you now just print things, append them to the list:

results.append([cid, getNodeText(title), date])

(or whatever -- your indentation is so rambling it would cause tons of syntax errors in Python and confuses me about what exactly you want:-).
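Putting the loop and the append together might look roughly like the sketch below. It is only a guess at your setup: SAMPLE_XML is hypothetical data in an XMLTV-like layout, collect_programmes and get_node_text are names I made up, and I am assuming the first 14 characters of start are YYYYMMDDhhmmss (the timezone offset is simply ignored here).

```python
from datetime import datetime
from xml.dom import minidom

# Hypothetical sample data in an XMLTV-like layout; your real file may differ.
SAMPLE_XML = """<tv>
  <programme channel="ch1" start="20150214043000 +0000"><title>News</title></programme>
  <programme channel="ch2" start="20150214060000 +0000"><title>Movie</title></programme>
  <programme channel="ch1" start="20150215050000 +0000"><title>Weather</title></programme>
</tv>"""


def get_node_text(node):
    """Join the text nodes directly under `node`."""
    return "".join(
        child.data for child in node.childNodes
        if child.nodeType == child.TEXT_NODE
    )


def collect_programmes(xml_text):
    """Return a list of [channel, title, start] lists, one per <programme>."""
    doc = minidom.parseString(xml_text)
    results = []
    for programme in doc.getElementsByTagName("programme"):
        cid = programme.getAttribute("channel")
        # Assumes the first 14 characters are YYYYMMDDhhmmss; offset ignored.
        raw_start = programme.getAttribute("start")[:14]
        start = datetime.strptime(raw_start, "%Y%m%d%H%M%S")
        title = get_node_text(programme.getElementsByTagName("title")[0])
        results.append([cid, title, start])
    return results


results = collect_programmes(SAMPLE_XML)
for row in results:
    print(row)
```

Using datetime.strptime replaces the manual slicing of year, month, day and so on with a single format string, which is easier to get right.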

Now for the part

I will need to parse that array pulling out all titles and dates with the same date

just sort the results by date:

import operator

results.sort(key=operator.itemgetter(2))

then group by that:

import itertools

for date, items in itertools.groupby(results, operator.itemgetter(2)):
    print(date,[it[1] for it in items])

or whatever else you want to do with this grouping.
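One thing to watch: if the third item of each row is a full datetime, grouping on it puts together only rows with the exact same start time. If "same date" means the same calendar day, a key based on .date() may be closer to what you want. A sketch under that assumption, with made-up sample rows in the [channel, title, start] layout used above (day_of and titles_by_day are illustrative names):

```python
import itertools
from datetime import datetime

# Sample rows in the [channel, title, start] layout; values are made up.
results = [
    ["ch1", "Weather", datetime(2015, 2, 15, 5, 0, 0)],
    ["ch1", "News", datetime(2015, 2, 14, 4, 30, 0)],
    ["ch2", "Movie", datetime(2015, 2, 14, 6, 0, 0)],
]


def day_of(row):
    """Key function: the calendar date of a row, ignoring the time of day."""
    return row[2].date()


# groupby only merges adjacent rows, so sort first. Sorting on the full
# datetime also keeps each day's rows in start order.
results.sort(key=lambda row: row[2])

# Build the title lists inside the comprehension, because each group that
# groupby yields is an iterator that can only be consumed once.
titles_by_day = {
    day: [row[1] for row in rows]
    for day, rows in itertools.groupby(results, key=day_of)
}

for day, titles in titles_by_day.items():
    print(day, titles)
```

Collecting the groups into a dict keyed by date makes it easy to look up one day's titles later without sorting and grouping again.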

You could improve this style in many ways but this does appear to give you the key functionality you're asking for.
