3

I have an XML file which looks like this:

XML File

I want to fetch following information from this file for all the events:

Under category event:

  • start_date
  • end_date
  • title

Under category venue:

  • address
  • address_2/
  • city
  • latitude
  • longitude
  • name
  • postal_code

and then store this information in a mongodb database. I don't have much experience in parsing. Can someone please help me with this! Thanks !

2 Answers 2

5

Here's an example that parses an xml from the url using lxml and inserts the data into mongodb using pymongo:

from urllib2 import urlopen
import pymongo
from lxml import etree


# parse xml file
root = etree.parse(urlopen('https://www.eventbrite.com/xml/event_search?app_key=USO53E2ZHT6LM4D5RA&country=DE&max=100&date=Future&page=1'))

events = []
for event in root.xpath('.//event'):
    event = {'start_date': event.find('start_date').text,
             'end_date': event.find('end_date').text,
             'title': event.find('title').text}
    events.append(event)


# insert the date into MongoDB
db = pymongo.MongoClient()
collection = db.test

collection.insert(events)

Parsing venue items is left for you as a "homework".

Note that there are other xml parsers out there, like:

Hope that helps.

Sign up to request clarification or add additional context in comments.

Comments

4
from pymongo import MongoClient
import xml.etree.ElementTree as ET
from urllib2 import urlopen

cl = MongoClient()
coll = cl["dbname"]["collectionname"]

tree = ET.parse("https://www.eventbrite.com/xml/event_search?app_key=USO53E2ZHT6LM4D5RA&country=DE&max=100&date=Future&page=1")
root = tree.getroot()

for event in root.findall("./event"):
    doc = {}
    for c in event.getchildren():
        if c.tag in ("start_date", "end_date", "title"):
            doc[c.tag] = c.text
        elif c.tag == "venue":
            doc[c.tag] = {}
            for v in c.getchildren():
                if v.tag in ("address", "address_2", "city", "latitude", "longitude", "name", "postal_code"):
                    doc[c.tag][v.tag] = v.text

    coll.insert(doc)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.