I am trying to parse an XML file retrieved from OCTranspo (Ottawa City Bus Company) using Python. My problem is that I can't seem to access the sub-fields, such as Latitude and Longitude.
Here is a heavily shortened version of a sample XML file that still reproduces the problem:
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <soap:Body>
    <Route xmlns="http://tempuri.org/">
      <Trips>
        <Trip><TripDestination>Barrhaven Centre</TripDestination>
          <TripStartTime>19:32</TripStartTime><Latitude>45.285458</Latitude>
          <Longitude>-75.746786</Longitude></Trip>
      </Trips>
    </Route>
  </soap:Body>
</soap:Envelope>
And here's my code:
import xml.etree.ElementTree as ET
import urllib

# Python 2: a second argument to urlopen() is sent as POST data.
u = urllib.urlopen('https://api.octranspo1.com/v1.1/GetNextTripsForStop', 'appID=7a51d100&apiKey=5c5a8438efc643286006d82071852789&routeNo=95&stopNo=3044')
data = u.read()

# Save the response to disk, then parse the saved file.
f = open('route3044.xml', 'wb')
f.write(data)
f.close()

doc = ET.parse('route3044.xml')
for bus in doc.findall('Trip'):
    lat = bus.findtext('Latitude')
    #NEVER EXECUTES
    print lat
If I run the same code against a very simple XML file (one without the soap:Envelope...), it works flawlessly. However, the XML I need is generated by OCTranspo, so I can't control its format.
I'm not sure if the issue is a 'namespace' issue or a bug in Python.
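One way to check whether namespaces are involved is to print the tag of every element that ElementTree sees. The following is only a minimal sketch: it assumes the shortened sample above (with the XML declaration and unused namespace declarations left out for brevity) and inlines it as a string instead of reading it from a file.

```python
import xml.etree.ElementTree as ET

# Shortened sample from the post, inlined as a string for illustration.
SAMPLE = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    '<soap:Body><Route xmlns="http://tempuri.org/"><Trips><Trip>'
    '<TripDestination>Barrhaven Centre</TripDestination>'
    '<TripStartTime>19:32</TripStartTime>'
    '<Latitude>45.285458</Latitude>'
    '<Longitude>-75.746786</Longitude>'
    '</Trip></Trips></Route></soap:Body></soap:Envelope>'
)

root = ET.fromstring(SAMPLE)

# ElementTree stores each tag as '{namespace-uri}localname'.
tags = [elem.tag for elem in root.iter()]
for tag in tags:
    print(tag)

# A bare name does not match a namespaced element, even with './/'.
bare_matches = root.findall('.//Trip')
# The fully qualified name does match.
qualified_matches = root.findall('.//{http://tempuri.org/}Trip')
```

If the printed tags carry a `{http://tempuri.org/}` prefix, a search for plain `'Trip'` cannot match them, which would point to a namespace issue and not a Python bug. Note also that `findall('Trip')` without a leading `.//` only looks at direct children of the root element.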
Any assistance would be appreciated.
UPDATE: 21-Sept-2013
I changed the code that searches for the Lat and Lon to this:
doc = ET.parse('Stop1A.xml')
for a in doc.findall('{http://schemas.xmlsoap.org/soap/envelope/}Body'):
    for b in a.findall('{http://octranspo.com}GetNextTripsForStopResponse'):
        for c in b.findall('{http://octranspo.com}GetNextTripsForStopResult'):
            for d in c.findall('{http://tempuri.org/}Route'):
                for e in d.findall('{http://tempuri.org/}RouteDirection'):
                    direction = e.findtext('{http://tempuri.org/}Direction')
                    if direction == 'Eastbound':
                        for f in e.findall('{http://tempuri.org/}Trips'):
                            for g in f.findall('{http://tempuri.org/}Trip'):
                                lat = g.findtext('{http://tempuri.org/}Latitude')
                                lon = g.findtext('{http://tempuri.org/}Longitude')
                                print lat + ',' + lon
print 'Done'
The end result is that I can now see the 'Eastbound' buses on route 95. I know this code is not pretty, but it works. My next goal is to simplify it, perhaps with some namespace tricks.
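One possible namespace trick is to pass a prefix-to-URI mapping to `findall()` and `findtext()`, and to use `.//` to skip the intermediate wrapper elements. The sketch below is an illustration and not a tested replacement for the loop above: the function name `eastbound_positions`, the prefix `t`, and the inlined sample document (including the made-up Westbound coordinates) are assumptions. The `namespaces` argument is documented for Python 3; it is also accepted by Python 2.7's ElementTree.

```python
import xml.etree.ElementTree as ET

# The prefix 't' is local to this script; only the URI must match the document.
NS = {'t': 'http://tempuri.org/'}

def eastbound_positions(root, wanted='Eastbound'):
    """Return (lat, lon) string pairs for trips in the wanted direction."""
    positions = []
    # './/' searches all descendants, so the SOAP wrappers need no loops.
    for route_dir in root.findall('.//t:RouteDirection', NS):
        if route_dir.findtext('t:Direction', namespaces=NS) != wanted:
            continue
        for trip in route_dir.findall('t:Trips/t:Trip', NS):
            lat = trip.findtext('t:Latitude', namespaces=NS)
            lon = trip.findtext('t:Longitude', namespaces=NS)
            positions.append((lat, lon))
    return positions

# Hypothetical sample using the element names from the loop above.
# The Westbound coordinates are placeholders, not real data.
SAMPLE = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    '<soap:Body><Route xmlns="http://tempuri.org/">'
    '<RouteDirection><Direction>Eastbound</Direction><Trips><Trip>'
    '<Latitude>45.285458</Latitude><Longitude>-75.746786</Longitude>'
    '</Trip></Trips></RouteDirection>'
    '<RouteDirection><Direction>Westbound</Direction><Trips><Trip>'
    '<Latitude>45.0</Latitude><Longitude>-75.0</Longitude>'
    '</Trip></Trips></RouteDirection>'
    '</Route></soap:Body></soap:Envelope>'
)

positions = eastbound_positions(ET.fromstring(SAMPLE))
for lat, lon in positions:
    print(lat + ',' + lon)
```

With a real response saved to disk, the same function could be called as `eastbound_positions(ET.parse('Stop1A.xml').getroot())`.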
If anyone cares to try accessing the URL, note that it's common to see 'no buses' for 5-7 minutes, as the URL simply returns the six buses closest to the stop: three going Eastbound and three going Westbound. If the closest bus is more than 7 minutes away, the return is null. The code returns the latitude and longitude of each bus, which I can then plot using Google Maps.
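Because the coordinates can be missing, concatenating them directly can fail. The sketch below assumes that a 'null' return shows up as an absent or empty Latitude/Longitude element, which is not confirmed here; `format_position` is a hypothetical helper name. It relies on `findtext()` returning `None` for a missing element and an empty string for an empty one.

```python
import xml.etree.ElementTree as ET

def format_position(lat, lon):
    """Return 'lat,lon', or None when either value is missing or empty."""
    if not lat or not lon:
        return None
    return lat.strip() + ',' + lon.strip()

# Illustration with a trip that has an empty Latitude and no Longitude.
trip = ET.fromstring('<Trip><Latitude/></Trip>')
empty_lat = trip.findtext('Latitude')     # element exists but has no text
missing_lon = trip.findtext('Longitude')  # element does not exist

position = format_position(empty_lat, missing_lon)
if position is None:
    print('no position reported')
else:
    print(position)
```

Skipping such trips avoids a `TypeError` from `lat + ',' + lon` when one of the values is `None`.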
Kelly