0

I want to get some data from https://kartkatalog.geonorge.no/api/search?limit=10000&text=&facets[0]name=type&facets[0]value=software&mediatype=xml

What I need is the "title" and "GetCapabilitiesUrl" for every record. I have tried playing around with BeautifulSoup, but I can't find the right way to get the data I want.

Does someone know how to proceed with this?

Thanks.

4
  • 2
    Welcome to SO - Please take the tour and read How to Ask to improve, edit and format your questions. Thanks (Showing code you have written would be great) Commented Jan 12, 2021 at 19:36
  • Maybe I’m missing something but your URL seems to provide JSON, not XML. Commented Jan 12, 2021 at 19:57
  • BS4 is overkill for this. It looks like JSON already - could probably get away with requests and the standard library json module Commented Jan 12, 2021 at 20:08
  • Yeah. You are right. Got what I wanted by this: ` import requests import json url = "kartkatalog.geonorge.no/api/…" json_data = requests.get(url).json() antall = json_data["NumFound"] for i in range(antall): tittel = json_data["Results"][i]["Title"] print(tittel)` Commented Jan 12, 2021 at 20:11

1 Answer 1

1

That link you posted looks like a JSON file, not an XML file. You can see the difference here. You can use the json module in python to parse this data.

Once you get a string with the data from the website, you can use json.loads() to convert a string containing a JSON object into a python object.

The following code snippet will put all titles in a variable called titles and a urls in urls

import json
import urllib.request
import ssl

ssl._create_default_https_context = ssl._create_unverified_context
raw_json_string = urllib.request.urlopen("https://kartkatalog.geonorge.no/api/search?limit=10000&text=&facets%5B0%5Dname=type&facets%5B0%5Dvalue=software&mediatype=xml").read()
json_object = json.loads(raw_json_string)

titles = []
urls = []

for record in json_object["Results"]:
    titles.append(record["Title"])
    try:
        urls.append(record["GetCapabilitiesUrl"])
    except:
        pass

When writing the code, you can use an online JSON viewer to help you figure out the elements of dictionaries and lists.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.