Good morning,
I'm trying to scrape data from this site. For every search result I want to get the Date, Creators, Relevance, Description, Subject, Audience and Access, and store them in my Postgres database. The problem is that the Description is sometimes missing, so a result sometimes has 6 fields and sometimes 7.
So my question is: how can I insert an empty value for Description when it is not there? Any tips on how to do this are welcome!
This is my script so far. It fills the database as long as every result has the same number of fields (I tested with three fields instead of 7, keep that in mind).
import urllib.parse
import urllib.request
import re
import sys
import psycopg2 as dbapi
url = 'https://easy.dans.knaw.nl/ui/'
values = {'wicket:bookmarkablePage': ':nl.knaw.dans.easy.web.search.pages.PublicSearchResultPage',
          'q': 'opgraving'}
data = urllib.parse.urlencode(values)
data = data.encode('utf-8')
headers = {}
headers['User-Agent'] = 'Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.27 Safari/537.17'
req = urllib.request.Request(url, data, headers=headers)
resp = urllib.request.urlopen(req)
respData = resp.read()
saveRecord= open('C:/Users/berend/Desktop/record.txt','w')
record = re.findall(r'<dd>(.*?)</dd>',str(respData))
for item in record:
    saveRecord.write("%s\n" % item)
saveRecord.close()
fin = open("C:/Users/berend/Desktop/record.txt",'r')
fit = open("C:/Users/berend/Desktop/record_schoon.txt",'w')
delete_list = ['</em>', '[',']','<em>','</span>', '<span>', '\\n']
for line in fin:
    for word in delete_list:
        line = line.replace(word, "")
    fit.write(line)
fin.close()
fit.close()
open_record= open('C:/Users/berend/Desktop/record_schoon.txt','r')
content = list(open_record)
print(len(content))
open_record.close()
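n = 3  # number of fields per result (3 for this test, 7 on the real data)
# open one connection for all inserts instead of reconnecting per row
con = dbapi.connect(database='import', user='postgres', password='xxx')
cur = con.cursor()
for i in range(0, len(content), n):
    q = content[i:i+n]
    cur.execute("INSERT into import VALUES (%s,%s,%s)", q)
con.commit()
con.close()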
The first 3 results:
2000
Groenewoudt, B.J.; Deeben, J.H.C.; Velde, H.M. van der
100% relevant
Na verkennend onderzoek in 1996 en een grootschalige opgraving met uitgebreid bodemkundig
opgraving
Archaeology
Open (registered users)
2001-09
Peters, F.J.C.; Peeters, J.H.M.
100% relevant
opgraving
Archaeology
Open (registered users)
2008
Jacobs, E.; Burnier, C.Y.
100% relevant
OPGRAVING
Archaeology
Open (registered users)
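
One possible way to pad the missing Description is sketched below. It is only a heuristic based on the sample output above: it assumes every result contains a line of the form "N% relevant" as its third field, and that Description is the only field that can be absent. The names `FIELDS`, `RELEVANCE` and `split_records` are made up for this illustration and are not part of the script above.

```python
import re

# Field order as listed in the question; "description" is the optional one.
FIELDS = ["date", "creators", "relevance", "description",
          "subject", "audience", "access"]
# Assumption: the relevance line always looks like "100% relevant".
RELEVANCE = re.compile(r"^\d+% relevant$")

def split_records(lines):
    """Group a flat list of lines into 7-field tuples, padding a missing description."""
    lines = [line.strip() for line in lines if line.strip()]
    # Each record starts two lines before its relevance line (date, creators, relevance).
    starts = [i - 2 for i, line in enumerate(lines)
              if RELEVANCE.match(line) and i >= 2]
    records = []
    for start, end in zip(starts, starts[1:] + [len(lines)]):
        rec = lines[start:end]
        if len(rec) == len(FIELDS) - 1:
            rec.insert(3, "")  # description missing: insert an empty placeholder
        if len(rec) != len(FIELDS):
            raise ValueError("unexpected record shape: %r" % (rec,))
        records.append(tuple(rec))
    return records

# The three results shown above, as they would be read from the cleaned file.
sample = [
    "2000\n",
    "Groenewoudt, B.J.; Deeben, J.H.C.; Velde, H.M. van der\n",
    "100% relevant\n",
    "Na verkennend onderzoek in 1996 en een grootschalige opgraving met uitgebreid bodemkundig\n",
    "opgraving\n",
    "Archaeology\n",
    "Open (registered users)\n",
    "2001-09\n",
    "Peters, F.J.C.; Peeters, J.H.M.\n",
    "100% relevant\n",
    "opgraving\n",
    "Archaeology\n",
    "Open (registered users)\n",
    "2008\n",
    "Jacobs, E.; Burnier, C.Y.\n",
    "100% relevant\n",
    "OPGRAVING\n",
    "Archaeology\n",
    "Open (registered users)\n",
]

records = split_records(sample)
for rec in records:
    print(rec)
```

Each tuple then has exactly seven entries, so it could be passed to `cur.execute` with seven `%s` placeholders. If the page markup labels each value (for example with a `<dt>` next to each `<dd>`), pairing labels with values while parsing would probably be more robust than this line-counting heuristic.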