I can download files from a server I control in only one way: by passing the document ID into a link like so:
https://website/deployLink/442/document/download/$NUMBER
If I navigate to this in my browser, it downloads the file with ID $NUMBER.
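To make the pattern concrete, here is a minimal sketch of how the link for a given ID could be built. The names `BASE_URL` and `build_download_url` are made up for illustration; only the URL pattern itself comes from my server.

```python
# Hypothetical helper: only the URL pattern is taken from the server.
BASE_URL = "https://website/deployLink/442/document/download/"


def build_download_url(document_id):
    """Return the download link for a single document ID."""
    return BASE_URL + str(document_id)


print(build_download_url(1))
```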
The problem is that I have 9,000 files on the server, which is served over HTTPS (SSL) and normally requires signing in with a username and password through a dialog box that pops up in the browser.
I already posted a similar question, where I downloaded the files with wget. Now I would like to do the same in Python: supply the username and password and connect over SSL.
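One approach I have seen suggested is to send the credentials up front in an `Authorization` header instead of waiting for the server to ask for them. The sketch below assumes the server uses HTTP Basic authentication, which I have not confirmed; if the popup is really Digest or NTLM, this would not work. The function name `make_authenticated_request` is my own, and the import is written so the snippet runs on Python 2 or 3.

```python
import base64

try:
    from urllib2 import Request  # Python 2, as in my script
except ImportError:
    from urllib.request import Request  # Python 3


def make_authenticated_request(url, username, password):
    """Build a request that already carries Basic credentials (assumes Basic auth)."""
    credentials = "{0}:{1}".format(username, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


request = make_authenticated_request(
    "https://website/deployLink/442/document/download/1", "admin", "password")
print(request.get_header("Authorization"))
```

The request object can then be passed to `urlopen` in place of the plain URL string.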
Here is my attempt to grab a single file, which fails with a 401 error. The full stack trace is below.
import urllib2
import ctypes
from HTMLParser import HTMLParser
# create a password manager
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
# Add the username and password.
top_level_url = "https://website.com/home.html"
password_mgr.add_password(None, top_level_url, "admin", "password")
handler = urllib2.HTTPBasicAuthHandler(password_mgr)
# create "opener" (OpenerDirector instance)
opener = urllib2.build_opener(handler)
# Install the opener.
# Now all calls to urllib2.urlopen use our opener.
urllib2.install_opener(opener)
# Grab website
response = urllib2.urlopen('https://website/deployLink/442/document/download/1')
html = response.read()
# Placeholder parser class (not used below)
class MyHTMLParser(HTMLParser):
    pass
url = 'https://website/deployLink/442/document/download/1'
# Save the file
webpage = urllib2.urlopen(url)
with open('Test.doc', 'wb') as localFile:
    localFile.write(webpage.read())
What have I done incorrectly here? Is what I am attempting possible?
C:\Python27\python.exe C:/Users/ADMIN/PycharmProjects/GetFile.py
Traceback (most recent call last):
  File "C:/Users/ADMIN/PycharmProjects/GetFile.py", line 22, in <module>
    response = urllib2.urlopen('https://website/deployLink/442/document/download/1')
  File "C:\Python27\lib\urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 437, in open
    response = meth(req, response)
  File "C:\Python27\lib\urllib2.py", line 550, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python27\lib\urllib2.py", line 475, in error
    return self._call_chain(*args)
  File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 558, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 401: Processed

Process finished with exit code 1
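One thing that may be worth checking, given the 401: as far as I understand the password manager, a registered URL only covers requests to the same host whose path lies under the registered path. My script registers `https://website.com/home.html`, which differs from the download link in both host and path, so the credentials may never be offered. The sketch below only queries the password manager and makes no network calls; `original_mgr` and `root_mgr` are names I made up, and the host names are the same placeholders used in my post.

```python
try:
    from urllib2 import HTTPPasswordMgrWithDefaultRealm  # Python 2
except ImportError:
    from urllib.request import HTTPPasswordMgrWithDefaultRealm  # Python 3

download_url = "https://website/deployLink/442/document/download/1"

# Registration as in my script: a different host and the path /home.html.
original_mgr = HTTPPasswordMgrWithDefaultRealm()
original_mgr.add_password(None, "https://website.com/home.html", "admin", "password")

# Alternative: register the root of the host that serves the downloads.
root_mgr = HTTPPasswordMgrWithDefaultRealm()
root_mgr.add_password(None, "https://website/", "admin", "password")

# Ask each manager which credentials it would offer for the download link.
print(original_mgr.find_user_password(None, download_url))
print(root_mgr.find_user_password(None, download_url))
```

If the first lookup comes back empty and the second returns the credentials, the registration URL is at least part of the problem. Even then, the Basic auth handler only acts when the server replies with a Basic challenge, which I have not verified for my server.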
Here is my authentication page, with some info removed for privacy:

The authentication URL ends in :443.
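For the full set of 9,000 files, the loop I have in mind looks roughly like the sketch below. It assumes a working `fetch` callable that takes a document ID and returns the file's bytes (for example, a small wrapper around an authenticated opener), and it assumes every document can be saved with a `.doc` extension, which may not hold for all of my files. The name `download_documents` is hypothetical.

```python
import os


def download_documents(document_ids, fetch, target_dir):
    """Save each document into target_dir and return the IDs that failed.

    fetch is any callable that takes a document ID and returns the file's bytes.
    """
    failed = []
    for document_id in document_ids:
        try:
            data = fetch(document_id)
        except IOError as error:  # urllib2.HTTPError is a subclass of IOError
            print("Document {0} failed: {1}".format(document_id, error))
            failed.append(document_id)
            continue
        # Assumption: every file is saved as <ID>.doc
        path = os.path.join(target_dir, "{0}.doc".format(document_id))
        with open(path, "wb") as local_file:
            local_file.write(data)
    return failed
```

Collecting the failed IDs rather than stopping at the first error would let me retry only the documents that did not come through.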