0

There's a link on our website that leads to a zip folder. The line in the HTML file for it is shown thus: <p><a href="Data/WillCounty_AddressPoint.zip">Address Points</a> (updated weekly)</p>

The zip folder's name will soon be changed using the current date so that it looks like this: WillCounty_AddressPoint_02212018.zip

How do I change the corresponding line in the HTML?

Using this answer I have a script. It runs with no errors but does not change anything in the HTML file.

import bs4
from bs4 import BeautifulSoup
import re
import time

data = r'\\gisfile\GISstaff\Jared\data.html' #html file location
current_time = time.strftime("_%m%d%Y") #date

#load the file
with open(data) as inf:
    txt = inf.read()
    soup = bs4.BeautifulSoup(txt)

#create new link
new_link = soup.new_tag('link', href="Data/WillCounty_AddressPoint_%m%d%Y.zip")
#insert it into the document
soup.head.append(new_link)

#save the file again
with open (data, "w") as outf:
    outf.write(str(soup))
4
  • What date do you want to put there? A fixed date / the current date of the code running / other? (You say that it's updated "weekly" but you also have current_time = time.strftime("_%m%d%Y") #date). Commented Feb 21, 2018 at 22:28
  • I want to put the current date on it. Because the file in the zip folder will be updated weekly with the corresponding date, the name of the zip will be have to be changed in the HTML too. Commented Feb 21, 2018 at 22:35
  • But that means that time.strftime("_%m%d%Y") gives you an invalid file name 6 days out of 7 for the week? Commented Feb 21, 2018 at 22:37
  • @roganjosh I'm not sure why that would be. It works as it should. Every time it's ran the current date is given to it. Commented Feb 22, 2018 at 15:15

1 Answer 1

1

This is how you could use BeautifulSoup to replace the href attribute.

from bs4 import BeautifulSoup
import time
data = r'data.html' #html file location
#load the file
current_time = time.strftime("_%m%d%Y")
with open(data) as inf:
     txt = inf.read()
soup = BeautifulSoup(txt, 'html.parser')
a = soup.find('a')
a['href'] = ("WillCounty_AddressPoint%s.zip" % current_time)
print (soup)

#save the file again
with open (data, "w") as outf:
    outf.write(str(soup))

Outputs:

<p><a href="WillCounty_AddressPoint_02212018.zip">Address Points</a> (updated weekly)</p>

And writes to the file

UPDATED to use data from supplied file.

from bs4 import BeautifulSoup
import time
data = r'data.html' #html file location
#load the file
current_time = time.strftime("_%m%d%Y")
with open(data) as inf:
     txt = inf.read()
soup = BeautifulSoup(txt, 'html.parser')
# Find the a element you want to change by finding it's text and selecting parent.
a = soup.find(text="Address Points").parent
a['href'] = ("WillCounty_AddressPoint%s.zip" % current_time)
print (soup)
#save the file again
with open (data, "w") as outf:
    outf.write(str(soup))

It will however, take out blank lines but otherwise leave your HTML code exactly as it was.

Using a diff tool to see differences in the original and modified files:

diff data\ \(copy\).html data.html 
77c77
< <p><a href="Data/WillCounty_AddressPoint.zip">Address Points</a> (updated weekly)</p>
---
> <p><a href="WillCounty_AddressPoint_02222018.zip">Address Points</a> (updated weekly)</p>
116,120d115
< 
< 
< 
< 
< 
154d148
< 
Sign up to request clarification or add additional context in comments.

4 Comments

Thanks. It ran with no errors but the output was wrong. It didn't change the line: <p><a href="WillCounty_AddressPoint_02212018.zip">Address Points</a> (updated weekly)</p> (line 87) What it did was add an additional line (to line 25) of the same href attribute. It also altered the whole HTML a bit. Ideally, i need this script to simply change the existing line.
Can you post a link to your whole html file?
Edit: it's in a zip so you can open with a proper viewer. willcogis.org/website2014/gis/Data/data.zip
Updated answer in response to posted HTML file.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.