I have a problem with a span tag that has no id or class.
My overall goal is to extract the text between "ITEM 1. BUSINESS" and "ITEM 1A. RISK FACTORS" from the link below. However, I can't figure out how to locate this part, because the span it sits in has neither an id nor a class I can search for. Only the parent div has an id: div = soup.find("div", {"id": "dynamic-xbrl-form"})
Sadly, this attempt at grabbing the whole page text does not work either: #text = unicodedata.normalize('NFKD', soup.get_text()).replace('\n', '')
Here is my approach:
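import requests
from bs4 import BeautifulSoup

url = 'https://www.sec.gov/ix?doc=/Archives/edgar/data/934549/000093454919000017/actg2018123110-k.htm#s62CF0831C63E51C2BEF33F4163F1DE65'
raw = requests.get(url)
# Name the parser explicitly so BeautifulSoup does not have to guess one
soup = BeautifulSoup(raw.content, "html.parser")
# This is where I am stuck: the span has no id (or class) to search for
span = soup.find("span", {"id": ... })
print(span.text)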
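To make the goal concrete, here is a minimal sketch of the extraction step I have in mind, assuming I already had the filing text as one plain string. The helper name extract_between and the sample string are made up for illustration; they are not taken from the filing or from any library.

```python
import re
import unicodedata


def extract_between(text, start_marker, end_marker):
    """Return the text between the first start_marker and the next
    end_marker (case-insensitive), or None if either is missing."""
    # NFKD turns non-breaking spaces into ordinary spaces
    normalized = unicodedata.normalize("NFKD", text)
    # Collapse newlines and repeated whitespace into single spaces
    normalized = re.sub(r"\s+", " ", normalized)
    pattern = re.escape(start_marker) + r"(.*?)" + re.escape(end_marker)
    match = re.search(pattern, normalized, flags=re.IGNORECASE)
    return match.group(1).strip() if match else None


# Made-up sample text standing in for the real filing
sample = (
    "PART I\n"
    "ITEM 1.\u00a0BUSINESS\n"
    "We license patents.\n"
    "ITEM 1A. RISK FACTORS\n"
    "Risks..."
)
section = extract_between(sample, "ITEM 1. BUSINESS", "ITEM 1A. RISK FACTORS")
print(section)
```

One caveat I am aware of: if the headings also appear in a table of contents, the first match would probably be the wrong one, so the real version may need to skip earlier occurrences.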
Do you have any ideas or hints?
Thanks a lot, Julius