<p style="font-size: small;" class="apple"><a name="XREF_4567_Figure1_1"></a>Assembly, 1234, 456 &amp; 789</p>
<div align="center"><image alt="apple.jpg" id="image2" source="assets/apple.jpg" />
  </div>

From the HTML above, I need to extract "Assembly, 1234, 456 & 789" and "apple.jpg".

My Python code is below:

for line in f:
    if 'div align' in line.lower():
        # keep everything after alt="
        myline = line.split("alt=\"")
        # keep everything before the closing quote
        number = myline[1].split("\"")[0]
        numbers[i].append(number)
#print(count)
#subtract oldcount to find the count of hotspots in current file
count[i].append(0)
count[i].append(len(numbers[i])-oldcount)
i = i + 1
#print(i)

1 Answer


You can use BeautifulSoup from the bs4 library for that:

from bs4 import BeautifulSoup

html = '<p style="font-size: small;" class="apple"><a name="XREF_4567_Figure1_1"></a>Assembly, 1234, 456 &amp; 789</p><div align="center"><image alt="apple.jpg" id="image2" source="assets/apple.jpg" />  </div>'
bs = BeautifulSoup(html, 'html.parser')
print(bs.find('p').get_text())
print(bs.find('image').get("alt"))
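One caveat worth sketching: `find` returns `None` when no tag matches, so chaining `.get_text()` or `.get()` directly raises an `AttributeError` on files that lack the tag. The helper below is a minimal sketch of a guarded version; the function name `extract_caption_and_alt` is hypothetical and not part of the original answer.

```python
from bs4 import BeautifulSoup


def extract_caption_and_alt(html):
    """Return (caption, alt); either value is None if its tag is missing."""
    soup = BeautifulSoup(html, "html.parser")
    p = soup.find("p")
    img = soup.find("image")
    # find() returns None when nothing matches, so guard before dereferencing
    caption = p.get_text() if p is not None else None
    alt = img.get("alt") if img is not None else None
    return caption, alt


html = (
    '<p style="font-size: small;" class="apple">'
    '<a name="XREF_4567_Figure1_1"></a>Assembly, 1234, 456 &amp; 789</p>'
    '<div align="center"><image alt="apple.jpg" id="image2" '
    'source="assets/apple.jpg" />  </div>'
)
print(extract_caption_and_alt(html))
```

Note that `get_text()` decodes the `&amp;` entity, so the caption comes back with a plain `&`.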

8 Comments

Hi @dallonsi, I have more <p> tags in the HTML file, but I specifically need the one that starts with <p style="font-size: small;" class="apple">.
If this answer solves your problem, consider marking it as "accepted" by clicking on the check mark icon
Then use find_all instead of find if you have many <p> tags.
You can use the class_ parameter to filter tags by their class value. Example: print(bs.find('p', class_='apple').get_text())
@Shobha, please consider accepting the answer if it resolved your problem and also post new questions for different problems independent of the current question.
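The suggestions in the comments above (filtering with `class_` and switching to `find_all`) can be combined roughly as follows. This is only a sketch: the extra `<p>` tags in the sample HTML are made up for illustration, on the assumption that the real file contains several paragraphs of which only some carry `class="apple"`.

```python
from bs4 import BeautifulSoup

# Sample input; the first and last <p> tags are invented for this example.
html = """
<p>Unrelated paragraph</p>
<p style="font-size: small;" class="apple"><a name="XREF_4567_Figure1_1"></a>Assembly, 1234, 456 &amp; 789</p>
<div align="center"><image alt="apple.jpg" id="image2" source="assets/apple.jpg" />
  </div>
<p class="apple">Second caption</p>
"""

soup = BeautifulSoup(html, "html.parser")

# Only <p> tags whose class is "apple"; other paragraphs are skipped
captions = [p.get_text() for p in soup.find_all("p", class_="apple")]

# The alt attribute of every <image> tag in the document
alts = [img.get("alt") for img in soup.find_all("image")]

print(captions)
print(alts)
```

Because `find_all` returns a list, `len(alts)` gives the per-file count that the loop in the question tracks by hand.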