I am a rookie programmer teaching myself some web scraping. I am trying to write a Python program that returns the direct video download URL from an embedded player by scraping a webpage with Selenium.

Here's the relevant HTML for the webpage:

<video class="vjs_tech" id="olvideo_html5_api" crossorigin="anonymous"></video>
<button class="vjs-big-play-button" type="button" aria-live="polite" title="Play Video" aria-disabled="false"><span class="vjs-control-text">Play Video</span></button>

The video element initially does not have a src attribute. But when I click the button above in my browser, the page seems to run some JavaScript and the video element gets a src attribute. I want to print the contents of this src attribute to the console. This is how I replicated the process in Python:

#Clicking the Button
playbutton = driver.find_element_by_tag_name('button')
playbutton.send_keys(Keys.RETURN)

#Selecting the Video Element
wait = WebDriverWait(driver, 5)
video = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'video')))

#Printing the details of the Video Element
print "Class: ", video.get_attribute("class")
print "ID: ", video.get_attribute("id")
print "SRC: ", video.get_attribute("src")

The output looks like this:

Class: vjs_tech
ID: olvideo_html5_api
SRC: 

As you can see, I can get the 'class' and 'id' values accurately, but the 'src' attribute always comes back empty. If I open the site in Chrome and click the button manually, I can see that the src attribute gets populated as expected.

What am I doing wrong? How can I get the src attribute to show up in my output?

(I'm using Selenium with ChromeDriver on Python 2.7.)

  • Can you update the question with a sample value of the src attribute? Does the value of the src attribute change? Is there a pattern to it? Commented Sep 26, 2018 at 19:20
  • Possible duplicate of Using selenium webdriver to wait the attribute of element to change value Commented Sep 26, 2018 at 19:33
  • ^ Use the link above and wait for src to not be empty. Commented Sep 26, 2018 at 19:34

1 Answer

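I guess it takes some time (possibly only milliseconds) after you click the button for src to appear on the video element. Since the video element is always present, WebDriver returns its current state (i.e. no src). An implicit wait, or an explicit wait on the element's presence or visibility, will not help here, because that condition is already satisfied before src is set. In this case the simplest fix is time.sleep: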

import time

#Clicking the Button
playbutton = driver.find_element_by_tag_name('button')
playbutton.send_keys(Keys.RETURN)
time.sleep(5)  # Give the page's JavaScript 5 seconds to set src; adjust as needed

#Selecting the Video Element
video = driver.find_element_by_tag_name('video')

#Printing the details of the Video Element
print "Class: ", video.get_attribute("class")
print "ID: ", video.get_attribute("id")
print "SRC: ", video.get_attribute("src")
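
As the comments on the question suggest, an alternative to a fixed sleep is to poll until src is no longer empty, which returns as soon as the value shows up. Here is a minimal sketch of that idea. The helper name wait_for_attribute and its timeout and poll parameters are my own invention, not part of Selenium; it only assumes the object you pass in has a get_attribute method, as a Selenium WebElement does.

```python
import time


def wait_for_attribute(element, name, timeout=10.0, poll=0.25):
    """Poll element.get_attribute(name) until it is non-empty.

    Hypothetical helper (not a Selenium API). Returns the attribute
    value, or None if it is still empty when the timeout expires.
    """
    deadline = time.time() + timeout
    while True:
        value = element.get_attribute(name)
        if value:
            # The attribute has been populated; stop waiting.
            return value
        if time.time() >= deadline:
            # Give up rather than block forever.
            return None
        time.sleep(poll)
```

With the video element from the code above, you would replace the time.sleep(5) line with something like src = wait_for_attribute(video, "src") and check for None before printing. I believe the same effect can be had with Selenium's own WebDriverWait by passing until() a callable that returns the src value, but I have not tested that against this page.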

1 Comment

This did it for me ... Simple although a bit raw, but a huge day saver!!!!
