The main issue is that the site appears to have some sort of Selenium/bot protection. After 1-2 pages it loads a blank page that stays blank even after a refresh. If this is true, I would respect their wishes and not scrape this site. If this isn't true, I've updated your code with the feedback below.
If you are just looking for a site to practice automation, I would do some googling. Here are a couple that I've found over the years, but there are many, many more...
Some feedback:
The item you are closing is not a frame, it's a popup. I updated the variable names.
I changed useless_frame to dismiss_popup and reversed the values because I think it makes the intent clearer.
Rather than creating a new WebDriverWait each time you need one, it's better to assign it to a variable once and then reuse that variable, e.g.
WebDriverWait(driver, 10).until(...)
WebDriverWait(driver, 10).until(...)
becomes
wait = WebDriverWait(driver, 10)
wait.until(...)
wait.until(...)
In Selenium terms, presence means that the element is in the DOM, not that it's necessarily ready to be interacted with. If you are going to click an element, you should use EC.element_to_be_clickable(). If you are going to get text, values, etc. then the element must be visible to avoid errors so you should use EC.visibility_of_element_located().
The "if bvc_frame" check doesn't do anything useful: if that element doesn't exist, your wait throws a TimeoutException before the check is ever reached, and if the wait returns, the element is always truthy. That check can be removed.
You don't need to click on the "Operaciones" tab because your URL contains ?tab=operaciones which already navigates to the page with that tab selected so that code can be removed.
time.sleep() should be avoided. It's considered a "dumb" sleep. It always waits X seconds even if the element is available sooner. The best practice is to use WebDriverWait, which you're already using elsewhere.
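To make the difference concrete, here is a minimal pure-Python sketch of the polling idea behind an explicit wait. It is not Selenium's implementation; wait_until, ready_at, and the 0.3 second delay are made up for illustration. The point is only that a polling wait returns as soon as the condition holds, while time.sleep(10) would always take the full 10 seconds.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # stop as soon as the condition holds
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)  # short pause between checks


# Hypothetical stand-in for "the element is ready": true after about 0.3 seconds.
ready_at = time.monotonic() + 0.3
started = time.monotonic()
wait_until(lambda: time.monotonic() >= ready_at, timeout=10, poll_interval=0.1)
elapsed = time.monotonic() - started
print(f"waited {elapsed:.1f}s instead of the full 10s")
```

WebDriverWait works along the same lines (it polls every 0.5 seconds by default), which is why it is usually both faster and more reliable than a fixed sleep.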
Instead of listing the entire URL, you can just provide the stock name and insert it into the URL because the rest is the same, e.g.
stocks = ['https://www.bvc.com.co/renta-variable-mercado-local/cibest?tab=operaciones', '...']
becomes
stocks = ['cibest', 'pfcibest', '...']
...
for stock in stocks:
    driver.get(f'https://www.bvc.com.co/renta-variable-mercado-local/{stock}?tab=operaciones')
If you are going to be scraping more than just a few of these, it would be better/much faster to run these in parallel.
Here's your code updated with the feedback above:
import selenium
import selenium.webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
driver = selenium.webdriver.Chrome()
stocks = ['cibest',
'pfcibest',
'bogota',
'bhi',
'celsia']
wait = WebDriverWait(driver, 20) # page is slower for me than 10s
dismiss_popup = True
for stock in stocks:
    print(stock)
    driver.get(f'https://www.bvc.com.co/renta-variable-mercado-local/{stock}?tab=operaciones')
    if dismiss_popup:  # closing popup
        wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '.sc-843139d2-14.iwukQD'))).click()
        dismiss_popup = False
    contado_table = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#accordion__panel-Contado table")))
    rows = contado_table.find_elements(By.CSS_SELECTOR, "tr")
    for row in rows:
        # do something with each table row
        print(row.text)
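As one idea for the "do something with each table row" step, here is a rough sketch that turns rows into CSV text. It assumes you've already collected each row's cell text into a list of strings, e.g. with something like [cell.text for cell in row.find_elements(By.CSS_SELECTOR, "th, td")] (I haven't confirmed the table's markup, so treat that selector as a guess). rows_to_csv is a hypothetical helper and the sample values are placeholders, not real data from the site.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize a list of rows (each a list of cell strings) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for cells in rows:
        # Strip stray whitespace; skip rows that have no cells at all.
        cleaned = [cell.strip() for cell in cells]
        if cleaned:
            writer.writerow(cleaned)
    return buffer.getvalue()


# Placeholder values only, not real data from the site.
sample_rows = [
    ["col_a", "col_b"],
    [" 1 ", "2"],
    [],
]
print(rows_to_csv(sample_rows))
```

Using the csv module instead of joining with commas means cells that contain commas or quotes get escaped correctly.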
If you want to go really fast, you can separate each stock into a different run with its own browser and run them in parallel. You convert this script into a data-driven test and feed the stocks into the test, e.g.
import pytest
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
@pytest.mark.parametrize("stock",
['cibest',
'pfcibest',
'bogota',
'bhi',
'celsia'])
def test(stock):
    driver = webdriver.Chrome()
    try:
        driver.maximize_window()
        wait = WebDriverWait(driver, 20)  # page is slower for me than 10s
        print(stock)
        driver.get(f'https://www.bvc.com.co/renta-variable-mercado-local/{stock}?tab=operaciones')
        # closing popup
        wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '.sc-843139d2-14.iwukQD'))).click()
        contado_table = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#accordion__panel-Contado table")))
        rows = contado_table.find_elements(By.CSS_SELECTOR, "tr")
        for row in rows:
            # do something with each table row
            print(row.text)
    finally:
        # each run owns its browser, so close it even if a wait times out
        driver.quit()
You'll need to look up a tutorial on how to install/configure pytest and run tests in parallel (the pytest-xdist plugin is the usual way to do that), but that will be your fastest option.
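If setting up pytest is more than you need, a thread pool from the standard library is another way to run the stocks in parallel. This is only a sketch under some assumptions: scrape_all and scrape_one are hypothetical names, and fake_scrape is a stand-in so the example runs without a browser. In real use, the function you pass in would create its own driver, load the page for that stock, read the table, and quit the driver, rather than sharing one driver across threads.

```python
from concurrent.futures import ThreadPoolExecutor


def scrape_all(stocks, scrape_one, max_workers=5):
    """Run scrape_one(stock) for every stock in parallel threads.

    Returns a dict mapping each stock to its result.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with stocks.
        results = list(pool.map(scrape_one, stocks))
    return dict(zip(stocks, results))


def fake_scrape(stock):
    # Hypothetical stand-in; a real version would open its own browser,
    # load the page for `stock`, read the table rows, and quit the browser.
    return [f"{stock}: row 1", f"{stock}: row 2"]


results = scrape_all(["cibest", "pfcibest", "bogota"], fake_scrape, max_workers=3)
print(results["bogota"])
```

Note that pool.map re-raises the first exception from any worker when the results are collected, so a single TimeoutException would abort the whole batch unless the scrape function catches it itself.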
WebDriverWait(driver, wait).until(EC.presence_of_element_located((foo, bar))) will fail because of the blank page, so it will be impossible to scrape anything until the page displays normally again. But what I want is to extract the data visible in the table (target).