Selenium web scraping script not returning expected results

Question

I have a Python script that uses Selenium to scrape company information from a website. It worked yesterday, but today it returns no results, even though I haven't changed the code.

When the script runs a search, the page shows "No matches" or no results at all. If I run the same search manually on the website, results appear.

I'm unsure what I'm doing wrong or why the script is no longer working as expected. Any insights or suggestions would be greatly appreciated.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
import pandas as pd
from webdriver_manager.chrome import ChromeDriverManager
import re


url = "https://ruesfront.rues.org.co/"

nom_empresa = "LINEAS ESCOLARES Y TURISMO S.A.S"


def extrae_nit(nom_empresa, url):
    options = webdriver.FirefoxOptions()
    driver = webdriver.Firefox(options=options)

    driver.get(url)

    driver.find_element(By.ID, "search") \
        .send_keys(nom_empresa)
    driver.implicitly_wait(10)
    driver.find_element(By.CLASS_NAME,
                        "d-none d-sm-block btn btn-primary input-group-append btn-busqueda busqueda__button--xs".replace(
                            " ", ".")) \
        .click()
    driver.implicitly_wait(30)
    text = driver.find_element(By.CLASS_NAME, "row card-result p-4 bg-featured".replace(" ", ".")) \
        .text

    print(text)

    driver.quit()

    result = text.split('\n')
    id_index = result.index("Identificación")
    nit = result[id_index + 1]

    return nit


print(extrae_nit(nom_empresa,url))

Barry the Platipus · Accepted Answer · 2024-05-06 08:36:31Z

Here is one tested way of getting that information:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument('disable-notifications')
chrome_options.add_argument("window-size=1280,1080")

with webdriver.Chrome(options=chrome_options) as driver:
    # Explicit wait: poll for up to 15 seconds for each condition below
    wait = WebDriverWait(driver, 15)

    url = 'https://ruesfront.rues.org.co/'
    driver.get(url)
    comp_data = {}
    # Click into the search box, then type the company name
    wait.until(EC.presence_of_element_located((By.XPATH, '//input[@id="search"]'))).click()
    wait.until(EC.presence_of_element_located((By.XPATH, '//input[@id="search"]'))).send_keys('LINEAS ESCOLARES Y TURISMO S.A.S')
    # Click the search icon to submit the query
    wait.until(EC.presence_of_element_located((By.XPATH, '//i[@class="bi bi-search ps-2"]'))).click()
    # Each value sits in a <span> that follows the <p> holding its label
    comp_data['sigla'] = wait.until(EC.presence_of_element_located((By.XPATH, '//p[text()="Sigla"]//following-sibling::span'))).text
    comp_data['identif_code'] = wait.until(EC.presence_of_element_located((By.XPATH, '//p[text()="Identificación"]//following-sibling::span'))).text
    comp_data['inscrip_code'] = wait.until(EC.presence_of_element_located((By.XPATH, '//p[text()="Numero de Inscripción"]//following-sibling::span'))).text
    comp_data['category'] = wait.until(EC.presence_of_element_located((By.XPATH, '//p[text()="Categoria"]//following-sibling::span'))).text
    print(comp_data)

Result in terminal:

{'sigla': 'LIDERTUR S.A.S', 'identif_code': '800126471-1', 'inscrip_code': '38610', 'category': 'Sociedad ó persona juridica principal ó esal'}
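
The four lookups in the answer share one XPath pattern, so they could be driven from a table of labels. The following is a minimal sketch and not part of the original answer: FIELDS, label_xpath and collect_fields are hypothetical names, and find_text stands in for whatever performs the Selenium lookup (for example, a lambda wrapping wait.until(EC.presence_of_element_located((By.XPATH, xp))).text).

```python
# Hypothetical helper names; the labels and the XPath pattern come from the answer.
FIELDS = {
    "sigla": "Sigla",
    "identif_code": "Identificación",
    "inscrip_code": "Numero de Inscripción",
    "category": "Categoria",
}


def label_xpath(label):
    """Build the answer's XPath for the <span> that follows a label <p>."""
    return f'//p[text()="{label}"]//following-sibling::span'


def collect_fields(find_text, fields=FIELDS):
    """Call find_text(xpath) once per field and return a key -> text dict.

    find_text is any callable that takes an XPath string and returns the
    matching element's text; with Selenium it would wrap wait.until(...).text.
    """
    return {key: find_text(label_xpath(label)) for key, label in fields.items()}


if __name__ == "__main__":
    # Stand-in lookup so the sketch runs without a browser.
    fake_page = {label_xpath("Sigla"): "LIDERTUR S.A.S"}
    print(collect_fields(fake_page.get, {"sigla": "Sigla"}))
```

Passing the lookup in as a callable keeps the field table separate from the browser session, so the mapping can be checked without opening the site.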

Selenium documentation can be found here.

Thanks, it's working! I'd still like to know whether there are any problems or mistakes in the code I posted in my question. I'm new to this field, so I'd appreciate any suggestions or advice.
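
On the question's original code, one thing that stands out is that result.index("Identificación") raises ValueError whenever the label is missing, for instance when the page shows "No matches". Below is a minimal sketch of a more defensive lookup. extract_field is a hypothetical name, and the sketch keeps the question's assumption that each value sits on the line right after its label.

```python
def extract_field(card_text, label):
    """Return the line that follows `label` in card_text, or None if absent."""
    lines = [line.strip() for line in card_text.split("\n")]
    try:
        idx = lines.index(label)
    except ValueError:
        return None  # label not on the page, e.g. a "No matches" result
    # Guard against the label being the last line of the text
    return lines[idx + 1] if idx + 1 < len(lines) else None


if __name__ == "__main__":
    # Illustrative sample text; the values are taken from the answer's output.
    sample = "Sigla\nLIDERTUR S.A.S\nIdentificación\n800126471-1"
    print(extract_field(sample, "Identificación"))
    print(extract_field("No matches", "Identificación"))
```

Separately, driver.implicitly_wait(...) sets the element-lookup timeout for the session rather than pausing at that point, so calling it once after creating the driver is enough.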
