
I am scraping some data from this platform. To perform actions, I am using the browser automation tool Selenium with Python. I want to select a value from a drop-down menu, but the menu is implemented as a table, so I cannot select the element the usual way. Details are below:

Element to be selected

The HTML of the element is:

<table class="dijit dijitReset dijitInline dijitLeft dijitDownArrowButton dijitSelect dijitValidationTextBox" data-dojo-attach-point="_buttonNode,tableNode,focusNode,_popupStateNode" cellspacing="0" cellpadding="0" role="listbox" aria-haspopup="true" tabindex="0" id="dijit_form_Select_0" widgetid="dijit_form_Select_0" aria-expanded="false" aria-invalid="false" style="user-select: none;" popupactive="true" aria-owns="dijit_form_Select_0_menu"><tbody role="presentation"><tr role="presentation"><td class="dijitReset dijitStretch dijitButtonContents" role="presentation"><div class="dijitReset dijitInputField dijitButtonText" data-dojo-attach-point="containerNode,textDirNode" role="presentation"><span role="option" class="dijitReset dijitInline dijitSelectLabel dijitValidationTextBoxLabel ">Active EPA/LA (239)</span></div><div class="dijitReset dijitValidationContainer"><input class="dijitReset dijitInputField dijitValidationIcon dijitValidationInner" value="Χ " type="text" tabindex="-1" readonly="readonly" role="presentation"></div><input type="hidden" data-dojo-attach-point="valueNode" value="Active EPA/LA" aria-hidden="true"></td><td class="dijitReset dijitRight dijitButtonNode dijitArrowButton dijitDownArrowButton dijitArrowButtonContainer" data-dojo-attach-point="titleNode" role="presentation"><span class="dijitReset dijitInputField dijitArrowButtonInner"></span></td></tr></tbody></table>

<tr role="presentation"><td class="dijitReset dijitStretch dijitButtonContents" role="presentation"><div class="dijitReset dijitInputField dijitButtonText" data-dojo-attach-point="containerNode,textDirNode" role="presentation"><span role="option" class="dijitReset dijitInline dijitSelectLabel dijitValidationTextBoxLabel ">Active EPA/LA (239)</span></div><div class="dijitReset dijitValidationContainer"><input class="dijitReset dijitInputField dijitValidationIcon dijitValidationInner" value="Χ " type="text" tabindex="-1" readonly="readonly" role="presentation"></div><input type="hidden" data-dojo-attach-point="valueNode" value="Active EPA/LA" aria-hidden="true"></td><td class="dijitReset dijitRight dijitButtonNode dijitArrowButton dijitDownArrowButton dijitArrowButtonContainer" data-dojo-attach-point="titleNode" role="presentation"><span class="dijitReset dijitInputField dijitArrowButtonInner"></span></td></tr>

The approach I am using:

# -*- coding: utf-8 -*-
from selenium.webdriver.firefox.options import Options
from selenium import webdriver
import time
import os
import shutil
import uuid

from selenium.webdriver.support.select import Select


class crawlOcean():

    def __init__(self):
        print("hurray33")
        global downloadDir
        downloadDir = ""

        fp = webdriver.FirefoxProfile()
        fp.set_preference("browser.download.folderList", 2)
        fp.set_preference("browser.download.manager.showWhenStarting", False)
        fp.set_preference("browser.download.dir", downloadDir)
        fp.set_preference("browser.helperApps.neverAsk.saveToDisk",
                          "attachment/csv")
        options = Options()
        options.add_argument("--headless")
        self.driver = webdriver.Firefox(firefox_profile=fp)
        #self.driver = webdriver.Firefox()
        print("hurray")
        self.driver.implicitly_wait(15)
        self.driver.get("http://www.epa.ie/hydronet/#Water%20Levels")
        self.verificationErrors = []
        self.accept_next_alert = True

    def crawl(self):
        print("see")
        driver = self.driver
        driver.execute_script("window.scrollTo(0, 800)")
        driver.find_element_by_id("dijit_MenuItem_3_text").click()
        select = driver.find_element_by_xpath(
            "(.//*[normalize-space(text()) and normalize-space(.)='Station status by owner:'])[1]/following::td[2]")
        select.click()

if __name__ == '__main__':
    obj = crawlOcean()
    obj.crawl()

Can anyone help? Thanks.

1 Answer

You can try the code below to select the required value:

driver.find_element_by_xpath('//td[.="All"]').click()
driver.find_element_by_xpath('//td[.="Active EPA/LA (239)"]').click()
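If the clicks fire before the popup has rendered, an explicit wait may help. The following is only a sketch under a few assumptions: the popup id `dijit_form_Select_0_menu` is taken from the `aria-owns` attribute in the question's HTML, the option labels are assumed to sit in `td` cells as in the two lines above, and the helper names (`xpath_literal`, `menu_option_xpath`, `choose_option`) are made up for illustration. It uses `find`-free locators with `WebDriverWait`, which also works on Selenium releases where the `find_element_by_*` methods are no longer available.

```python
def xpath_literal(text):
    """Quote a string so it can be embedded in an XPath 1.0 expression."""
    if '"' not in text:
        return '"%s"' % text
    if "'" not in text:
        return "'%s'" % text
    # Both quote characters present: split on double quotes and rebuild
    # the string with concat(), inserting '"' between the pieces.
    parts = text.split('"')
    return "concat(" + ", '\"', ".join('"%s"' % p for p in parts) + ")"


def menu_option_xpath(menu_id, label):
    """XPath for a td inside the popup whose text equals the given label."""
    return "//*[@id=%s]//td[normalize-space(.)=%s]" % (
        xpath_literal(menu_id),
        xpath_literal(label),
    )


def choose_option(driver, select_id, label, timeout=15):
    """Open the drop-down widget and click the option with this label.

    Hypothetical helper: assumes the popup id is select_id + "_menu".
    """
    # Imported here so the XPath helpers above can be used on their own.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    # Click the widget's table element to open the popup.
    wait.until(EC.element_to_be_clickable((By.ID, select_id))).click()
    # Wait for the option cell to become clickable, then click it.
    option = (By.XPATH, menu_option_xpath(select_id + "_menu", label))
    wait.until(EC.element_to_be_clickable(option)).click()
```

With the driver from the question, this could be called as `choose_option(driver, "dijit_form_Select_0", "Active EPA/LA (239)")`. Scoping the XPath to the popup id is a design choice to avoid matching a `td` with the same text elsewhere on the page.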