For my master's thesis, I am exploring whether data can be extracted from a website via web automation. The steps are as follows:

  1. Sign in to the website ( https://www.metal.com/Copper/201102250376 )
  2. Input username and password
  3. Click sign-in
  4. Change date to 01/01/2020
  5. Scrape the generated table data and save it to a CSV file
  6. Save to a specific folder with a specific name on my PC
  7. Run the same sequence to download additional historical price data for other materials in a new tab in the same browser window

I am stuck on steps 5, 6, and 7.

import time

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# Raw string so the backslashes in the Windows path are not treated as escapes
DRIVER_PATH = r'C:\webdriver\chromedriver.exe'
options = webdriver.ChromeOptions()
driver = webdriver.Chrome(executable_path=DRIVER_PATH, options=options)

driver.maximize_window()

driver.get('https://www.metal.com/Copper/201102250376')

# Login steps
LoginClick1 = driver.find_element(By.CSS_SELECTOR, '#__next > div > div.smm-component-header-en > div.main > div.right > button.button.sign-in')
LoginClick1.click()

user_input = driver.find_element(By.ID, 'user_name')
user_input.send_keys('#####')

password_input = driver.find_element(By.ID, 'password')
password_input.send_keys('####')

Submit = driver.find_element(By.CSS_SELECTOR, 'body > div:nth-child(17) > div > div.ant-modal-wrap.ant-modal-centered.smm-component-sign-en > div > div.ant-modal-content > div > div > div > div.smm-component-sign-en-content > form > div:nth-child(3) > div > div > span > button')
Submit.click()

time.sleep(2)

# Scroll down to the point of interest on the page
driver.execute_script("window.scrollBy(0,1000)", "")

# Change currency
driver.find_element(By.XPATH, "//img[contains(@class,'icon___BUqam')]").click()

time.sleep(1)

# Change date from the datepicker
date_input = driver.find_element(By.XPATH, '//*[@id="__next"]/div/div[5]/div[1]/div[7]/div[1]/div[2]/div[1]/span[1]/div/i')
date_input.click()

action = ActionChains(driver)

# Clear the existing date (10 characters), type the new one, then confirm
action.move_to_element(date_input).send_keys(Keys.BACKSPACE * 10).perform()
action.move_to_element(date_input).send_keys("01/01/2020").perform()
action.move_to_element(date_input).send_keys(Keys.ENTER).perform()

time.sleep(2)

I am stuck trying to scrape the data from the generated table and save it to a CSV file using Selenium. A row of the generated table looks like this:

**May 27, 2022** **10,758.75-10,788.43** **10,773.59** **+97.94** **USD/mt**

Any help would be massively appreciated.

Downloading the file by pressing the Download button

driver.find_element(By.XPATH,"//img[contains(@src,'https://static.metal.com/www.metal.com/4.1.161/static/images/price/download.png')]").click()

time.sleep(1)

driver.find_element(By.XPATH,"//img[contains(@src,'https://static.metal.com/www.metal.com/4.1.161/static/images/price/download_excel.png')]").click()

To save time since I have multiple files/data to download, I am also exploring the possibility of directly saving the file via the download button provided.

  • The problem I encounter is that I am not able to directly specify the filename I want it to be saved as.
  • Upon click, the download button opens a new tab and then closes within seconds to initialize the file download.
  • The file is then downloaded with a materialcode-today's date file naming format.

Do you have any idea how to go about this?
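One possible way to approach the naming problem, sketched under assumptions: let the browser save the file under its default name into a known folder, then rename it once the download has finished. With Chrome, the folder is commonly set through the `download.default_directory` preference passed via `options.add_experimental_option("prefs", {...})`. The helper names `wait_for_new_file` and `rename_new_download`, the `.crdownload` check for unfinished Chrome downloads, and the demo file name are all assumptions for illustration; the demo fakes the download by writing a file into a temporary folder.

```python
import tempfile
import time
from pathlib import Path


def wait_for_new_file(folder, before, timeout=30.0, poll=0.5):
    """Return the newest finished file in `folder` that was not in `before`."""
    folder = Path(folder)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Assumption: Chrome marks unfinished downloads with ".crdownload".
        new_files = [
            p for p in folder.iterdir()
            if p not in before and p.is_file() and p.suffix != ".crdownload"
        ]
        if new_files:
            return max(new_files, key=lambda p: p.stat().st_mtime)
        time.sleep(poll)
    raise TimeoutError(f"No finished download appeared in {folder}")


def rename_new_download(folder, before, new_stem, timeout=30.0):
    """Rename the newly downloaded file to `new_stem`, keeping its extension."""
    source = wait_for_new_file(folder, before, timeout)
    target = source.with_name(new_stem + source.suffix)
    source.replace(target)
    return target


# Demo without a browser: snapshot the folder, fake a download, rename it.
demo_dir = Path(tempfile.mkdtemp())
before = set(demo_dir.iterdir())  # take this snapshot before clicking Download
fake_name = "201102250376-2022-05-27.xls"  # hypothetical default file name
(demo_dir / fake_name).write_text("demo")
renamed = rename_new_download(demo_dir, before, "copper_2020-01-01")
print(renamed.name)
```

In a real run, the snapshot would be taken just before the two download clicks, and `rename_new_download` would be called right after them with the same folder that Chrome was configured to use.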

  • Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. Commented May 27, 2022 at 1:43

1 Answer

The sign-in button is not getting clicked because the XPath //*[@id="__next"]/div/div[3]/div[2]/div[2]/button[2] is incorrect. The element with id __next is the main container div, and the XPath tries to navigate from it to the sign-in button by spelling out the rest of the HTML node structure.

Instead, you can select the sign-in button directly by its class value with //button[@class='button sign-in'] (refer to the attached image).

Your sign-in solution would look like this:

from selenium import webdriver
from selenium.webdriver.common.by import By

# Raw string so the backslashes in the Windows path are not treated as escapes
driver = webdriver.Chrome(executable_path=r'C:\webdrivers\chromedriver.exe')
driver.maximize_window()
driver.get('https://www.metal.com/Nickel/201102250239')
# Click on Sign In
driver.find_element(By.XPATH, "//button[@class='button sign-in']").click()
# Enter username
driver.find_element(By.ID, "user_name").send_keys("your username")
# Enter password
driver.find_element(By.ID, "password").send_keys("your password")
# Click Sign In
driver.find_element(By.XPATH, "//button[@type='submit']").click()

To scrape the data:

for element in driver.find_elements(By.CLASS_NAME, "historyBodyRow___1Bk9u"):
    # Each row holds five child divs: date, price range, average, change, unit
    elements = element.find_elements(By.TAG_NAME, "div")
    print("Date=" + elements[0].text)
    print("Price Range=" + elements[1].text)
    print("Avg=" + elements[2].text)
    print("Change=" + elements[3].text)
    print("Unit=" + elements[4].text)
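The loop above yields display strings. If numeric values are needed later, the five cell texts can be converted after scraping. Here is a minimal sketch, assuming every row has the five cells of the sample row from the question, that dates use the abbreviated English month format seen there, and that prices are non-negative (so the range can be split on its first hyphen). `parse_history_row` is a hypothetical helper, not part of Selenium.

```python
from datetime import datetime


def parse_history_row(cells):
    """Convert the five cell texts of one history row into typed values."""
    date_text, range_text, avg_text, change_text, unit = [c.strip() for c in cells]

    def to_number(text):
        # Drop thousands separators; float() accepts a leading "+" or "-".
        return float(text.replace(",", ""))

    # The range looks like "low-high"; split on the first hyphen only.
    low_text, high_text = range_text.split("-", 1)
    return {
        "date": datetime.strptime(date_text, "%b %d, %Y").date().isoformat(),
        "low": to_number(low_text),
        "high": to_number(high_text),
        "avg": to_number(avg_text),
        "change": to_number(change_text),
        "unit": unit,
    }


# Sample row taken from the question.
sample = ["May 27, 2022", "10,758.75-10,788.43", "10,773.59", "+97.94", "USD/mt"]
record = parse_history_row(sample)
print(record)
```

With a live driver, the same function could be fed the list of `.text` values of the five child divs of each row.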

To add the data to a CSV file:

import csv

# newline='' avoids blank lines between rows on Windows;
# the with block closes the file even if an error occurs
with open('Path where you want to store the file', 'w', newline='') as f:
    writer = csv.writer(f)
    for element in driver.find_elements(By.CLASS_NAME, "historyBodyRow___1Bk9u"):
        elements = element.find_elements(By.TAG_NAME, "div")
        entry = [elements[0].text, elements[1].text, elements[2].text, elements[3].text, elements[4].text]
        writer.writerow(entry)
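For step 6 of the question (a specific folder and a specific file name), the CSV step can be wrapped in a small helper. This is only a sketch: `save_rows` is a hypothetical name, the header labels are taken from the print statements above, and the demo writes one sample row into a temporary folder instead of a real target folder.

```python
import csv
import tempfile
from pathlib import Path

HEADER = ["Date", "Price Range", "Avg", "Change", "Unit"]


def save_rows(rows, folder, filename):
    """Write the scraped rows to folder/filename as CSV and return the path."""
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    target = target_dir / filename
    # newline="" prevents blank lines between rows on Windows.
    with target.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        writer.writerows(rows)
    return target


# Demo with the sample row from the question; the folder is a temporary one.
rows = [["May 27, 2022", "10,758.75-10,788.43", "10,773.59", "+97.94", "USD/mt"]]
demo_folder = Path(tempfile.mkdtemp()) / "metal_prices"
saved = save_rows(rows, demo_folder, "copper_history.csv")
print(saved)
```

Because the folder and file name are plain arguments, the same call could be repeated per material with a different `filename` each time.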

9 Comments

Thanks, that worked. However, I encountered a new problem trying to scrape the generated table data. I do not know how to go about it. stackoverflow.com/q/72399631/14434657
Here's the HTML code: <div class="historyBodyRow___1Bk9u"><div class="" style="padding-left: 0px; flex: 1 1 0%; width: auto; text-align: left;">May 27, 2022</div><div class="" style="padding-left: 6px; width: 30%; text-align: right;">10,758.75-10,788.43</div><div class="" style="padding-left: 6px; width: 20%; text-align: right;">10,773.59</div><div class="up___11LCm" style="padding-left: 6px; width: 20%; text-align: right;">+97.94</div><div class="" style="padding-left: 6px; width: 15%; text-align: right;">USD/mt</div></div>
Hi @Esclass, to scrape the data you will have to loop through all divs that have the class historyBodyRow___1Bk9u.
Your solution would look like stackoverflow.com/a/72413918/18132195. Refer to the Scrape Data section.
To store the data in a CSV file, you can use the csv library. Refer to pythontutorial.net/python-basics/python-write-csv-file