1

I have a Python script that downloads data from a table on a web page into a local CSV file. The script fails with an exception saying that an unknown error occurred. Please see below for details.

Error Message:

Traceback (most recent call last):
  File "test.py", line 50, in <module>
    wr.writerow([d.text for d in row.find_elements_by_css_selector('td')])
  File "test.py", line 50, in <listcomp>
    wr.writerow([d.text for d in row.find_elements_by_css_selector('td')])
  File "C:\Users\username\PycharmProjects\Web_Scraping\venv\lib\site-packages\selenium\webdriver\remote\webelement.py", line 76, in text
    return self._execute(Command.GET_ELEMENT_TEXT)['value']
  File "C:\Users\username\PycharmProjects\Web_Scraping\venv\lib\site-packages\selenium\webdriver\remote\webelement.py", line 633, in _execute
    return self._parent.execute(command, params)
  File "C:\Users\username\PycharmProjects\Web_Scraping\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\username\PycharmProjects\Web_Scraping\venv\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: An unknown error occurred while processing the specified command.

Python code:

import selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver import ActionChains
from selenium.common.exceptions import TimeoutException
import time
import csv
from datetime import datetime

# Locate Edge driver
driver = webdriver.Edge(executable_path = "C://Windows//SysWOW64//MicrosoftWebDriver.exe")
driver.maximize_window()
# Using Edge to open the steam website
driver.get("https://partner.steampowered.com")

# Wait up to 10 seconds for elements to appear before a lookup fails
driver.implicitly_wait(10)

# Enter email address (send_keys returns None, so there is nothing to keep)
driver.find_element_by_id('username').send_keys("")
# Enter password
driver.find_element_by_id('password').send_keys("")
# Click "Sign in" to log in
driver.find_element_by_id('login_btn_signin').click()

# Find the desired link
driver.find_element_by_link_text('Age of Empires II: Definitive Edition').click()
time.sleep(3)

# Locate the link for Current Players
driver.find_element_by_css_selector('#gameDataLeft > div:nth-child(1) > table > tbody > tr:nth-child(9) > td:nth-child(3) > a').click()
time.sleep(5)

# Locate 1 year for Current Players
driver.find_element_by_xpath('/html/body/center/div/div[3]/div[1]/em[1]').click()
# x.click()
time.sleep(3)

# Locate the table element
table = driver.find_element_by_css_selector('body > center > div > div:nth-child(13) > table')

# Open local csv and save data
filename = datetime.now().strftime('C:/Users/username/Desktop/Output/Concurrent_Players_%Y%m%d_%H%M.csv')
with open(filename, 'w', newline='') as csvfile:
    wr = csv.writer(csvfile)
    for row in table.find_elements_by_css_selector('tr'):
        wr.writerow([d.text for d in row.find_elements_by_css_selector('td')])
print("Concurrent_Player data is saved. ")

HTML source (sorry, I can't provide the URL because this is an internal website):

<div>
                    <table>
                        <tbody><tr>
                <td></td>
        <td></td>
        <td align="right" title="2019-02-11 to 2020-02-09"><b>Most recent year</b></td>
            <td></td>                                       <!--Expandable percentage column-->
        <td align="right"><b>Daily average during period<b></b></b></td>
        <td align="right"><b>Change vs. previous period</b></td>
        <td align="right"></td>
        <td class="dim" align="right" title="2018-02-12 to 2019-02-10"><b>Previous year</b></td>
        <td></td>                                       <!--Expandable percentage column-->
        <td class="dim" align="right"><b>Previous daily average<b></b></b></td>
    </tr>
        <tr>

    <td>Average daily peak concurrent users         </td>
    <td></td>
    <td align="right">4,032</td>
    <td align="right"></td>
                <td align="right">11</td>

                <td align="right"><span style="color:#B5DB42;">+25971%</span></td>

        <td width="16"></td>
        <td class="dim" align="right">15</td>
        <td class="dim" align="right"></td>
                <td class="dim" align="right" width="100">0</td>

        <td></td>
    </tr>
        <tr>

    <td>Maximum daily peak concurrent users         </td>
    <td></td>
    <td align="right">26,767</td>
    <td align="right"></td>
                <td align="right">74</td>

                <td align="right"><span style="color:#B5DB42;">+51375%</span></td>

        <td width="16"></td>
        <td class="dim" align="right">52</td>
        <td class="dim" align="right"></td>
                <td class="dim" align="right" width="100">0</td>

        <td></td>
    </tr>
        <tr>

    <td>Average daily active users      </td>
    <td></td>
    <td align="right">24,686</td>
    <td align="right"></td>
                <td align="right">68</td>

                <td align="right"><span style="color:#B5DB42;">+70506%</span></td>

        <td width="16"></td>
        <td class="dim" align="right">35</td>
        <td class="dim" align="right"></td>
                <td class="dim" align="right" width="100">0</td>

        <td></td>
    </tr>
        <tr>

    <td>Maximum daily active users      </td>
    <td></td>
    <td align="right">157,231</td>
    <td align="right"></td>
                <td align="right">432</td>

                <td align="right"><span style="color:#B5DB42;">+191645%</span></td>

        <td width="16"></td>
        <td class="dim" align="right">82</td>
        <td class="dim" align="right"></td>
                <td class="dim" align="right" width="100">0</td>

        <td></td>
    </tr>
                    </tbody></table>
                </div>

Screenshot of web UI: (image not included)

The code does create the CSV file as specified, but no data is saved because of the error. I have other, similar Python scripts implemented the same way that succeed, but I have not been able to troubleshoot this one myself. I hope the information provided is enough to review. Thanks so much in advance!

3 Answers

1

Use WebDriverWait with visibility_of_element_located() and the following XPath to locate the table, then iterate over its rows and read the column values of each row.

table=WebDriverWait(driver,20).until(EC.visibility_of_element_located((By.XPATH,"//table[contains(.,'Average daily peak concurrent users')]")))
for row in table.find_elements_by_xpath(".//tr"):
   rowdata=[col.text for col in row.find_elements_by_xpath(".//td")]
   print(rowdata)

Based on your example, it prints the following on the console.

['', '', 'Most recent year', '', 'Daily average during period', 'Change vs. previous period', '', 'Previous year', '', 'Previous daily average']
['Average daily peak concurrent users', '', '4,032', '', '11', '+25971%', '', '15', '', '0', '']
['Maximum daily peak concurrent users', '', '26,767', '', '74', '+51375%', '', '52', '', '0', '']
['Average daily active users', '', '24,686', '', '68', '+70506%', '', '35', '', '0', '']
['Maximum daily active users', '', '157,231', '', '432', '+191645%', '', '82', '', '0', '']
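
To get from these printed lists to the CSV file the question asks for, a minimal sketch could keep a fixed set of column positions so that the header row and the data rows stay aligned. The helper name `write_table` and the index tuple `KEEP` are illustrative assumptions, not part of the original code; the indexes are read off the output above and would need adjusting for a different table.

```python
import csv
import io

# Assumed positions of the populated columns, taken from the printed output.
KEEP = (0, 2, 4, 5, 7, 9)


def write_table(rows, csvfile, keep=KEEP):
    """Write only the columns listed in `keep`, padding short rows with ''."""
    writer = csv.writer(csvfile)
    for row in rows:
        writer.writerow([row[i] if i < len(row) else "" for i in keep])


# Two of the rows printed above, used as sample input.
rows = [
    ['', '', 'Most recent year', '', 'Daily average during period',
     'Change vs. previous period', '', 'Previous year', '',
     'Previous daily average'],
    ['Average daily peak concurrent users', '', '4,032', '', '11',
     '+25971%', '', '15', '', '0', ''],
]

buffer = io.StringIO()  # stands in for open(filename, 'w', newline='')
write_table(rows, buffer)
print(buffer.getvalue())
```

Selecting by position rather than dropping every empty string keeps the empty first header cell, so the labels in the first column still line up under it.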

7 Comments

I replaced my code with your suggestion, but it still raises the same error. It seems to fail at this line: rowdata = [col.text for col in row.find_elements_by_xpath(".//td")]
Yes, same error: selenium.common.exceptions.WebDriverException: Message: An unknown error occurred while processing the specified command.
Hi, I thought I had found the cause: the page embeds other tables as well, so ".//tr" and ".//td" might not be specific enough for the driver. However, your code already pins down the table with "//table[contains(.,'Average daily peak concurrent users')]", so I still don't know the reason...
Forget about the columns. Try printing the row text and see what you get on the console: table=WebDriverWait(driver,20).until(EC.visibility_of_element_located((By.XPATH,"//table[contains(.,'Average daily peak concurrent users')]"))) for row in table.find_elements_by_xpath(".//tr"): print(row.text)
It still fails with the same error. Your suggestion is the same as the other answer by @Jortega, and both fail for the same reason. Oddly, it looks like table.find_elements_by_xpath(".//tr") does not work for this table.
1

Since table.find_elements_by_xpath(".//tr") was not working for this table, I used a crude workaround. It has worked fine for my case so far.

Updated code (Partial):

filename = datetime.now().strftime('C:/Users/username/Desktop/Output/data_%Y%m%d_%H%M.csv')
with open(filename, 'w', newline='', encoding="utf-8") as csvfile:
   wr = csv.writer(csvfile)

   a = driver.find_element_by_xpath('/html/body/center/div/div[4]/table/tbody/tr[1]/td[1]').text
   b = driver.find_element_by_xpath('/html/body/center/div/div[4]/table/tbody/tr[1]/td[3]').text
   c = driver.find_element_by_xpath('/html/body/center/div/div[4]/table/tbody/tr[1]/td[5]').text
   d = driver.find_element_by_xpath('/html/body/center/div/div[4]/table/tbody/tr[1]/td[6]').text
   e = driver.find_element_by_xpath('/html/body/center/div/div[4]/table/tbody/tr[1]/td[8]').text
   f = driver.find_element_by_xpath('/html/body/center/div/div[4]/table/tbody/tr[1]/td[10]').text
   wr.writerow([a, b, c, d, e, f])

print("Done. ")
driver.quit()

Reasoning:

What I've observed so far is that each tr in this table contains empty td elements (see the cursor position in the screenshot for an example): every cell with content sits next to an empty/blank td. It seems the driver cannot handle the empty tds and throws an exception when it reaches one. So in my code I had to specify the exact td indexes to read, so that it never touches an empty cell.

(Screenshot not included.)

If anyone can come up with a solution that lets the code skip the empty tds, or read only the tds that contain text, that would be ideal.
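
One way to express "only the tds that contain text" is an XPath predicate such as `.//td[normalize-space()]`, which matches a cell only if its text is non-empty after trimming whitespace. The sketch below is an untested suggestion for this page: the function name `rows_with_text` is hypothetical, `table` is assumed to be the Selenium WebElement for the table, and whether skipping the blank cells actually avoids the WebDriverException with the Edge driver has not been verified. The string `"xpath"` is the value of `By.XPATH`, used here so the snippet has no import of its own.

```python
# XPath that matches only <td> cells whose text is non-empty after trimming.
NON_EMPTY_TD = ".//td[normalize-space()]"


def rows_with_text(table):
    """Return one list of cell texts per <tr>, skipping blank <td> cells.

    `table` is assumed to be a Selenium WebElement; any object exposing
    find_elements(by, value) whose results have a .text attribute works.
    """
    rows = []
    for tr in table.find_elements("xpath", ".//tr"):
        cells = tr.find_elements("xpath", NON_EMPTY_TD)
        rows.append([cell.text.strip() for cell in cells])
    return rows
```

Each returned list could then be passed to wr.writerow(...). Note that the header row has an empty first cell, so dropping blanks shifts the header one column to the left relative to the data rows.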


0

There might be a timeout or stale element issue. Try getting the elements in the table like this.

#your code
#for row in table.find_elements_by_css_selector('tr'):
        #wr.writerow([d.text for d in row.find_elements_by_css_selector('td')])

table_elements = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located(
            (By.CSS_SELECTOR, 'tr td')))

for row in table_elements:
    print(row.text)
    #wr.writerow([row.text])
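
If stale elements or failing per-element commands are the concern, a different technique is to fetch the HTML once and parse it locally, so that no further WebDriver call is made per cell. The sketch below uses only the standard library's html.parser; the names `TableRowParser` and `non_empty_cells` are hypothetical, and it assumes the HTML passed in contains a single, non-nested table. The HTML string could come from driver.page_source or from table.get_attribute("outerHTML").

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> in every <tr>, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def non_empty_cells(html):
    """Return the rows of the table in `html`, dropping blank cells."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return [[cell for cell in row if cell] for row in parser.rows]
```

Because the parser works on a snapshot of the markup, it cannot hit a stale element reference, but it also will not see changes the page makes after the snapshot was taken.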

3 Comments

I modified my code as suggested but it still fails with the same error.
@an1que I had a typo in wr.writerow. See if it will print in the loop with print(row.text). See the updated answer.
No, it still raises the same error. But I think I know the potential reason: the page embeds other tables as well, so a locator like EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'tr td')) might not be enough to tell the driver which table the code is looking for, since all of these tables contain 'tr' and 'td'. If I'm right, could you help me with this part, please?
