I'm trying to scrape the web page 'http://www.trulia.com/property/1080560259-2-Penelope-Ln-Middletown-NJ-07748'. When the Estimates tab (below the Comparable and Estimates section) is selected, the data below the Google map is loaded dynamically. This data does not appear in the page source, but it is visible in the Developer Tools window (context menu, Inspect Element).

I'm using Selenium and Python 2.7. Is there a way to access this data, or a way to access all the elements?

Thanks in advance.

  • See my answer to a larger-scope question: start from the latest code listing and look at browser.page_source. The answer is at stackoverflow.com/questions/23386855/… Commented May 8, 2014 at 22:11
  • Thanks, but this doesn't resolve my issue. Is there a way to access the elements listed in the Dev Tools window? The dynamically generated data is not visible in the page source. I couldn't use the response package since I don't have a new URL. By default, the Tab 1 (Comparable) data comes in the source. I need the Tab 2 (Estimates) table data. Commented May 8, 2014 at 22:55
  • The data I need is visible in the Elements section of the Dev Tools window but not in the source. Commented May 8, 2014 at 22:56

1 Answer

Since that tab is populated via AJAX, you need to wait for the data to load yourself.

I'd do something like this (pseudo-code):

find_element_by_css_selector('a#dataset_nearby').click()
waitForElement('ul#places_map_module li.active table.table tr')

You'll probably need to fiddle with the selectors, but waitForElement basically just needs to poll for the element and wait until it is available BEFORE you perform a command on it.
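
For what it's worth, waitForElement is not a real Selenium call. A minimal sketch of that kind of polling helper could look like the following. The names wait_for_elements and WaitTimeout are made up for illustration, and the helper assumes you hand it a zero-argument callable that returns a list of matches (empty while the data is still loading).

```python
import time


class WaitTimeout(Exception):
    """Raised when nothing matched before the timeout expired (hypothetical name)."""


def wait_for_elements(find, timeout=30.0, poll=0.5):
    """Call find() repeatedly until it returns a non-empty list.

    find is any zero-argument callable returning a list of elements,
    e.g. lambda: driver.find_elements_by_css_selector(selector).
    """
    deadline = time.time() + timeout
    while True:
        elements = find()
        if elements:
            # Something matched, so the dynamic content has arrived.
            return elements
        if time.time() >= deadline:
            raise WaitTimeout("no matching elements after %s seconds" % timeout)
        # Sleep briefly so the loop does not spin at full speed.
        time.sleep(poll)
```

With Selenium you would pass something like lambda: driver.find_elements_by_css_selector('ul#places_map_module li.active table.table tr') after clicking the tab. Selenium's own WebDriverWait(driver, 30).until(...) does essentially the same polling, so it is probably the better choice when it fits.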

4 Comments

Hi, thanks for the response. But even when I wait, the element is not visible. Here is the code I tried:
import selenium.webdriver.support.ui as ui
wait = ui.WebDriverWait(driver, 30)
wait.until(lambda driver: driver.find_element_by_css_selector('a#dataset_nearby'))
driver.find_element_by_css_selector('a#dataset_nearby').click()
An ElementNotVisibleException is thrown.

Use find_elements instead, and check the length. That might help too.

I'm using find_element_by_id and calling click() to select the Estimates tab on the page. Even with a wait, the new data is not available to the browser handler. It throws the same exception. The code I tried is:
pick_id = driver.find_element_by_id("dataset_nearby")
pick_id.click()
wait.until(lambda driver: driver.find_elements_by_css_selector('Home Estimates'))
print driver.find_elements_by_css_selector('Home Estimates')

Whenever the tab is clicked, the following request is sent: GET /_ajax/PDP/NearbyProperties/json/?tplname=small&bo...4&lon=-74.10724&block_pid=1080560259&fips_id=34025. How can I make an equivalent request in Python?
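
On that last question, one option is to skip the browser and request the JSON endpoint directly. Below is a rough sketch in Python 3 (the question uses 2.7, where the urllib imports differ), assuming the endpoint answers a plain GET. The function name fetch_json and its referer and opener parameters are hypothetical, and the X-Requested-With header is only a guess at what the site might expect. The URL in the comment above is truncated, so the full query string has to be copied from the Network tab in Dev Tools.

```python
import json
from urllib.request import Request, urlopen


def fetch_json(url, referer=None, opener=urlopen):
    """GET url and decode the response body as JSON.

    opener is injectable so the function can be exercised without a network.
    """
    # Assumption: the site may check for an AJAX-style header; it may not.
    headers = {"X-Requested-With": "XMLHttpRequest"}
    if referer:
        headers["Referer"] = referer
    request = Request(url, headers=headers)
    response = opener(request)
    try:
        return json.loads(response.read().decode("utf-8"))
    finally:
        response.close()
```

You would call it as fetch_json(full_url, referer=page_url), where full_url is the complete request URL from the Network tab and page_url is the property page. If the site also requires cookies or a session, this alone will not be enough.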
