Python Web Scraping Dynamic Content

Question

I've been trying to scrape kith.com search results, but all I get back is the skeleton HTML of the page, without the products. I tried Scrapy, requests-html, and Selenium, but I haven't managed to make any of them work.

Right now my code is:

from requests_html import HTMLSession

session = HTMLSession()
r = session.get("https://kith.com/pages/search-results-page?q=nike&tab=products&sort_by=created")

r.html.render()
print(r)

From what I've seen, render() should produce the HTML as it appears in a browser, but I still get the same "raw" code.

PS: kith.com is a Shopify shop.

stackoverflow.com/q/2148493/11301900, stackoverflow.com/q/17608572/11301900, stackoverflow.com/q/8183682/11301900, stackoverflow.com/q/8049520/11301900
– AMC, Commented Feb 8, 2020 at 0:40
Does this answer your question? Web-scraping JavaScript page with Python
– AMC, Commented Feb 8, 2020 at 0:40

Kasem Alsharaa · Accepted Answer · 2020-02-08 00:00:13Z


Selenium is suitable for a job like this:

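from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options

# Run Firefox in the background, without opening a visible window
options = Options()
options.add_argument("-headless")
driver = webdriver.Firefox(options=options)
driver.get('https://kith.com/pages/search-results-page?q=nike&tab=products&sort_by=created')

# Product titles are rendered into elements with the "snize-title" class
item_titles = driver.find_elements(By.CLASS_NAME, "snize-title")

print(item_titles[0].text)
# NIKE WMNS SHOX TL - NOVA WHITE / TEAM ORANGE / SPRUCE AURA

# Close the browser when done
driver.quit()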
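Because the results are injected by JavaScript, the snize-title elements may not exist yet at the moment get() returns, in which case item_titles[0] would raise an IndexError. A minimal sketch of a polling helper follows, assuming the driver object from the snippet above. The name wait_for_elements and its timeout and poll parameters are made up for illustration; Selenium also ships its own WebDriverWait class for the same purpose, which is likely the better choice in real code.

```python
import time

# Locator strategy string; this is the value of Selenium's By.CLASS_NAME,
# written out so the helper itself does not need to import Selenium.
CLASS_NAME = "class name"


def wait_for_elements(driver, class_name, timeout=10.0, poll=0.5):
    """Poll the page until at least one element with class_name exists.

    Returns the list of matching elements, or raises TimeoutError if
    nothing shows up within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        found = driver.find_elements(CLASS_NAME, class_name)
        if found:
            return found
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no elements with class {class_name!r} after {timeout}s")
        time.sleep(poll)
```

With this in place, item_titles = wait_for_elements(driver, "snize-title") could stand in for the direct find_elements call.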

Edit:

If you want to capture all of the item info, the div elements with the snize-overhidden class are the ones to capture. You can then iterate through them and their sub-elements.
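The iteration could look roughly like the sketch below. It assumes each snize-overhidden div wraps one product and contains a snize-title child, as the class names in this answer suggest; the helper name collect_items and the dictionary keys are made up for illustration, and other sub-elements (price, link, image) would need their own class names looked up in the page.

```python
# Locator strategy string; this is the value of Selenium's By.CLASS_NAME,
# written out so the helper itself does not need to import Selenium.
CLASS_NAME = "class name"


def collect_items(driver):
    """Return one dict per product card found on the rendered page."""
    items = []
    # Each product card is assumed to be a div with class "snize-overhidden".
    for card in driver.find_elements(CLASS_NAME, "snize-overhidden"):
        # Search only inside this card for its title element.
        titles = card.find_elements(CLASS_NAME, "snize-title")
        items.append({
            "title": titles[0].text if titles else None,
            "text": card.text,  # all visible text of the card
        })
    return items
```

Calling collect_items(driver) after the page has loaded would give a list that can be printed or written to a file, one entry per product.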

edited Feb 8, 2020 at 0:00

answered Feb 7, 2020 at 23:47

Kasem Alsharaa


3 Comments

NyTrOuS Over a year ago

How would I do this without the computer having to open any browser? My intention is to upload it when the project is finished to AWS so it can run once every couple of hours

Kasem Alsharaa Over a year ago

The browser can operate in headless mode (it runs in the background). Check the updated answer @NyTrOuS

NyTrOuS Over a year ago

Copied your code and I'm getting an error stating that 'Options' is not defined
