
I've been trying to scrape kith.com search results, but I only get the page's skeleton HTML without the product data. I tried Scrapy, requests-html, and Selenium, but I haven't managed to make any of them work.

Right now my code is:

from requests_html import HTMLSession

session = HTMLSession()
r = session.get("https://kith.com/pages/search-results-page?q=nike&tab=products&sort_by=created")

r.html.render()
print(r.html.html)  # print the HTML itself; print(r) only shows the Response object

From what I've seen, render() should return the HTML as it appears in a browser, but I still get the same "raw" HTML.

PS: kith.com is a Shopify store.


1 Answer


Selenium is suitable for a job like this, because it drives a real browser that executes the page's JavaScript:

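from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Run Firefox in headless mode so no browser window is opened
options = Options()
options.add_argument("-headless")
driver = webdriver.Firefox(options=options)
driver.get('https://kith.com/pages/search-results-page?q=nike&tab=products&sort_by=created')

# The results are injected by JavaScript, so wait until the titles are present
item_titles = WebDriverWait(driver, 10).until(
    EC.presence_of_all_elements_located((By.CLASS_NAME, "snize-title"))
)

print(item_titles[0].text)
# NIKE WMNS SHOX TL - NOVA WHITE / TEAM ORANGE / SPRUCE AURA

driver.quit()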

Edit:

If you want to capture all of the item info, the div elements with the snize-overhidden class are what you want. You can then iterate through them and their sub-elements.
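Iterating over the snize-overhidden containers could look roughly like the sketch below. It assumes Selenium 4 with Firefox and geckodriver available, and that each container holds at most one snize-title element; the function names extract_cards and scrape, the record keys, and the 10-second timeout are illustrative choices, not part of any library. The locator strategy is passed as the plain string "class name", which is the value of Selenium's By.CLASS_NAME, so the extraction helper itself has no Selenium import.

```python
# Locator strategy string; identical to selenium's By.CLASS_NAME value.
CLASS_NAME = "class name"


def extract_cards(root):
    """Return one dict per product card found under `root`.

    `root` can be a WebDriver or a WebElement; both expose find_elements().
    """
    records = []
    for card in root.find_elements(CLASS_NAME, "snize-overhidden"):
        # Look for the title only inside this card, not in the whole page.
        titles = card.find_elements(CLASS_NAME, "snize-title")
        records.append({
            "title": titles[0].text.strip() if titles else None,
            # Full visible text of the card (title, price, and so on).
            "text": card.text.strip(),
        })
    return records


def scrape(url, timeout=10):
    """Open `url` in headless Firefox and return the extracted cards."""
    # Imported here so extract_cards() can be used without a browser.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("-headless")
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
        # Wait until the JavaScript-rendered results exist in the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((CLASS_NAME, "snize-overhidden"))
        )
        return extract_cards(driver)
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

Calling scrape() with the search URL from the question should return a list of dicts, one per product card. Any further fields (price, link, image) would need their own class names, which you can find by inspecting the rendered page.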


3 Comments

How would I do this without the computer having to open a browser window? Once the project is finished, I intend to deploy it to AWS so it can run once every couple of hours.
The browser can operate in headless mode (it runs in the background). Check the updated answer @NyTrOuS
I copied your code and I'm getting an error stating that 'Options' is not defined.
