
I wrote Python code to scrape product data from Flipkart.
I need to load multiple pages so that I can import many products, but right now only one product page is being scraped.

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup


my_url = 'https://www.xxxxxx.com/food-processors/pr?sid=j9e%2Cm38%2Crj3&page=1'

uClient2 = uReq(my_url)
page_html = uClient2.read()
uClient2.close()

page_soup = soup(page_html, "html.parser")

containers11 = page_soup.findAll("div", {"class": "_3O0U0u"})  # one container per product card

filename = "FoodProcessor.csv"
f = open(filename, "w", encoding='utf-8-sig')
headers = "Product, Price, Description \n"
f.write(headers)

for container in containers11:
    title_container = container.findAll("div", {"class": "_3wU53n"})
    product_name = title_container[0].text

    price_con = container.findAll("div", {"class": "_1vC4OE _2rQ-NK"})
    price = price_con[0].text

    description_container = container.findAll("ul", {"class": "vFw0gD"})
    product_description = description_container[0].text

    print("Product: " + product_name)
    print("Price: " + price)
    print("Description: " + product_description)
    # strip commas from the fields, otherwise they break the CSV columns
    f.write(product_name + "," + price.replace(",", "") + "," + product_description.replace(",", ";") + "\n")

f.close()

4 Answers


You have to check whether a next-page button exists. If it does, return True together with the button, go to that next page, and keep scraping; if it doesn't, return False and stop. Check the class name of that button first. (This approach assumes Selenium, i.e. an open driver.)

    # to check if a pagination ("next page") button exists on the page
    from selenium.common.exceptions import NoSuchElementException

    def go_next_page():
        try:
            button = driver.find_element_by_xpath('//a[@class="<class name>"]')
            return True, button
        except NoSuchElementException:
            return False, None
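
To apply it, one possible loop (a sketch: it assumes a Selenium driver is already open, and scrape_current_page() is a hypothetical function wrapping your existing BeautifulSoup parsing, fed with driver.page_source):

    while True:
        scrape_current_page()            # hypothetical: your existing parsing code
        exists, button = go_next_page()  # True plus the button if another page remains
        if not exists:
            break                        # no next-page button: this was the last page
        button.click()                   # move on and scrape the next page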

1 Comment

I don't know how to apply this. Can you please explain briefly, using my code?

You can first get the number of pages available, then iterate over each page and parse its data; a sketch follows the list below.

The URL changes with respect to the page number:

  • 'https://www.flipkart.com/food-processors/pr?sid=j9e%2Cm38%2Crj3&page=1' which points to page 1
  • 'https://www.flipkart.com/food-processors/pr?sid=j9e%2Cm38%2Crj3&page=2' which points to page 2
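
A minimal sketch of that outer loop, reusing the class names from the question (the last_page value of 10 is an assumption; read the real count from the site's pagination bar):

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

base_url = 'https://www.flipkart.com/food-processors/pr?sid=j9e%2Cm38%2Crj3&page={}'
last_page = 10  # assumption: check the last page number on the site

with open("FoodProcessor.csv", "w", encoding='utf-8-sig') as f:
    f.write("Product, Price, Description \n")
    for page in range(1, last_page + 1):
        uClient2 = uReq(base_url.format(page))
        page_soup = soup(uClient2.read(), "html.parser")
        uClient2.close()
        # same per-product parsing as in the question
        for container in page_soup.findAll("div", {"class": "_3O0U0u"}):
            product_name = container.findAll("div", {"class": "_3wU53n"})[0].text
            price = container.findAll("div", {"class": "_1vC4OE _2rQ-NK"})[0].text
            description = container.findAll("ul", {"class": "vFw0gD"})[0].text
            f.write(product_name + "," + price.replace(",", "") + "," + description.replace(",", ";") + "\n")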


This Selenium snippet clicks through the "Next" button, hiding the overlay that sometimes intercepts the click:

from selenium.common.exceptions import ElementClickInterceptedException, TimeoutException

# outer loop assumed: the original snippet's break statements imply one
while True:
    try:
        try:
            next_btn = driver.find_element_by_xpath("//a//span[text()='Next']")
            next_btn.click()
        except ElementClickInterceptedException:
            # an overlay is blocking the click: hide it via JavaScript, then retry
            classes = "_3ighFh"
            overlay = driver.find_element_by_xpath("(//div[@class='{}'])[last()]".format(classes))
            driver.execute_script("arguments[0].style.visibility = 'hidden'", overlay)
            next_btn = driver.find_element_by_xpath("//a//span[text()='Next']")
            next_btn.click()
        except Exception as e:
            print(str(e))
            break
    except TimeoutException:
        print("Page Timed Out")
        break

driver.quit()
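
The element with class _3ighFh is presumably the login pop-up Flipkart shows over the page; hiding it with JavaScript lets the Next button receive the click on the retry.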



For me, the easiest way is to add an extra loop with the "page" variable:

# just check the number of the last page on the website
last_page = 10
page = 1

while page <= last_page:
    print(f'Scraping page: {page}')
    my_url = f'https://www.xxxxxx.com/food-processors/pr?sid=j9e%2Cm38%2Crj3&page={page}'

    # here add the for loop you already have

    page += 1

This method should work.

