I want to scrape reviews using Selenium and BeautifulSoup. I used these rather than a ready-made solution because I also want to collect the star ratings. But I ran into two problems:
1. It downloads duplicate reviews. I think this happens because the block with the reviews has not finished loading when I read the page.
2. It takes too long to run.
Does anyone know how to solve these problems?
Thank you in advance for your reply and attention. My code so far:
import time

from bs4 import BeautifulSoup as bs
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

start_url = 'https://www.patagonia.com/product/mens-down-sweater-jacket/84674.html?dwvar_84674_color=OXDR&cgid=mens-jackets-vests#start=1'
number_pages = 87
arrow_selector = '.yotpo-icon-right-arrow'

chrome_options = Options()
chrome_options.add_argument("--headless")  # run the browser in the background

pages = []
# Use a single headless browser for the whole run (assumes chromedriver can be found on PATH).
with Chrome(options=chrome_options) as driver:
    driver.get(start_url)
    time.sleep(5)  # fixed wait for the first block of reviews
    html = driver.execute_script('return document.body.innerHTML')
    pages.append(bs(html, "html.parser"))
    for i in range(number_pages):
        # Click the "next" arrow, then wait a fixed 5 seconds for the next block
        driver.find_element(By.CSS_SELECTOR, arrow_selector).click()
        time.sleep(5)
        html = driver.execute_script('return document.body.innerHTML')
        pages.append(bs(html, "html.parser"))
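
One direction that might address both problems, as a rough sketch rather than a tested fix: instead of always sleeping 5 seconds, poll until the review block actually changes, and keep only reviews that have not been seen before. The helper names `wait_for_change` and `add_new_reviews` are hypothetical, and the sketch assumes each review can be given some unique key (for example its text plus author), which the code above does not extract yet.

```python
import time


def wait_for_change(read_state, previous, timeout=10.0, poll=0.2):
    """Poll read_state() until it differs from `previous`.

    Returns the new value, or None if nothing changed within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = read_state()
        if current != previous:
            return current
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


def add_new_reviews(seen, collected, reviews):
    """Append only reviews whose key is not in `seen`; return how many were added.

    `reviews` is an iterable of (key, review) pairs; the key is an assumption,
    e.g. the review text combined with the author name.
    """
    added = 0
    for key, review in reviews:
        if key not in seen:
            seen.add(key)
            collected.append(review)
            added += 1
    return added
```

In the loop, `read_state` could be a small function that returns the text of the first review element on the page, so the script continues as soon as the next block has replaced the old one. Selenium's own `WebDriverWait` could serve the same purpose; the plain polling helper is used here only to keep the sketch independent of the browser.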