Channel: Active questions tagged selenium - Stack Overflow

Web Scraping with Selenium [duplicate]


I'm trying to write a Python script that uses Selenium to retrieve some information from this website. It seems that Cloudflare blocks the script because it detects a bot. Here is part of the code:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.support.wait import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--disable-blink-features=AutomationControlled")
    options.add_argument("--incognito")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option("useAutomationExtension", False)
    prefs = {
        "profile.managed_default_content_settings.images": 2,
        "profile.default_content_setting_values.notifications": 2,
    }
    options.add_experimental_option("prefs", prefs)

    driver = webdriver.Chrome(options=options)
    driver.get("https://worldwide.espacenet.com/")

    # Toggle that must be clicked to enter advanced search mode
    advanced_search = WebDriverWait(driver, 10).until(
        lambda x: x.find_element(By.XPATH, "/html/body/div/div/nav/ul/li[5]/label/span")
    )
    advanced_search.click()

Cloudflare blocks the page instantly and asks the user to verify that they are human. Is there a way to make the website think it is interacting with a real user, or are there options that can be set directly on the Selenium WebDriver?

I also tried adding a script that moves the mouse randomly around the window.
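A script of that kind could look roughly like the sketch below. It is only an illustration, not the exact code from the question, and nothing here suggests it gets past the Cloudflare check. The helper names `random_mouse_path` and `wander_mouse`, the step size, and the pause range are all made up for the example. It also assumes the pointer starts at the top-left corner of the viewport, which holds only if no earlier `ActionChains` moves were performed in the session.

```python
import random


def random_mouse_path(width, height, steps, max_step=80, rng=None):
    """Return a list of (dx, dy) offsets for a pointer that starts at (0, 0).

    Every intermediate position stays inside the width x height viewport,
    so move_by_offset should not be asked to leave the visible page.
    """
    rng = rng or random.Random()
    x, y = 0, 0
    offsets = []
    for _ in range(steps):
        # Pick a nearby target and clamp it to the viewport bounds.
        new_x = min(max(x + rng.randint(-max_step, max_step), 0), width - 1)
        new_y = min(max(y + rng.randint(-max_step, max_step), 0), height - 1)
        offsets.append((new_x - x, new_y - y))
        x, y = new_x, new_y
    return offsets


def wander_mouse(driver, steps=10):
    """Replay a random path in the browser (hypothetical helper)."""
    # Imported here so the path generator above works without Selenium installed.
    from selenium.webdriver.common.action_chains import ActionChains

    # Use the viewport size rather than the outer window size to stay in bounds.
    width, height = driver.execute_script(
        "return [window.innerWidth, window.innerHeight];"
    )
    actions = ActionChains(driver)
    for dx, dy in random_mouse_path(width, height, steps):
        # Short random pauses between moves (assumed values).
        actions.move_by_offset(dx, dy).pause(random.uniform(0.1, 0.4))
    actions.perform()
```

With the `driver` from the code above, this would be called as `wander_mouse(driver)` after `driver.get(...)`. Keeping the path generation separate from the Selenium calls makes it possible to check the bounds logic without opening a browser.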

EDIT: I want to use only libraries under the MIT or Apache license, so I cannot use undetected_chromedriver, even though it works well. Thanks.
