Compatible for ChromeDriver
This program seeks to accomplish the following:
- Automatically sign-in to a website;
- Visit a link / link(s) from a text file;
- To scrape data from each page visited this way; and
- Output all scraped data by print().
Kindly skip to Part 2 for the problem area, as part 1 is tested to work for step 1 already. :)
The code:
Part 1
from selenium import webdriver
import time
from selenium.webdriver.common.keys import Keys
driver = webdriver.Chrome()
driver.get("https://www.website1.com/home")
main_page = driver.current_window_handle
time.sleep(5)
##cookies
driver.find_element_by_xpath('//*[@id="CybotCookiebotDialogBodyButtonAccept"]').click()
time.sleep(5)
driver.find_element_by_xpath('//*[@id ="google-login"]/span').click()
for handle in driver.window_handles:
if handle != main_page:
login_page = handle
driver.switch_to.window(login_page)
with open('logindetails.txt', 'r') as file:
for details in file:
email, password = details.split(':')
driver.find_element_by_xpath('//*[@id ="identifierId"]').send_keys(email)
driver.find_element_by_xpath('//span[text()="Next"]').click()
time.sleep(5)
driver.find_element_by_xpath('//input[@type="password"]').send_keys(password)
driver.find_element_by_xpath('//span[text()="Next"]').click()
driver.switch_to.window(main_page)
time.sleep(5)
Part 2
In alllinks.txt, we have the following websites:
• website1.com/otherpage/page1
• website1.com/otherpage/page2
• website1.com/otherpage/page3
with open('alllinks.txt', 'r') as directory:
for items in directory:
driver.get(items)
time.sleep(2)
elements = driver.find_elements_by_class_name('data-xl')
for element in elements:
print ([element])
time.sleep(5)
driver.quit()
The outcome:
[Done] exited with code=0 in 53.463 seconds
... and zero output
The problem:
Location of the element has been verified, am suspecting that the windows have something to do with why the driver is not scraping.
All inputs are welcome and greatly appreciated. :)