I am trying to scrape data from a website using PhantomJS in Python, since I need to submit a form and then click a button to reach the page I want to scrape, but I am only getting empty pages. When I print the page source I get
<html><head></head><body></body></html>
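The way I check it boils down to printing driver.page_source; a stripped-down sketch of that check (same PhantomJS options as in the full script below):

from selenium import webdriver

driver = webdriver.PhantomJS(service_args=['--ignore-ssl-errors=true', '--ssl-protocol=any'])
driver.get('http://59.180.234.21:85/index.aspx')
print(driver.page_source)  # prints the empty document shown above
driver.quit()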
Here is my code; I use explicit waits and also accept any SSL protocol.
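It relies on the usual Selenium and BeautifulSoup imports:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait, Select
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from bs4 import BeautifulSoup

The main part: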
driver = webdriver.PhantomJS(service_args=['--ignore-ssl-errors=true', '--ssl-protocol=any'])
driver.set_window_size(1280, 1024)
driver.get('http://59.180.234.21:85/index.aspx')

# form values used below
ps_no = '02'
year = '2011'

try:
    delay = 5  # seconds to wait
    # wait for the district dropdown, then pick the district
    myElem = WebDriverWait(driver, delay).until(EC.presence_of_element_located((By.ID, 'ddlDistrict')))
    select = Select(driver.find_element_by_id('ddlDistrict'))
    select.select_by_value("165")
    try:
        # wait for the ddlPS dropdown, then fill in the rest of the form and search
        myElem = WebDriverWait(driver, delay).until(EC.presence_of_element_located((By.ID, 'ddlPS')))
        select = Select(driver.find_element_by_id('ddlPS'))
        select.select_by_value(ps_no)
        select = Select(driver.find_element_by_id('ddlYear'))
        select.select_by_value(year)
        element = driver.find_element_by_id("txtRegNo")
        element.send_keys("100")
        driver.find_element_by_id("btnSearch").click()
        try:
            # wait for the result row, then open the record and parse the page
            myElem = WebDriverWait(driver, delay).until(EC.presence_of_element_located((By.ID, 'DgRegist_ctl03_imgDelete')))
            print("Record found!", ps_no, year)
            #driver.save_screenshot('before.png')  # for bug testing
            driver.find_element_by_id("DgRegist_ctl03_imgDelete").click()
            #driver.save_screenshot('after.png')  # for bug testing
            soup = BeautifulSoup(driver.page_source, "html.parser")
            csv_writer_data(soup)  # small helper, sketched below
        except TimeoutException:
            print("No records found!")
    except TimeoutException:
        print("ddlPS not found")
except TimeoutException:
    print("ddlDistrict not found")
I'm not sure why this is not working; any help would be appreciated.