Channel: Active questions tagged selenium - Stack Overflow

Extracting data from embedded PDF files from web pages using python

I am using Selenium to drive Firefox to a particular web page that contains an embedded PDF file. Is there any way to extract data from the PDF on that page? Here is the code I am running:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select, WebDriverWait

driver = webdriver.Firefox()
driver.get(
    'http://www.kseb.in/index.php?option=com_wrapper&view=wrapper&Itemid=813&lang=en')

# The form is inside an iframe, so switch into it first.
iframe = driver.find_element(By.ID, "blockrandom")
driver.switch_to.frame(iframe)

# Select the office, enter the consumer number, and submit the form.
s = Select(driver.find_element(By.ID, 'office'))
s.select_by_value('5617')
driver.find_element(By.ID, 't_consumer-no_5').send_keys('11230')
driver.find_element(
    By.XPATH, '/html/body/form/table/tbody/tr[4]/td[3]/input').click()

# Re-enter the iframe after the form has been submitted.
driver.switch_to.default_content()
iframe = driver.find_element(By.ID, "blockrandom")
driver.switch_to.frame(iframe)

# Wait for the element with id "download" to become clickable, then click it.
WebDriverWait(driver, 10).until(EC.element_to_be_clickable(
    (By.ID, "download"))).click()



Ideally, I would like to obtain the value of a particular row in the table shown on the page. You can view the page by running the code. I am on a Linux machine (elementary OS Juno).
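
To make the goal concrete, here is a minimal sketch of pulling one row's value out of text extracted from a PDF. It assumes the third-party pypdf package is installed and that the PDF contains real text rather than a scanned image. The names extract_pdf_text and find_row_value, the sample text, and the "Bill Amount" label are hypothetical; the real layout of the table may differ, so the pattern would likely need adjusting.

```python
import re


def extract_pdf_text(pdf_path):
    """Return the text of every page in the PDF, joined by newlines."""
    # Lazy import (assumes pypdf is installed) so the parsing helper
    # below can be used and tested without it.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    # extract_text() may yield nothing for image-only pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def find_row_value(text, label):
    """Return the rest of the first line that starts with `label`, or None."""
    pattern = re.compile(
        r"^[ \t]*" + re.escape(label) + r"[ \t]*:?[ \t]*(.+?)[ \t]*$",
        re.MULTILINE | re.IGNORECASE,
    )
    match = pattern.search(text)
    return match.group(1) if match else None


# Hypothetical extracted text; the real table layout may differ.
sample = "Consumer No : 11230\nBill Amount   450.00\nDue Date : 01/01/2020"
print(find_row_value(sample, "Bill Amount"))  # prints 450.00
```

With a real file this would be called as find_row_value(extract_pdf_text("bill.pdf"), "Bill Amount"), where "bill.pdf" stands in for wherever the PDF ends up on disk.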

Alternatively, how would I go about automating the download (automatically clicking OK when the download pop-up appears) and then extracting data from the downloaded PDF?
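
For the download route, one possible approach is to configure Firefox so the pop-up never appears, rather than clicking OK on it. The sketch below assumes Selenium 4 with geckodriver available. The preference names are standard Firefox settings, but the helper names (firefox_download_prefs, make_driver, wait_for_pdf) are made up for illustration, and none of this has been verified against this particular site.

```python
import os
import time


def firefox_download_prefs(download_dir):
    """Firefox preferences that save PDFs to download_dir without a dialog."""
    return {
        "browser.download.folderList": 2,  # 2 = use the custom directory below
        "browser.download.dir": os.path.abspath(download_dir),
        "browser.helperApps.neverAsk.saveToDisk": "application/pdf",
        "pdfjs.disabled": True,  # turn off the built-in PDF viewer
    }


def make_driver(download_dir):
    """Start Firefox with the preferences above (assumes Selenium 4)."""
    # Lazy import: only needed when a browser is actually started.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    for name, value in firefox_download_prefs(download_dir).items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)


def wait_for_pdf(download_dir, timeout=30.0, poll=0.5):
    """Return the path of a finished PDF in download_dir, or None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        names = os.listdir(download_dir)
        # Firefox keeps a "<name>.part" file while a download is in progress.
        if not any(n.endswith(".part") for n in names):
            pdfs = sorted(n for n in names if n.lower().endswith(".pdf"))
            if pdfs:
                return os.path.join(download_dir, pdfs[0])
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)
```

The idea would be to replace webdriver.Firefox() with make_driver("downloads"), run the same navigation steps, and then call wait_for_pdf("downloads"). With the built-in viewer disabled, the PDF may be saved as soon as the frame loads it, in which case the element with id "download" might never appear and that final click could be dropped; this is an assumption that needs checking on the actual page.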

Thanks

N P

