Channel: Active questions tagged selenium - Stack Overflow

Web Scraping app using Scrapy / Selenium generates Error: "ModuleNotFoundError 'selenium'"

Good morning!

I have recently started learning Python and moved on to applying the little I know to create the monstrosity seen below.

In brief: I am attempting to scrape SEC Edgar (https://www.sec.gov/edgar/searchedgar/cik.htm) for CIK codes of companies which I want to study in more detail (for now just 1 company to see if it's the right approach).

To scrape the CIK code, I created a Scrapy spider, imported selenium and wrote three functions: the first inserts the company name into the input field, the second clicks the "submit" button, and the third scrapes the CIK code once the form has been submitted and returns the item.
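
For reference, the three steps described above could be sketched with Selenium alone, roughly as follows. This is only a sketch under assumptions, not the spider itself: `lookup_link_texts` is a hypothetical helper, the field name `company` and the `.search-button` selector are taken from the code in this post and have not been checked against the live page, and the locator strings `"name"` and `"css selector"` are the values behind Selenium's `By.NAME` and `By.CSS_SELECTOR`, used directly so that the helper accepts any driver-like object.

```python
def lookup_link_texts(driver, url, company_name):
    """Type a company name into the search form, submit it, return link texts.

    `driver` is expected to behave like a Selenium WebDriver
    (get, find_element, find_elements).
    """
    # Step 1: open the page and type the company name into the input field.
    driver.get(url)
    # "name" is the string value of selenium's By.NAME;
    # the field name 'company' is an assumption taken from the post.
    search_box = driver.find_element("name", "company")
    search_box.send_keys(company_name)

    # Step 2: click the submit button.
    # "css selector" is the string value of By.CSS_SELECTOR;
    # '.search-button' is an assumption taken from the post.
    driver.find_element("css selector", ".search-button").click()

    # Step 3: collect the text of every link on the resulting page.
    # Picking the CIK code out of these texts is left to the caller.
    return [link.text for link in driver.find_elements("css selector", "a")]


# Possible usage with a real browser (requires selenium and a Chrome driver):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   try:
#       texts = lookup_link_texts(
#           driver, "https://www.sec.gov/edgar/searchedgar/cik.htm", "INOGEN INC")
#   finally:
#       driver.quit()
```

Passing the driver in as an argument keeps the browser handling separate from the lookup steps, so the steps can be tried without Scrapy involved at all.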

Apart from adding the item to items.py, I haven't changed the middlewares or settings. For some reason, I am getting ModuleNotFoundError for 'selenium', although I have installed the packages and imported selenium & webdriver along with everything else.

I have tried adjusting the indentation and rephrasing the code, but with no improvement.

import selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
import scrapy
from ..items import Sec1Item
from scrapy import Selector


class SecSpSpider(scrapy.Spider):
    name = 'SEC_sp'
    start_urls = ['https://www.sec.gov/edgar/searchedgar/cik.htm']

    def parse(self, response):
        # Insert the company name into the search form.
        company_name = 'INOGEN INC'
        return scrapy.FormRequest.from_response(
            response,
            formdata={'company': company_name},
            # Pass the method itself; do not call it here.
            callback=self.parse_page,
        )

    def start_requests(self):
        # Open the search page in Chrome and click the search button.
        driver = webdriver.Chrome()
        driver.get(self.start_urls[0])  # get() takes one URL, not a list
        while True:
            next_url = driver.find_element(By.CSS_SELECTOR, '.search-button')
            try:
                self.parse(driver.page_source)
                next_url.click()
            except Exception:
                break
        driver.close()

    def parse_page(self, response):
        # Scrape the link texts and return them in the item.
        items = Sec1Item()
        CIK_code = response.css('a::text').extract()
        items["CIK Code: "] = CIK_code

        yield items

I cannot get past the import selenium error, so I am not sure how much of the rest of my spider needs adjusting.

Error message:

  File "/Users/user1/PycharmProjects/Scraper/SEC_1/SEC_1/spiders/SEC_sp.py", line 1, in <module>
    import selenium
ModuleNotFoundError: No module named 'selenium'
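
One common cause of this error, mentioned here only as an assumption since nothing in this post confirms it, is that the package was installed for a different Python interpreter than the one running the spider (for example the system Python rather than the PyCharm project interpreter). A minimal diagnostic sketch, where `module_status` is a hypothetical helper name, reports which interpreter is running and whether it can see a given top-level module:

```python
import importlib.util
import sys


def module_status(name):
    """Report whether the running interpreter can find a top-level module."""
    # find_spec returns None when a top-level module is not importable
    # from this interpreter's search path.
    spec = importlib.util.find_spec(name)
    return {
        "module": name,
        "interpreter": sys.executable,  # the Python binary running this code
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    print(module_status("selenium"))
```

Running this with the same interpreter that launches the spider shows whether selenium is visible to it. If `found` is `False`, installing with `python -m pip install selenium`, using that same interpreter as `python`, targets the matching environment.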

Thank you for any assistance and help!

