Channel: Active questions tagged selenium - Stack Overflow

Why does Selenium-Wire AWS lambda function driver instantiation fail with Status code 1?

I'm trying to get Selenium-Wire to work in an AWS Lambda. I've seen very few Stack Overflow entries about it, but it kinda seems some people were successful. My Lambda is stateless and doesn't even need to use any other AWS feature (such as S3). It'd scrape a certain thing and I'd capture a specific JSON response of a specific AJAX call on a page.

Here is my Dockerfile:

FROM public.ecr.aws/lambda/python:3.9
# Should I go with python:3.8 instead?

# Install the function's dependencies using file requirements.txt
# from your project folder.
RUN yum makecache
# https://stackoverflow.com/questions/73056540/no-module-named-amazon-linux-extras-when-running-amazon-linux-extras-install-epe
RUN yum install -y amazon-linux-extras
# https://stackoverflow.com/questions/72077341/how-do-you-install-chrome-on-amazon-linux-2
RUN PYTHON=python2 amazon-linux-extras install epel -y
# https://stackoverflow.com/questions/72850004/no-package-zbar-available-in-lambda-layer
RUN yum makecache
RUN yum install -y chromium
ENV CHROMIUM_PATH=/usr/bin/chromium-browser
# or RUN yum install -y google-chrome-stable
# or https://intoli.com/blog/installing-google-chrome-on-centos/
# curl https://intoli.com/install-google-chrome.sh | bash

# https://devopsqa.wordpress.com/2018/03/08/install-google-chrome-and-chromedriver-in-amazon-linux-machine/
# https://www.usessionbuddy.com/post/How-To-Install-Selenium-Chrome-On-Centos-7/
RUN yum install -y chromedriver

RUN pip install --upgrade pip

COPY requirements.txt .
RUN  pip3 install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"

# Copy function code
COPY app.py ${LAMBDA_TASK_ROOT}

# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "app.handler" ]

My requirements.txt, pretty minimal:

selenium-wire==5.1.0

And my Lambda function:

from seleniumwire import webdriver
from selenium.webdriver.chrome.service import Service


def handler(event, context):
  # https://gist.github.com/rengler33/f8b9d3f26a518c08a414f6f86109863c
  # https://github.com/wkeeling/selenium-wire/issues/131
  chrome_options = webdriver.ChromeOptions()
  chrome_option_list = {
    "disable-extensions",
    "disable-gpu",
    "no-sandbox",
    "headless", # for Jenkins
    "disable-dev-shm-usage", # Jenkins
    "window-size=800x600", # Jenkins
    "window-size=800,600",
    "disable-setuid-sandbox",
    "allow-insecure-localhost",
    "no-cache",
    "user-data-dir=/tmp/user-data",
    "hide-scrollbars",
    "enable-logging",
    "log-level=0",
    "single-process",
    "data-path=/tmp/data-path",
    "ignore-certificate-errors",
    "homedir=/tmp",
    "disk-cache-dir=/tmp/cache-dir",
    "start-maximized",
    "disable-software-rasterizer",
    "ignore-certificate-errors-spki-list",
    "ignore-ssl-errors",
  }

  for chrome_option in chrome_option_list:
    chrome_options.add_argument(f"--{chrome_option}")

  selenium_options = {
    "request_storage_base_dir": "/tmp", # Use /tmp to store captured data
    "exclude_hosts": ""
  }

  ser = Service("/usr/bin/chromedriver")
  ser.service_args=["--verbose", "--log-path=test.log"]
  driver = webdriver.Chrome(service=ser, options=chrome_options, seleniumwire_options=selenium_options)

  # The meat
  # ...

  return result
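One thing worth checking about the handler above: the chromedriver log path `test.log` is relative, while every other path it passes lives under `/tmp`. In Lambda only `/tmp` is writable, so a relative path resolves against the working directory, which is presumably the read-only task root. Whether this is what makes chromedriver exit is an untested guess. The sketch below only reports where a path would land and whether its directory accepts writes; the helper names `can_write_in` and `describe_log_path` are made up for illustration.

```python
import os
import tempfile


def can_write_in(directory):
    """Return True if a file can actually be created inside `directory`."""
    try:
        os.makedirs(directory, exist_ok=True)
        # Creating a real temporary file is more reliable than os.access().
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        return False


def describe_log_path(log_path):
    """Resolve a (possibly relative) log path and report writability."""
    absolute = os.path.abspath(log_path)
    return {
        "absolute": absolute,
        "writable": can_write_in(os.path.dirname(absolute)),
    }


if __name__ == "__main__":
    # Hypothetical check to run inside the handler before starting the driver.
    for candidate in ("test.log", "/tmp/chromedriver.log"):
        print(candidate, describe_log_path(candidate))
```

Calling `describe_log_path("test.log")` at the top of the handler and logging the result would show, both locally and in Lambda, whether the two environments differ on this point.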

I built an image from the Dockerfile and uploaded it to AWS ECR. The Docker image passes the classic "it works on my machine (TM)" test: it scrapes fine in a Docker container on my laptop. However, it returns an error when I try to run it as a Lambda (based on my own image):

START RequestId: 3f767106-e6f5-4c5c-8930-e77b7314eb3b Version: $LATEST
[ERROR] WebDriverException: Message: Service /usr/bin/chromedriver unexpectedly exited. Status code was: 1
Traceback (most recent call last):
  File "/var/task/app.py", line 43, in handler
    driver = webdriver.Chrome(service=ser, options=chrome_options, seleniumwire_options=selenium_options)
  File "/var/task/seleniumwire/webdriver.py", line 218, in __init__
    super().__init__(*args, **kwargs)
  File "/var/task/selenium/webdriver/chrome/webdriver.py", line 80, in __init__
    super().__init__(
  File "/var/task/selenium/webdriver/chromium/webdriver.py", line 101, in __init__
    self.service.start()
  File "/var/task/selenium/webdriver/common/service.py", line 104, in start
    self.assert_process_still_running()
  File "/var/task/selenium/webdriver/common/service.py", line 117, in assert_process_still_running
    raise WebDriverException(f"Service {self.path} unexpectedly exited. Status code was: {return_code}")
END RequestId: 3f767106-e6f5-4c5c-8930-e77b7314eb3b
REPORT RequestId: 3f767106-e6f5-4c5c-8930-e77b7314eb3b  Duration: 758.10 ms  Billed Duration: 1361 ms  Memory Size: 128 MB  Max Memory Used: 91 MB  Init Duration: 602.74 ms

I also experimented with other Chrome switches, such as those mentioned in "selenium.common.exceptions.webdriverexception: message: 'chromedriver.exe' unexpectedly exited.status code was: 1", with no luck. I always get status code 1, but I couldn't find any documentation on what exactly that code means. I assume it's some very blatant error.
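Since Selenium only surfaces the exit code, one way to get more detail would be to run the chromedriver binary directly from the handler and log whatever it prints. This is a minimal sketch, not a fix: `probe_binary` is a hypothetical helper, and it assumes the binary is at `/usr/bin/chromedriver` as in the code above and that `--version` is enough to trigger the same startup failure, which may not be the case.

```python
import subprocess


def probe_binary(path, args=("--version",), timeout=10):
    """Run a binary directly and return its exit code, stdout and stderr."""
    try:
        proc = subprocess.run(
            [path, *args],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except FileNotFoundError:
        return {"returncode": None, "stdout": "", "stderr": f"not found: {path}"}
    except PermissionError:
        return {"returncode": None, "stdout": "", "stderr": f"not executable: {path}"}
    except subprocess.TimeoutExpired:
        return {"returncode": None, "stdout": "", "stderr": f"timed out after {timeout}s"}
    return {
        "returncode": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


if __name__ == "__main__":
    # Path taken from the handler above; the output ends up in CloudWatch
    # when this runs inside the Lambda.
    print(probe_binary("/usr/bin/chromedriver"))
```

Passing the same arguments the handler sets in `service_args` instead of `--version` would be a closer reproduction of what Selenium actually launches.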

Does anyone have a working image / Dockerfile + skeleton function I can try?

