Comments (3)
Yep. I got it to work.
You need to use "/tmp" for downloads, and if you want headless Chrome to be able to download files you need to call the 'driverEnableHeadlessDownloads' function (see the link in the function for the source). I pinned Selenium to selenium==3.141 and use Python 3.7.9. The chromedriver and headless Chrome versions are also pinned (see the Dockerfile below for versions).
I use the following Python/Selenium functions to set up the driver:
import logging
import pathlib

from selenium import webdriver

logger = logging.getLogger(__name__)


def driverEnableHeadlessDownloads(driver: webdriver.Chrome, downloadDir: str) -> webdriver.Chrome:
    """
    Need this voodoo function to allow serverless chrome downloads.
    From: https://github.com/shawnbutton/PythonHeadlessChrome/blob/master/driver_builder.py

    Parameters
    ----------
    driver: selenium webdriver
    downloadDir: directory used for downloads

    Returns
    -------
    selenium webdriver
    """
    # Register the Chromium "send_command" endpoint so it can be called through driver.execute.
    driver.command_executor._commands["send_command"] = (
        "POST",
        "/session/$sessionId/chromium/send_command",
    )
    params = {
        "cmd": "Page.setDownloadBehavior",
        "params": {"behavior": "allow", "downloadPath": downloadDir},
    }
    driver.execute("send_command", params)
    return driver


def makeDefaultChromeOptions() -> webdriver.ChromeOptions:
    """
    Set up default chrome options

    Returns
    -------
    selenium ChromeOptions
    """
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    options.add_argument("--disable-gpu")
    options.add_argument("--window-size=1280x1696")
    options.add_argument("--disable-application-cache")
    options.add_argument("--disable-infobars")
    options.add_argument("--no-sandbox")
    options.add_argument("--hide-scrollbars")
    options.add_argument("--enable-logging")
    options.add_argument("--log-level=0")
    options.add_argument("--single-process")
    options.add_argument("--ignore-certificate-errors")
    options.add_argument("--disable-dev-shm-usage")
    options.add_argument("--homedir=/var/task")
    options.add_argument(
        "user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/61.0.3163.100 Safari/537.36"
    )
    return options


class Driver:
    def __init__(self, chromeDriver: str, prefs: dict, headlessChromeBinary: str):
        if not pathlib.Path(chromeDriver).exists():
            raise FileNotFoundError(f"Chrome driver not found at {chromeDriver}")
        self.chromeDriver = chromeDriver
        self.prefs = prefs
        self.options = makeDefaultChromeOptions()
        self.options.add_experimental_option("prefs", prefs)
        self.options.binary_location = headlessChromeBinary
        self.driver = None

    def __enter__(self):
        logger.info(
            f"Setting up headless chrome-based browser with preferences {self.prefs}"
        )
        self.driver = webdriver.Chrome(self.chromeDriver, options=self.options)
        driverEnableHeadlessDownloads(self.driver, "/tmp")
        return self.driver

    def __exit__(self, excType, excVal, excTb):
        logger.info("Shutting down driver")
        self.driver.close()


chromePrefs = {
    "download.default_directory": chromeDownloadPath,
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "safebrowsing.enabled": False,
}
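Since downloads land in "/tmp", the calling code usually has to wait until Chrome has finished writing the file before reading it. Below is a minimal sketch of one way to do that with the standard library only. The helper name `wait_for_download` and its parameters are hypothetical and not part of the code above. It assumes Chrome marks in-progress downloads with a ".crdownload" suffix and that hidden files can be ignored.

```python
import os
import time
from typing import Iterable


def wait_for_download(
    download_dir: str,
    known_files: Iterable[str] = (),
    timeout: float = 30.0,
    poll: float = 0.5,
) -> str:
    """Return the path of the newest finished file that appears in download_dir.

    known_files lists file names that were already present before the download
    started, so they are not mistaken for the new file.
    Raises TimeoutError if nothing finished shows up within `timeout` seconds.
    """
    known = set(known_files)
    deadline = time.monotonic() + timeout
    while True:
        candidates = []
        for name in os.listdir(download_dir):
            path = os.path.join(download_dir, name)
            # Assumption: partial downloads end in ".crdownload"; hidden files are skipped.
            if name in known or name.startswith(".") or name.endswith(".crdownload"):
                continue
            if os.path.isfile(path):
                candidates.append(path)
        if candidates:
            # Pick the most recently modified finished file.
            return max(candidates, key=os.path.getmtime)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No finished download in {download_dir} after {timeout}s")
        time.sleep(poll)
```

One way to use it would be to record `set(os.listdir("/tmp"))` before triggering the download in the browser, then call `wait_for_download("/tmp", known_files=...)` afterwards.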
This is the Dockerfile I use for deployment:
FROM public.ecr.aws/lambda/python:3.7

RUN mkdir -p /opt/bin && mkdir -p /opt/extensions && mkdir /var/task/.downloads \
    && curl -SL https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-55/stable-headless-chromium-amazonlinux-2017-03.zip \
       > /opt/bin/headless-chromium.zip \
    && unzip /opt/bin/headless-chromium.zip -d /opt/bin && rm /opt/bin/headless-chromium.zip \
    && curl -SL https://chromedriver.storage.googleapis.com/2.43/chromedriver_linux64.zip > /opt/bin/chromedriver.zip \
    && unzip /opt/bin/chromedriver.zip -d /opt/bin && rm /opt/bin/chromedriver.zip \
    && chmod 777 /opt/bin/chromedriver

# Add poetry files
ADD poetry.lock /var/task
ADD pyproject.toml /var/task

RUN pip install --upgrade pip \
    && pip install poetry --no-cache-dir \
    # Export requirements from poetry project
    && poetry export -f requirements.txt --output /var/task/requirements.txt \
    && pip uninstall -y poetry \
    && pip install -r requirements.txt --target /var/task --no-cache-dir \
    && pip install awslambdaric --target /var/task --no-cache-dir

ADD awsLambda /var/task

CMD [ "main.handler" ]
And this is my Pulumi function to create the Lambda:
from pulumi_aws import lambda_

# 'role' is an IAM role resource defined elsewhere in the Pulumi program.
lambdaFunction = lambda_.Function(
    resource_name="myLambda",
    image_uri="XXXXXXXXX.dkr.ecr.XXXXX.amazonaws.com"
    "/myLambda:latest-prod",
    memory_size=1024,
    role=role.arn,
    package_type="Image",
    description="This lambda does things.",
    timeout=500,
    tags={
        "environment": "prod",
        "creator": "pulumi",
        "project": "myLambda",
        "project-url": "https://github.com/XXXXXXX/XXXXXXX",
        "maintainer": "myname",
        "maintainer-email": "[email protected]",
    },
)
I test the Lambda function locally using the awslambdaric Python module. After building the Docker image, I call:
docker run -d -v ~/.aws-lambda-rie:/aws-lambda -p 9000:8080 \
    --entrypoint /aws-lambda/aws-lambda-rie \
    --env-file .temp/.env \
    docker.io/myorg/myimg \
    /var/lang/bin/python -m awslambdaric main.handler  # 'main' is my lambda file, 'handler' is the lambda name
Firing curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{}' in a terminal invokes the lambda. I usually just call docker ps, note down the container id, and then call docker logs on it.
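The same invocation can also be scripted from Python instead of curl, for example in a local smoke test. This is only a sketch using the standard library. The names `INVOKE_URL`, `build_invocation` and `invoke_local_lambda` are made up for illustration, and the URL assumes the container was started with the port mapping 9000:8080 shown above.

```python
import json
import urllib.request

# Assumption: the emulator is reachable on localhost:9000, as in the docker run command.
INVOKE_URL = "http://localhost:9000/2015-03-31/functions/function/invocations"


def build_invocation(event: dict, url: str = INVOKE_URL) -> urllib.request.Request:
    """Build the POST request that carries the event as a JSON body."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def invoke_local_lambda(event: dict, url: str = INVOKE_URL, timeout: float = 600.0) -> str:
    """Send the event to the locally running container and return the raw response text."""
    request = build_invocation(event, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `invoke_local_lambda({})` should be equivalent to the curl command with `-d '{}'`. The generous default timeout is a guess based on the 500 second Lambda timeout configured above.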
Hope this helps someone!
@tomardern
I got it working. Please see my repository: https://github.com/umihico/docker-selenium-lambda
chromePrefs = { "download.default_directory": chromeDownloadPath, "download.prompt_for_download": False, "download.directory_upgrade": True, "safebrowsing.enabled": False, }
May I know which download path I have to provide? Is it /var/task/.Download?