Hi, Now that <a href="https://aws.amazon.com/blogs/aws/new-for-aws-l

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

<div class="highlight highlight-source-python notranslate position-relative overflow-au

Running in AWS Lambda Containers about serverless-chrome HOT 3 OPEN

tomardern commented on May 25, 2024 3

Running in AWS Lambda Containers


Comments (3)

JasperHG90 commented on May 25, 2024 2

Yep. I got it to work.

You need to use "/tmp" for downloads, since that is the writable location in the Lambda filesystem. If you want headless Chrome to be able to download files, you also need to call the 'driverEnableHeadlessDownloads' function (see the link in the function for the source). I pinned Selenium to selenium==3.141 and use Python 3.7.9. The chromedriver and headless Chrome versions are also pinned (see the Dockerfile below for versions).

I use the following python/selenium functions to set up the driver:

import logging
import pathlib

from selenium import webdriver

logger = logging.getLogger(__name__)


def driverEnableHeadlessDownloads(driver: webdriver.Chrome, downloadDir: str) -> webdriver.Chrome:
    """
    Need this voodoo function to allow serverless chrome downloads.
    From: https://github.com/shawnbutton/PythonHeadlessChrome/blob/master/driver_builder.py
    Parameters
    ----------
    driver: selenium webdriver
    downloadDir: directory used for downloads
    Returns
    -------
    selenium webdriver
    """
    # Register the Chromium-specific "send_command" endpoint with the driver.
    driver.command_executor._commands["send_command"] = (
        "POST",
        "/session/$sessionId/chromium/send_command",
    )
    # Ask Chrome (via the DevTools protocol) to allow downloads into downloadDir.
    params = {
        "cmd": "Page.setDownloadBehavior",
        "params": {"behavior": "allow", "downloadPath": downloadDir},
    }
    driver.execute("send_command", params)
    return driver


def makeDefaultChromeOptions() -> webdriver.ChromeOptions:
    """
    Set up default chrome options
    Returns
    -------
    selenium ChromeOptions
    """
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    options.add_argument("--disable-gpu")
    options.add_argument("--window-size=1280x1696")
    options.add_argument("--disable-application-cache")
    options.add_argument("--disable-infobars")
    options.add_argument("--no-sandbox")
    options.add_argument("--hide-scrollbars")
    options.add_argument("--enable-logging")
    options.add_argument("--log-level=0")
    options.add_argument("--single-process")
    options.add_argument("--ignore-certificate-errors")
    options.add_argument("--disable-dev-shm-usage")
    options.add_argument("--homedir=/var/task")
    options.add_argument(
        "user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/61.0.3163.100 Safari/537.36"
    )
    return options
    
class Driver:
    def __init__(self, chromeDriver: str, prefs: dict, headlessChromeBinary: str):
        if not pathlib.Path(chromeDriver).exists():
            raise FileNotFoundError(f"Chrome driver not found at {chromeDriver}")
        self.chromeDriver = chromeDriver
        self.prefs = prefs
        self.options = makeDefaultChromeOptions()
        self.options.add_experimental_option("prefs", prefs)
        self.options.binary_location = headlessChromeBinary
        self.driver = None

    def __enter__(self):
        logger.info(
            f"Setting up headless chrome-based browser with preferences {self.prefs}"
        )
        self.driver = webdriver.Chrome(self.chromeDriver, options=self.options)
        driverEnableHeadlessDownloads(self.driver, "/tmp")
        return self.driver

    def __exit__(self, excType, excVal, excTb):
        logger.info("Shutting down driver")
        # quit() ends the session and stops the chromedriver process; close() only closes the window.
        self.driver.quit()
        
# Download directory; this post uses "/tmp" for downloads.
chromeDownloadPath = "/tmp"
chromePrefs = {
    "download.default_directory": chromeDownloadPath,
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "safebrowsing.enabled": False,
}
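
Once downloads land in "/tmp", the calling code still has to know when a file is complete. A minimal sketch of one way to wait for that, assuming Chrome keeps in-progress downloads under a ".crdownload" suffix until they finish. The helper name waitForDownload and its parameters are hypothetical and not part of the original setup:

```python
import pathlib
import time


def waitForDownload(downloadDir, pattern, timeout=30.0, poll=0.5):
    """
    Poll downloadDir until at least one file matches pattern and no
    partial Chrome downloads (*.crdownload) remain.
    Returns the sorted list of matching paths, or raises TimeoutError.
    """
    directory = pathlib.Path(downloadDir)
    deadline = time.monotonic() + timeout
    while True:
        # Finished files matching the caller's glob, e.g. "*.csv".
        done = sorted(p for p in directory.glob(pattern) if p.is_file())
        # Files Chrome is still writing.
        partial = list(directory.glob("*.crdownload"))
        if done and not partial:
            return done
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"No finished download matching {pattern!r} in {directory}"
            )
        time.sleep(poll)
```

Inside the `with Driver(...) as driver:` block this could be called as `waitForDownload("/tmp", "*.csv")` after triggering the download. A specific glob pattern matters here because "/tmp" usually holds unrelated files as well.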

This is the Dockerfile I use for deployment:

FROM public.ecr.aws/lambda/python:3.7

RUN mkdir -p /opt/bin && mkdir -p /opt/extensions && mkdir /var/task/.downloads \
        && curl -SL https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-55/stable-headless-chromium-amazonlinux-2017-03.zip \
         > /opt/bin/headless-chromium.zip \
        && unzip /opt/bin/headless-chromium.zip -d /opt/bin && rm /opt/bin/headless-chromium.zip \
        && curl -SL https://chromedriver.storage.googleapis.com/2.43/chromedriver_linux64.zip > /opt/bin/chromedriver.zip \
        && unzip /opt/bin/chromedriver.zip -d /opt/bin && rm /opt/bin/chromedriver.zip \
        && chmod 777 /opt/bin/chromedriver

# Add poetry files
ADD poetry.lock /var/task
ADD pyproject.toml /var/task

RUN pip install --upgrade pip \
        && pip install poetry --no-cache-dir \
        # Export requirements from poetry project
        && poetry export -f requirements.txt --output /var/task/requirements.txt \
        && pip uninstall -y poetry \
        && pip install -r requirements.txt --target /var/task --no-cache-dir \
        && pip install awslambdaric --target /var/task --no-cache-dir

ADD awsLambda /var/task

CMD [ "main.handler" ]
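
Before wiring Selenium into the handler, it can help to confirm that the image really contains the binaries the Dockerfile unpacks into /opt/bin. A minimal smoke-test sketch, assuming the file is saved as main.py so that it matches the CMD above, and assuming the headless Chromium zip extracts a binary named headless-chromium (check the archive contents to be sure). The helper name missingBinaries is hypothetical:

```python
import pathlib

# Paths taken from the Dockerfile; the Chromium binary name is an assumption.
CHROMEDRIVER = "/opt/bin/chromedriver"
HEADLESS_CHROMIUM = "/opt/bin/headless-chromium"


def missingBinaries(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [p for p in paths if not pathlib.Path(p).is_file()]


def handler(event, context):
    """Smoke-test handler: report whether the browser binaries are present."""
    missing = missingBinaries([CHROMEDRIVER, HEADLESS_CHROMIUM])
    if missing:
        return {"statusCode": 500, "body": f"Missing binaries: {missing}"}
    return {"statusCode": 200, "body": "ok"}
```

Once this returns a 200 status inside the container, the body of the handler can be swapped for the real Selenium logic.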

And this is my Pulumi function to create the Lambda:

from pulumi_aws import lambda_

# `role` is an IAM role resource defined elsewhere in the Pulumi program.
lambdaFunction = lambda_.Function(
    resource_name="myLambda",
    image_uri="XXXXXXXXX.dkr.ecr.XXXXX.amazonaws.com/myLambda:latest-prod",
    memory_size=1024,
    role=role.arn,
    package_type="Image",
    description="This lambda does things.",
    timeout=500,
    tags={
        "environment": "prod",
        "creator": "pulumi",
        "project": "myLambda",
        "project-url": "https://github.com/XXXXXXX/XXXXXXX",
        "maintainer": "myname",
        "maintainer-email": "[email protected]",
    },
)

I test the Lambda function locally using the awslambdaric Python module together with the AWS Lambda Runtime Interface Emulator (aws-lambda-rie). After building the image from the Dockerfile, I call:

docker run -d -v ~/.aws-lambda-rie:/aws-lambda -p 9000:8080 \
  --entrypoint /aws-lambda/aws-lambda-rie \
  --env-file .temp/.env \
   docker.io/myorg/myimg \
   /var/lang/bin/python -m awslambdaric main.handler ## 'main' is my lambda file, 'handler' is the lambda name

Running curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{}' in a terminal invokes the Lambda. To see the output, I usually just call docker ps, note down the container ID, and then call docker logs on it.
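
The same invocation can be scripted from Python, which is handy for repeated local tests. A minimal sketch using only the standard library, assuming the container is listening on localhost:9000 as in the docker run command above. The function names buildInvocation and invoke are hypothetical:

```python
import json
import urllib.request

# Local emulator endpoint, matching the curl command in this post.
RIE_URL = "http://localhost:9000/2015-03-31/functions/function/invocations"


def buildInvocation(payload, url=RIE_URL):
    """Build the POST request that carries the JSON event payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def invoke(payload, url=RIE_URL, timeout=600):
    """Send the event to the running container and decode the JSON reply."""
    request = buildInvocation(payload, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `invoke({})` mirrors the curl command with `-d '{}'`. The generous default timeout is a guess meant to cover slow browser sessions.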

Hope this helps someone!


umihico commented on May 25, 2024 2

@tomardern
I got it working as well. Please see my repository: https://github.com/umihico/docker-selenium-lambda


kajrolkar commented on May 25, 2024

chromePrefs = {
            "download.default_directory": chromeDownloadPath,
            "download.prompt_for_download": False,
            "download.directory_upgrade": True,
            "safebrowsing.enabled": False,
        }

May I know which download path I have to provide?
Is it /var/task/.Download?
