kikobeats / aws-lambda-chrome

Chrome binary compatible with AWS Lambda.

License: MIT License

JavaScript 100.00%

aws-lambda-chrome's Introduction

DEPRECATED: use https://github.com/alixaxel/chrome-aws-lambda instead.

@browserless/aws-lambda-chrome

Chrome (v72.0.3582.0) binary compatible with AWS Lambda.

Install

$ npm install @browserless/aws-lambda-chrome --save

This package contains a compressed Chrome binary compatible with AWS Lambda.

Usage

Note: Make sure to pass the --disable-dev-shm-usage flag to Chrome.

It is designed to be used with puppeteer, especially through browserless.

For example, create a get-browserless.js file with the following content:

const browserless = require('browserless')

// Returns a function that resolves the path of the decompressed Chrome binary
const getChromePath = require('@browserless/aws-lambda-chrome')({
  path: '/tmp'
})

// LAMBDA_TASK_ROOT is set by the AWS Lambda runtime
const isLambda = !!process.env.LAMBDA_TASK_ROOT
// Outside Lambda, leave executablePath undefined so the default Chrome is used
const getExecutablePath = async () => (isLambda ? getChromePath() : undefined)

module.exports = async () =>
  browserless({
    ignoreHTTPSErrors: true,
    args: [
      '--disable-gpu',
      '--single-process',
      '--no-zygote',
      '--no-sandbox',
      '--disable-dev-shm-usage',
      '--hide-scrollbars'
    ],
    executablePath: await getExecutablePath()
  })

Then call it from your code, like this:

;(async () => {
  const getBrowserless = require('./get-browserless')
  const { screenshot } = await getBrowserless() // serverless!
})()

The first call decompresses the binary. Subsequent calls reuse the decompressed version.
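The decompress-once behavior described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the helper name `ensureExtracted` is hypothetical, gzip is an assumed compression format, and a small stand-in file replaces the real Chrome binary. It needs a Node.js version that supports `fs.mkdirSync` with `recursive`.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')
const zlib = require('zlib')

// Hypothetical helper: extract a gzip `archive` into `dir` once, then reuse it
function ensureExtracted (archive, dir, name) {
  const target = path.join(dir, name)
  // Reuse the decompressed file when a previous call already produced it
  if (fs.existsSync(target)) return { target, extracted: false }
  fs.mkdirSync(dir, { recursive: true })
  const data = zlib.gunzipSync(fs.readFileSync(archive))
  // Request the executable bit so the file could be launched
  fs.writeFileSync(target, data, { mode: 0o755 })
  return { target, extracted: true }
}

// Demo with a stand-in archive instead of a real Chrome binary
const workDir = fs.mkdtempSync(path.join(os.tmpdir(), 'chrome-demo-'))
const archivePath = path.join(workDir, 'headless_shell.gz')
fs.writeFileSync(archivePath, zlib.gzipSync(Buffer.from('fake binary')))

const outDir = path.join(workDir, 'out')
const first = ensureExtracted(archivePath, outDir, 'headless_shell')
const second = ensureExtracted(archivePath, outDir, 'headless_shell')
console.log(first.extracted, second.extracted) // prints: true false
```

Only the first call pays the decompression cost; the second finds the file already in place and returns immediately.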

API

constructor([options])

options

path

Type: string
Default: 'dist'

Directory where the binary is extracted.
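As a small sketch of how the `path` option might be chosen per environment: the helper `resolveExtractPath` below is hypothetical and not part of the package. It uses '/tmp' on Lambda, as in the usage example, and the documented default 'dist' elsewhere.

```javascript
// Hypothetical helper: pick the extraction directory for the `path` option.
// '/tmp' on Lambda follows the usage example; 'dist' is the documented default.
function resolveExtractPath (env = process.env) {
  const isLambda = Boolean(env.LAMBDA_TASK_ROOT)
  return isLambda ? '/tmp' : 'dist'
}

console.log(resolveExtractPath({ LAMBDA_TASK_ROOT: '/var/task' })) // prints: /tmp
console.log(resolveExtractPath({})) // prints: dist

// Intended use (not executed here, since it needs the package installed):
// const getChromePath = require('@browserless/aws-lambda-chrome')({
//   path: resolveExtractPath()
// })
```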

FAQ

Q: How to use a newer version of Node on AWS Lambda?

At the time of writing, AWS Lambda supports Node.js v6.10 and v8.10.

We recommend using the latest Node.js version available for your Lambda function.

If you need to provide your own Node.js executable, here is a bash script that does that:

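#!/bin/bash
set -euo pipefail

# Resolve the latest Node.js 8.x version and append the platform suffix
NODE_BIN="$(curl -sL https://semver.io/node/resolve/8)-linux-x64"

echo "Downloading $NODE_BIN binary file"

# Download and unpack the release, keeping only the node executable
curl -sL "https://nodejs.org/dist/latest-v8.x/node-v$NODE_BIN.tar.gz" | tar -xz
mkdir -p bin
mv "node-v$NODE_BIN/bin/node" bin/node
rm -rf "node-v$NODE_BIN"

echo "Added \`bin/node\` as node-v$NODE_BIN"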

You can run the script as a postinstall hook and then use the resulting binary to run your microservice:

{
  "scripts": {
    "postinstall": "./scripts/download_node.sh",
    "start": "NODE_ENV=production ./bin/node ./node_modules/micro/bin/micro.js --host localhost --port $PORT index.js"
  }
}
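To confirm which Node.js version a downloaded binary provides, a quick check along these lines may help. It is only a sketch: `getNodeVersion` is a hypothetical helper, and `process.execPath` stands in for `./bin/node` so the example runs without the download script.

```javascript
const { execFileSync } = require('child_process')

// Hypothetical helper: ask a Node.js executable for its version string
function getNodeVersion (nodePath) {
  return execFileSync(nodePath, ['--version'], { encoding: 'utf8' }).trim()
}

// `process.execPath` stands in for './bin/node' so the sketch runs anywhere
console.log(getNodeVersion(process.execPath))
```

After the postinstall step has run, passing './bin/node' instead should report the v8.x release fetched by the script.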

Q: How can I create my own Headless Chrome binary?

The idea behind the project is to ship a production-ready build of the latest Chrome version available.

To do that, you need to compile Chromium on an AWS Lambda compatible machine, normally an EC2 instance.

We recommend the following projects for instructions on how to do that:

browserless – Chrome Headless API made easy.

License

aws-lambda-chrome © Kiko Beats, Released under the MIT License.
Authored and maintained by Kiko Beats with help from contributors.

kikobeats.com · GitHub Kiko Beats · Twitter @kikobeats


aws-lambda-chrome's Issues

Error: Protocol error (Target.createTarget): Target closed.

When running on AWS Lambda, I get this error.
My index.js is the standard AWS Lambda entry point: it exports the handler function and requires my other file.
Here is the code from that file:

// this one is same as example `get-browserless.js`
const getBrowserless = require('./get_browserless')

async function startTest() {
	try {
		const { text } = await getBrowserless();
		const url = 'https://google.com';
		const urlText = await text(url)
		console.log(urlText)
		return Promise.resolve();
	} catch (error) {
		console.log(error)
		return Promise.reject();
	}
}

const test = { startTest };
module.exports = test;

Provide a complete example of how to deploy a simple script to AWS Lambda

I've been using https://github.com/serverless/serverless and https://github.com/apex/up to deploy things to AWS Lambda and abstract away all the ugly manual config parts.

If you think it's useful, I could provide code examples of how to do this for a simple browserless script. Other things that are probably useful to demonstrate in a minimal way so users don't have to solve the exact same problems over and over again:

- You need to babel-ify async code so it can run in Node v6.10: https://docs.aws.amazon.com/lambda/latest/dg/current-supported-versions.html
- How to avoid running into the "package bigger than 250 meg" problem: https://docs.aws.amazon.com/lambda/latest/dg/limits.html

This is something that https://github.com/sambaiz/puppeteer-lambda-starter-kit happens to implicitly document quite well

Action required: Greenkeeper could not be activated 🚨

🚨 You need to enable Continuous Integration on all branches of this repository. 🚨

To enable Greenkeeper, you need to make sure that a commit status is reported on all branches. This is required by Greenkeeper because it uses your CI build statuses to figure out when to notify you about breaking changes.

Since we didn’t receive a CI status on the greenkeeper/initial branch, it’s possible that you don’t have CI set up yet. We recommend using Travis CI, but Greenkeeper will work with every other CI service as well.

If you have already set up a CI for this repository, you might need to check how it’s configured. Make sure it is set to run on all new branches. If you don’t want it to run on absolutely every branch, you can whitelist branches starting with greenkeeper/.

Once you have installed and configured CI on this repository correctly, you’ll need to re-trigger Greenkeeper’s initial pull request. To do this, please delete the greenkeeper/initial branch in this repository, and then remove and re-add this repository to the Greenkeeper App’s whitelist on GitHub. You'll find this list on your repo or organization’s settings page, under Installed GitHub Apps.

Few Questions

I'm debating whether to host a snapshot service on Lambda or Now.sh, and I have some questions about using this repository.

How many concurrent requests can this handle?
What happens when Chrome crashes?
What is the most solid configuration/usage you've found with the use of Chrome & Lambda?
Sites like "https://browserless.io" claim that it's best to create a browser each time you need a fresh instance to prevent issues, however doing that with this library takes roughly 7-11 seconds for a 540x720 screenshot... how do we speed this up?

I want to avoid crashes at all costs and have something that scales well.

WebGL support

sambaiz/puppeteer-lambda-starter-kit#8

Unzipped size must be smaller than 262144000 bytes

I'm trying to get this package working on Lambda.

Lambda now supports Node 8.10, which has made this easier for me.

https://aws.amazon.com/blogs/compute/node-js-8-10-runtime-now-available-in-aws-lambda/

So I set the runtime to Node 8.10, both on my machine and on Lambda.

Following your instructions to the letter, I think.

npm install puppeteer browserless --save
npm install @browserless/aws-lambda-chrome --save

I created the two files, then zipped up my node_modules directory, the JS files and package.json.

On upload, I get the error message above, I guess partly because I'm trying to upload two versions of Chromium: yours and Puppeteer's.

What am I doing wrong?
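A rough way to test that guess before uploading is to measure the unzipped size of the folder being packaged and compare it with the 262144000-byte limit from the error message. The sketch below is an illustration only: `directorySize` is a hypothetical helper, symlinks are skipped, and a throwaway directory stands in for the real deployment folder.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Limit quoted in the error message above
const MAX_UNZIPPED_BYTES = 262144000

// Hypothetical helper: sum the size of every regular file under `dir`
function directorySize (dir) {
  let total = 0
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name)
    if (entry.isDirectory()) total += directorySize(full)
    else if (entry.isFile()) total += fs.statSync(full).size
  }
  return total
}

// Demo on a throwaway directory; point it at the folder you plan to zip instead
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'size-demo-'))
fs.mkdirSync(path.join(demoDir, 'nested'))
fs.writeFileSync(path.join(demoDir, 'a.bin'), Buffer.alloc(1000))
fs.writeFileSync(path.join(demoDir, 'nested', 'b.bin'), Buffer.alloc(24))

const size = directorySize(demoDir)
console.log(size, size < MAX_UNZIPPED_BYTES) // prints: 1024 true
```

Running it separately on individual subfolders of node_modules would show which dependency contributes most to the total.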

Cannot execute binary file

Hey! Thanks for this great plugin. I'm facing an issue where puppeteer fails to launch the headless_shell binary.

Here's my code snippet (using the serverless framework):

const puppeteer = require('puppeteer')
const getChromePath = require('@browserless/aws-lambda-chrome')({
  path: '/tmp'
})

module.exports.handler = async (event, context) => {
  const browser = await puppeteer.launch({
    executablePath: await getChromePath()
  })

  const page = await browser.newPage()
  await page.goto('https://example.com')
  console.log(await page.title())

  await browser.close()
}

The error I get is:

  Error --------------------------------------------------

  Failed to launch chrome!
/tmp/headless_shell: /tmp/headless_shell: cannot execute binary file


TROUBLESHOOTING: https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md

Any suggestions?

Use brotli compression

https://github.com/MayhemYDG/iltorb

Chromium revision is not downloaded

This is what I get when executing the Lambda. However, it works on my local machine using the lambda-local package.

{
  "errorMessage": "Chromium revision is not downloaded. Run \"npm install\" or \"yarn install\"",
  "errorType": "Error",
  "stackTrace": [
    "assert (/var/task/node_modules/puppeteer/lib/helper.js:282:11)",
    "Function.launch (/var/task/node_modules/puppeteer/lib/Launcher.js:106:7)",
    "<anonymous>"
  ]
}

Build script is broken

I am trying to build my own binary. I launched the recommended instance type, downloaded the build script then executed it. The script errors out:

Error: client not configured; see 'gclient config'

I was able to resolve that issue using advice found here https://bugs.chromium.org/p/v8/issues/detail?id=3787.

gclient config https://chromium.googlesource.com/v8/v8

However, running the build script again led to:

gn.py: Could not find checkout in any parent of the current path.
This must be run inside a checkout.

which is caused by gn gen out/Headless.

Have you been able to run the build script lately?