
lambda-pdftk-example's Introduction

AWS Lambda + PDFtk Example

This repository provides a working example of using PDFtk within AWS Lambda. AWS Lambda runs on Amazon Linux, which does not officially support PDFtk or GCJ, one of PDFtk's dependencies. This example works by including both a PDFtk binary and the libgcj shared library.

Run the Example

To run the example, first package up the project into a ZIP file by running:

./dist.sh

Then, simply upload this ZIP to AWS Lambda. When testing with the Lambda web interface, you should see the function succeed and output PDFtk's version and copyright information.

You can easily expand on this boilerplate to use PDFtk for its intended purpose: manipulating PDF files.
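As a rough illustration of such an extension, the sketch below merges several PDFs with PDFtk's `cat` operation. It is not part of the repository: `buildCatArgs` and `mergePdfs` are hypothetical helper names, and the code assumes the bundled `pdftk` binary is already resolvable through `PATH` and that the output is written somewhere writable (in Lambda, that means under `/tmp`).

```javascript
const { execFile } = require('child_process');

// Build the argument list for: pdftk <inputs...> cat output <output>
function buildCatArgs(inputs, output) {
  if (!Array.isArray(inputs) || inputs.length === 0) {
    throw new Error('at least one input PDF is required');
  }
  return inputs.concat(['cat', 'output', output]);
}

// Hypothetical helper: concatenate the input PDFs into a single output file.
// Assumes `pdftk` can be found on PATH.
function mergePdfs(inputs, output, callback) {
  execFile('pdftk', buildCatArgs(inputs, output), (err) => {
    callback(err, err ? null : output);
  });
}
```

A call such as `mergePdfs(['./pdf1.pdf', './pdf2.pdf'], '/tmp/merged.pdf', cb)` would run the same kind of command that appears in the issue reports further down. Passing arguments as an array through `execFile` avoids shell quoting problems with unusual file names.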

How it Works

AWS Lambda supports binary dependencies by allowing them to be included in uploaded ZIP files. However, because Amazon Linux does not support PDFtk or GCJ, PDFtk had to be built from source on CentOS, a close relative of Amazon Linux. I spun up a CentOS 6 machine in EC2 and followed the instructions on the PDFtk website to build PDFtk from source:

sudo yum install gcc gcc-java libgcj libgcj-devel gcc-c++

wget https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/pdftk-2.02-src.zip

unzip pdftk-2.02-src.zip

cd pdftk-2.02-dist/pdftk

make -f Makefile.Redhat

sudo make -f Makefile.Redhat install

Then I copied the resulting pdftk binary and /usr/lib64/libgcj.so.10 shared library into the bin/ directory of my Lambda project.

The entry point to the Lambda function, index.js, alters the PATH and LD_LIBRARY_PATH environment variables so that the system can find the binary and its GCJ dependency.

Using PDFtk in Amazon Linux

It should be possible to use the PDFtk binary and GCJ shared library located in the bin/ directory of this repository to run PDFtk in Amazon Linux on EC2. Copy them onto the machine and put them in the correct path, or call them directly:

LD_LIBRARY_PATH=/path/to/libgcj.so.10 /path/to/pdftk --version

lambda-pdftk-example's Issues

How to use/test on AWS Lambda?

I uploaded the ZIP file, but afterwards received the warning: "The deployment package of your Lambda function "pdftk" is too large to enable inline code editing. However, you can still invoke your function." (pdftk is the name I gave the Lambda function.)

I decided to at least test it by pressing the Test button with the default event (since I thought it wouldn't matter what was passed to it), but that failed with the output:
{ "errorMessage": "2020-01-18T00:05:36.522Z e8312ad1-165e-4255-a5d9-6b5c50b374f8 Task timed out after 3.00 seconds" }

I thought the example was supposed to be a straightforward way to test a simple run of pdftk in lambda, but I don't know where I might've misunderstood/missed a step.

[ERROR] Runtime.ImportModuleError: Unable to import module 'lambda_function'

my project's files & directories: https://i.stack.imgur.com/ZahqK.png

I have a Lambda Function, and the code I have uploaded to the Lambda Function is a zipped folder of my /Archive directory.

When creating a Lambda Layer, do I upload a zipped PDFtk binary as one layer and libgcj as a second layer? Or do I upload both of them in one layer?

Do I upload my lambdazip.sh in a Lambda Layer?

From what I understand, many of the people who run into "[ERROR] Runtime.ImportModuleError: Unable to import module 'lambda_function'" have issues because of their Lambda handler.

My Lambda handler is: lambda_function.lambda_handler so this doesn't appear to be my issue.

Another common problem I've noticed on StackOverflow appears to be how people compress and zip the files they upload to the Lambda Function.

Do I need to move my lambda_function.py? Sometimes this CloudWatch error occurs because the lambda_function.py is not in the ROOT directory.

Does my survey directory need to move?

I think the folders & directories I have here may be causing my issue.

Do I need to zip the directories individually?

Can I resolve this error by Zipping the entire project?

For more information, I also have a Lambda Layer for PDF Toolkit, called pyPDFtk in the codebase. In that Lambda layer is a zipped /bin with binaries inside.

If there is anything I can alter/change within my code or AWS configuration, please let me know, and I can return new CloudWatch error logs for you.

If you prefer to answer on StackOverflow: https://stackoverflow.com/questions/69902669/error-runtime-importmoduleerror-unable-to-import-module-lambda-function-no-m

Permission denied

Hello, I get this error:

   /bin/sh: /var/task/bin/pdftk: Permission denied
   at ChildProcess.exithandler (child_process.js:294:12)
   at ChildProcess.emit (events.js:198:13)
   at maybeClose (internal/child_process.js:982:16)
   at Socket.stream.socket.on (internal/child_process.js:389:11)
   at Socket.emit (events.js:198:13)
   at Pipe._handle.close (net.js:607:12)
 cause:
  { Error: Command failed: pdftk './pdf1.pdf' './pdf2.pdf' cat output /tmp/tmp-808h0P0DOq5v1
  /bin/sh: /var/task/bin/pdftk: Permission denied

Should I use chmod? Does it matter that I'm on Ubuntu?

Add Ubuntu's drm_fix patch

Ubuntu (and, I think, Debian too) adds a drm_fix patch to its copy of pdftk. This makes it work on PDFs that have an owner password set to enable certain PDF features, but are not encrypted. I'm working on a project where PDFs come in from weird sources, so this is quite necessary.

You can see the patch here: drm_fix.txt

Or get it from Ubuntu here: http://packages.ubuntu.com/trusty/pdftk

How can I upload the ZIP file to AWS Lambda?

I didn't see any option to upload binaries to AWS Lambda. The README says:

AWS Lambda supports binary dependencies by allowing them to be included in uploaded ZIP files.

Could you please provide a reference link or tutorial for this?

Command terminated by signal 11

I am getting the error "Command terminated by signal 11" when I use the prebuilt binary on my GoDaddy shared hosting account...

Can you please help me find a solution?

/var/task/bin/pdftk: Permission denied

Hey,

I got an error while trying to test the example function after uploading it to Lambda from S3.
This is the error I get:
{
  "errorMessage": "Command failed: /bin/sh: /var/task/bin/pdftk: Permission denied\n",
  "errorType": "Error",
  "stackTrace": [
    "",
    "ChildProcess.exithandler (child_process.js:658:15)",
    "ChildProcess.emit (events.js:98:17)",
    "maybeClose (child_process.js:766:16)",
    "Socket.<anonymous> (child_process.js:979:11)",
    "Socket.emit (events.js:95:17)",
    "Pipe.close (net.js:466:12)"
  ]
}

Configuration:
IAM Role - lambda_basic_execution:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:::*"
    }
  ]
}

I gave full permissions (chmod 777) to each file in the ZIP archive before uploading it to S3.
What did I miss?
