Centralize log collection with Kinesis Firehose using Lambda Extensions

Introduction

This pattern walks through an approach to centralize log collection for Lambda functions with Kinesis Data Firehose using external extensions. The provided code sample shows how to send logs directly to Kinesis Data Firehose without sending them to the Amazon CloudWatch Logs service.

Note: This is a simple example extension to help you investigate an approach to centralizing log aggregation. This example code is not production ready. Use it at your own discretion after testing it thoroughly.

This sample extension:

  • Subscribes to receive platform and function logs.
  • Runs with a main goroutine and a helper goroutine. The main goroutine registers with the Extensions API and processes its invoke and shutdown events (received via the NextEvent call). The helper goroutine:
    • starts a local HTTP server at the provided port (default 1234) that receives log events from the Logs API
    • puts the logs in a synchronized queue (producer) to be processed by the main goroutine (consumer)
  • The main goroutine writes the received logs to Amazon Kinesis Data Firehose, which delivers them to Amazon S3.

Amazon Kinesis Data Firehose

Amazon Kinesis Data Firehose is the easiest way to reliably load streaming data into data lakes, data stores, and analytics services. It can capture, transform, and deliver streaming data to Amazon S3, Amazon Redshift, Amazon Elasticsearch Service, generic HTTP endpoints, and service providers like Datadog, New Relic, MongoDB, and Splunk. Read more about it here.

Note: The code sample provided as part of this pattern delivers logs from Kinesis Data Firehose to Amazon S3.

Lambda extensions

Lambda Extensions are a new way to easily integrate Lambda with your favorite monitoring, observability, security, and governance tools by letting those tools integrate deeply into the Lambda execution environment. There is no complex installation or configuration, and this simplified experience makes it easier for you to use your preferred tools across your application portfolio today. You can use extensions for use cases such as:

  • capturing diagnostic information before, during, and after function invocation
  • automatically instrumenting your code without needing code changes
  • fetching configuration settings or secrets before the function invocation
  • detecting and alerting on function activity through hardened security agents, which can run as separate processes from the function

Read more about Lambda extensions here.

Note: The code sample provided as part of this pattern uses an external extension to listen to log events from the Lambda function.

The need to centralize log collection

Having a centralized log collection mechanism using Kinesis Data Firehose provides the following benefits:

  • Helps to collect logs from different sources in one place. Even though the sample provided sends logs from Lambda, log routers like Fluent Bit and FireLens can send logs directly to Kinesis Data Firehose from container orchestrators like Amazon EKS and Amazon ECS.
  • Lets you define and standardize transformations before logs get delivered to downstream systems like Amazon S3, Elasticsearch, Amazon Redshift, etc.
  • Provides a secure staging area for log data before it gets written out to disk. In the event of a machine/application failure, we still have access to the logs already emitted by the source machine/application.

Architecture

AWS Services

  • AWS Lambda
  • AWS Lambda Extensions
  • Amazon Kinesis Data Firehose
  • Amazon S3

High level architecture

Here is a high-level view of all the components:

architecture

Once deployed, the overall flow looks like this:

  • On start-up, the extension subscribes to receive logs for platform and function events.
  • A local HTTP server is started inside the external extension, which receives the logs.
  • The extension also takes care of buffering the received log events in a synchronized queue and writing them to Amazon Kinesis Data Firehose via direct PUT records.

Note: The Firehose delivery stream name is specified as an environment variable (AWS_KINESIS_STREAM_NAME).

  • The Lambda function won't be able to send any log events to the Amazon CloudWatch Logs service due to the following explicit DENY policy:

    Sid: CloudWatchLogsDeny
    Effect: Deny
    Action:
      - logs:CreateLogGroup
      - logs:CreateLogStream
      - logs:PutLogEvents
    Resource: arn:aws:logs:*:*:*
  • The Kinesis Firehose stream configured as part of this sample delivers logs directly to Amazon S3 (gzip-compressed).

Build and Deploy

The AWS SAM template available in the root directory can be used to deploy the sample Lambda function with this extension.

Prerequisites

  • The AWS SAM CLI needs to be installed; follow the link to learn how to install it.

Build

Check out the code by running the following commands:

mkdir kinesisfirehose-logs-extension-demo && cd kinesisfirehose-logs-extension-demo
git clone https://github.com/hariohmprasath/centralize-logs-lambda-firehose.git .

Run the following command from the root directory:

sam build

Output

Building codeuri: /Users/xxx/CodeBase/aws-lambda-extensions/kinesisfirehose-logs-extension-demo/hello-world runtime: nodejs12.x metadata: {} functions: ['HelloWorldFunction']
Running NodejsNpmBuilder:NpmPack
Running NodejsNpmBuilder:CopyNpmrc
Running NodejsNpmBuilder:CopySource
Running NodejsNpmBuilder:NpmInstall
Running NodejsNpmBuilder:CleanUpNpmrc
Building layer 'KinesisFireHoseLogsApiExtensionLayer'
Running CustomMakeBuilder:CopySource
Running CustomMakeBuilder:MakeBuild
Current Artifacts Directory : /Users/xxx/CodeBase/aws-lambda-extensions/kinesisfirehose-logs-extension-demo/.aws-sam/build/KinesisFireHoseLogsApiExtensionLayer

Build Succeeded

Built Artifacts  : .aws-sam/build
Built Template   : .aws-sam/build/template.yaml

Commands you can use next
=========================
[*] Invoke Function: sam local invoke
[*] Deploy: sam deploy --guided

Deployment

Run the following command to deploy the sample Lambda function with the extension:

sam deploy --guided

The following parameters can be customized as part of the deployment:

| Parameter | Description | Default |
| --- | --- | --- |
| FirehoseStreamName | Firehose stream name | lambda-logs-direct-s3-no-cloudwatch |
| FirehoseS3Prefix | The S3 key prefix for Kinesis Firehose | lambda-logs-direct-s3-no-cloudwatch |
| FirehoseCompressionFormat | Compression format used by Kinesis Firehose (allowed values: UNCOMPRESSED, GZIP, Snappy) | GZIP |
| FirehoseBufferingInterval | How long (in seconds) Firehose waits before writing a new batch to S3 | 60 |
| FirehoseBufferingSize | Maximum batch size in MB | 10 |

Note: We can either customize the parameters or leave them as default to proceed with the deployment.

Output

CloudFormation outputs from deployed stack
-------------------------------------------------------------------------------------------------------------------
Outputs
-------------------------------------------------------------------------------------------------------------------
Key                 KinesisFireHoseLogsApiExtensionLayer
Description         Kinesis Log emiter Lambda Extension Layer Version ARN
Value               arn:aws:lambda:us-east-1:xxx:layer:kinesisfirehose-logs-extension-demo:5

Key                 BucketName
Description         The bucket where data will be stored
Value               sam-app-deliverybucket-xxxx

Key                 KinesisFireHoseIamRole
Description         Kinesis firehose IAM role
Value               arn:aws:firehose:us-east-1:xxx:deliverystream/lambda-logs-direct-s3-no-cloudwatch

Key                 HelloWorldFunction
Description         First Lambda Function ARN
Value               arn:aws:lambda:us-east-1:xxx:function:kinesisfirehose-logs-extension-demo-function
-------------------------------------------------------------------------------------------------------------------

Testing

You can invoke the Lambda function using the following CLI command:

aws lambda invoke \
    --function-name "<<function-name>>" \
    --payload '{"payload": "hello"}' /tmp/invoke-result \
    --cli-binary-format raw-in-base64-out \
    --log-type Tail

Note: Make sure to replace function-name with the actual lambda function name

The function should return "StatusCode": 200, with output like the following:

{
    "StatusCode": 200,
    "LogResult": "<<Encoded>>",
    "ExecutedVersion": "$LATEST"
}

A few minutes after the successful invocation of the Lambda function, we should start seeing the log messages from the example extension written to the S3 bucket.

  • Login to AWS console:
    • Navigate to the S3 bucket mentioned under the parameter BucketName in the SAM output.

    • We can see the logs successfully written to the S3 bucket, partitioned by date, in GZIP format. s3

    • Navigate to "/aws/lambda/${functionname}" log group inside AWS CloudWatch service.

    • We shouldn't see any logs created under this log group, as the Lambda function has been denied access to write logs. cloudwatch

Cleanup

Run the following command to delete the stack; use the correct stack name if you changed it during sam deploy:

aws cloudformation delete-stack --stack-name sam-app

Conclusion

This extension provides an approach to streamline and centralize log collection using Kinesis Data Firehose.
