Giter VIP home page Giter VIP logo

subhamay-bhattacharyya / 0001-tarius-py-tf Goto Github PK

View Code? Open in Web Editor NEW
0.0 0.0 0.0 617 KB

AWS Serverless Real Time Data Load to DynamoDB using Python Lambda and S3 Event Source Mapping and creating the stack using HashiCorp Terraform and Language as Python.

HCL 90.90% Python 9.10%
aws-cloudwatch aws-cloudwatch-alerts aws-cloudwatch-logs aws-dynamodb aws-kms aws-lambda-python aws-s3-bucket aws-sns-subscriptions aws-sns-topic aws-sqs s3-event-notification terraform

0001-tarius-py-tf's Introduction

         

Project Tarius: AWS Serverless Real Time Data Load to DynamoDB.

AWS Serverless Real Time Data Load to DynamoDB using Python Lambda and S3 Event Source Mapping and creating the stack using AWS CloudFormation and Language as Python.

Description

The user / producer uploads a csv source file to the landing zone S3 bucket. A Lambda function is triggered using S3 event notification and loads it to a DynamoDB table. The entire stack is created using HashiCorp Terraform. The S3 Bucket, SNS Topic and the DynamoDB table are encrypted using Customer Managed KMS Key. A Cross Stack is used to create the Custom Resource for creating the S3 folder and a Root stack is used to create the remaining services. The Root Stacks launches the Nested Stacks to create the S3 Bucket, SNS Topic, DynamoDB table and CloudWatch Alarms.

Project Tarius - Design Diagram

Services Used

mindmap
  root((1 -  HashiCorp Terraform ))
    2
        AWS DynamoDB
    3
        AWS Lambda 
    4
        AWS IAM 
    5   
        AWS CloudWatch
    6   
        AWS Key Management Service
    7
        AWS SNS
    8
        AWS SQS (DLQ)

Getting Started

  • This repository is configured to deploy the stack in Development, Staging and Production AWS Accounts. To use the pipeline you need to have three AWS Accounts created in an AWS Org under a Management Account (which is the best practice). The Org structure will be as follows:
Root
├─ Management
├─ Development
├─ Test
└─ Production
  • Create KMS Key in each of the AWS Accounts which will be used to encrypt the resources.

  • Create an OpenID Connect Identity Provider

  • Create an IAM Role for OIDC and use the sample Trust Policy in each of the three AWS accounts

{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::<Account Id>:oidc-provider/token.actions.githubusercontent.com"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringLike": {
                    "token.actions.githubusercontent.com:sub": [
                        "repo:<GitHub User>/<GitHub Repository>:ref:refs/head/main",
                    ]
                }
            }
        }
    ]
}
  • Create an IAM Policy to allow CloudFormation access and attach it to the OIDC Role, using the sample policy document:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowS3Access",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:CreateBucket",
                "s3:PutObjectTagging",
                "s3:PutBucketPublicAccessBlock",
                "s3:PutBucketAcl",
                "s3:DeleteObject",
                "s3:DeleteBucket",
                "s3:ListBucket",
                "s3:PutBucketTagging",
                "s3:GetBucketTagging",
                "s3:GetBucketPolicy",
                "s3:GetBucketAcl",
                "s3:GetObjectAcl",
                "s3:GetObjectVersionAcl",
                "s3:PutObjectAcl",
                "s3:PutObjectVersionAcl",
                "s3:GetBucketCORS",
                "s3:PutBucketCORS",
                "s3:GetBucketWebsite",
                "s3:GetBucketVersioning",
                "s3:GetAccelerateConfiguration",
                "s3:GetBucketRequestPayment",
                "s3:GetBucketLogging",
                "s3:GetLifecycleConfiguration",
                "s3:GetReplicationConfiguration",
                "s3:GetEncryptionConfiguration",
                "s3:GetBucketObjectLockConfiguration",
                "s3:GetBucketOwnershipControls",
                "s3:GetObjectTagging",
                "s3:GetBucketNotification",
                "s3:PutEncryptionConfiguration",
                "s3:PutBucketOwnershipControls",
                "s3:PutBucketNotification",
                "s3:ListBucketVersions",
                "s3:PutObjectTagging",
                "s3:PutBucketVersioning",
                "s3:PutLifecycleConfiguration",
                "s3:DeleteObjectVersion",
                "s3:PutBucketPolicy",
                "s3:DeleteBucketPolicy"
            ],
            "Resource": "*"
        },
        {
            "Sid": "IAMAccess",
            "Effect": "Allow",
            "Action": [
                "iam:GetRole",
                "iam:CreateRole",
                "iam:DeleteRolePolicy",
                "iam:PutRolePolicy",
                "iam:DeleteRole",
                "iam:PassRole",
                "iam:TagRole",
                "iam:CreatePolicy",
                "iam:ListRolePolicies",
                "iam:GetPolicy",
                "iam:ListAttachedRolePolicies",
                "iam:GetPolicyVersion",
                "iam:ListInstanceProfilesForRole",
                "iam:AttachRolePolicy",
                "iam:DetachRolePolicy",
                "iam:ListPolicyVersions",
                "iam:DeletePolicy",
                "iam:UntagRole",
                "iam:GetRolePolicy"
            ],
            "Resource": "*"
        },
        {
            "Sid": "LambdaCreateAccess",
            "Effect": "Allow",
            "Action": [
                "lambda:GetFunction",
                "lambda:DeleteFunction",
                "lambda:CreateFunction",
                "lambda:TagResource",
                "lambda:InvokeFunction",
                "lambda:PutFunctionConcurrency",
                "lambda:CreateFunction",
                "lambda:ListVersionsByFunction",
                "lambda:GetFunctionCodeSigningConfig",
                "lambda:PutFunctionEventInvokeConfig",
                "lambda:AddPermission",
                "lambda:GetFunctionEventInvokeConfig",
                "lambda:GetPolicy",
                "lambda:DeleteFunctionEventInvokeConfig",
                "lambda:RemovePermission",
                "lambda:ListTags",
                "lambda:UpdateFunctionConfiguration"
            ],
            "Resource": "*"
        },
        {
            "Sid": "SQSCreateAccess",
            "Effect": "Allow",
            "Action": [
                "sqs:GetQueueAttributes",
                "sqs:ListQueueTags",
                "sqs:CreateQueue",
                "sqs:TagQueue",
                "sqs:SetQueueAttributes",
                "sqs:SendMessage",
                "sqs:DeleteQueue"
            ],
            "Resource": "*"
        },
        {
            "Sid": "SNSCreateAccess",
            "Effect": "Allow",
            "Action": [
                "SNS:GetTopicAttributes",
                "SNS:ListTagsForResource",
                "SNS:GetSubscriptionAttributes",
                "SNS:CreateTopic",
                "SNS:TagResource",
                "SNS:SetTopicAttributes",
                "SNS:DeleteTopic",
                "SNS:Subscribe",
                "SNS:Unsubscribe"
            ],
            "Resource": "*"
        },
        {
            "Sid": "DynamoDBAccess",
            "Effect": "Allow",
            "Action": [
                "dynamodb:DescribeTable",
                "dynamodb:DescribeContinuousBackups",
                "dynamodb:DescribeTimeToLive",
                "dynamodb:ListTagsOfResource",
                "dynamodb:CreateTable",
                "dynamodb:TagResource",
                "dynamodb:DeleteTable",
                "dynamodb:PutItem",
                "dynamodb:GetItem",
                "dynamodb:DeleteItem"
            ],
            "Resource": "*"
        },
        {
            "Sid": "KMSKeyAccess",
            "Effect": "Allow",
            "Action": [
                "kms:Encrypt",
                "kms:Decrypt",
                "kms:DescribeKey"
            ],
            "Resource": [
				"arn:aws:kms:<AWS region>:<AWS Account Id>:key/<Customer Managed KMS Key Id>",
				"arn:aws:kms:<AWS region>:<AWS Account Id>:key/<AWS Managed S3 KMS Key Id>"
            ]
        },
        {
            "Sid": "CloudWatchAccess",
            "Effect": "Allow",
            "Action": [
                "cloudwatch:PutMetricAlarm",
                "cloudwatch:DescribeAlarms",
                "cloudwatch:ListTagsForResource",
                "cloudwatch:DeleteAlarms",
                "cloudwatch:TagResource"
            ],
            "Resource": "*"
        },
        {
            "Sid": "StateMachineAccess",
            "Effect": "Allow",
            "Action": [
                "states:CreateStateMachine",
                "states:TagResource",
                "states:DescribeStateMachine",
                "states:DeleteStateMachine"
            ],
            "Resource": "*"
        },
        {
            "Sid": "CWLogGroupAccess",
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:TagResource",
                "logs:PutRetentionPolicy",
                "logs:DescribeLogGroups",
                "logs:DeleteLogGroup",
                "logs:ListTagsLogGroup",
                "logs:TagLogGroup"
            ],
            "Resource": "*"
        },
        {
            "Sid": "EventBridgeAccess",
            "Effect": "Allow",
            "Action": [
                "events:TagResource",
                "events:CreateEventBus",
                "events:DeleteEventBus",
                "events:PutRule",
                "events:DescribeRule",
                "events:PutTargets",
                "events:RemoveTargets",
                "events:DeleteRule"
            ],
            "Resource": "*"
        },
        {
            "Sid": "SSMParamAccess",
            "Effect": "Allow",
            "Action": [
                "ssm:GetParameters"
            ],
            "Resource": "*"
        },
        {
            "Sid": "EC2Access",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeImages",
                "ec2:CreateInternetGateway",
                "ec2:CreateVpc",
                "ec2:CreateSubnet",
                "ec2:CreateTags",
                "ec2:CreateRouteTable",
                "ec2:CreateRoute",
                "ec2:CreateSecurityGroup",
                "ec2:DescribeInternetGateways",
                "ec2:DescribeVpcs",
                "ec2:DescribeAccountAttributes",
                "ec2:DescribeRouteTables",
                "ec2:DescribeSubnets",
                "ec2:DescribeAvailabilityZones",
                "ec2:DescribeSecurityGroups",
                "ec2:DeleteVpc",
                "ec2:DeleteSubnet",
                "ec2:DeleteRouteTable",
                "ec2:DeleteRoute",
                "ec2:DeleteSecurityGroup",
                "ec2:DeleteInternetGateway",
                "ec2:ModifyVpcAttribute",
                "ec2:AttachInternetGateway",
                "ec2:DetachInternetGateway",
                "ec2:AssociateRouteTable",
                "ec2:ModifySubnetAttribute",
                "ec2:DisassociateRouteTable",
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:RunInstances",
                "ec2:TerminateInstances",
                "ec2:DescribeInstances"
            ],
            "Resource": "*"
        },
        {
            "Sid": "Route53Access",
            "Effect": "Allow",
            "Action": [
                "route53:GetHostedZone",
                "route53:GetChange",
                "route53:ChangeResourceRecordSets",
                "route53:ListResourceRecordSets",
                "route53:ListHostedZones"
            ],
            "Resource": "*"
        }
    ]
}

Installing

  • Clone the repository.

  • Create a S3 bucket to used a code repository.

  • Update the terraform.tfvars files for devlopment, test and production

    • devl.terraform.tfvars
    • test.terraform.tfvars
    • prod.terraform.tfvars
  • Create three repository environments in GitHub (devl, test, prod)

  • Create the following GitHub repository Secrets:

Secret Name Secret Value
AWS_REGION us-east-1
DEVL_AWS_KMS_KEY_ARN arn:aws:kms:<AWS Region>:<Development Account Id>:key/<KMS Key Id in Development>
TEST_AWS_KMS_KEY_ARN arn:aws:kms:<AWS Region>:<Test Account Id>:key/<KMS Key Id in Test>
PROD_AWS_KMS_KEY_ARN arn:aws:kms:<AWS Region>:<Production Account Id>:key/<KMS Key Id in Production>
DEVL_AWS_ROLE_ARN arn:aws:iam::<Development Account Id>:role/<OIDC IAM Role Name>
TEST_AWS_ROLE_ARN arn:aws:iam::<Test Account Id>:role/<OIDC IAM Role Name>
PROD_AWS_ROLE_ARN arn:aws:iam::<Production Account Id>:role/<OIDC IAM Role Name>
DEVL_AWS_TF_STATE_BUCKET_NAME <S3 Bucket to store Terraform State in Development>
TEST_AWS_TF_STATE_BUCKET_NAME <3 Bucket to store Terraform State in Test>
PROD_AWS_TF_STATE_BUCKET_NAME <3 Bucket to store Terraform State in Production>
  • Create a DyanmoDB table in each of the Devl, Test and Prod environment with the name as terraform-state-locking and partition key as LockID and type String

Executing the CI/CD Pipeline

  • Create Create a feature branch and push the code.
  • The CI/CD pipeline will create a build and then will deploy the stack to devlopment.
  • Once the Stage and Prod deployment are approved (If you have configured with protection rule ) the stack will be reployed in the respective environments

Executing program

  • Upload the sample products.csv file to the landing zone folder of the S3 bucket either using S3 console or CLI
aws s3 cp products.csv <s3 bucket uri>/raw-data/ --profile <Profile Name>
  • Alternatively, the data file can be uploaded to the S3 bucket from AWS Console as well and verify the data in the DynamoDB table.

  • To test the SQS Dead Letter Queue, put an exception in the handler at the begging of the lambda_handler function

    now = datetime.now()
    current_time = now.strftime("%H:%M:%S")
    raise Exception(f"{current_time} :: Something bad happened.") 

and asynchronously invoke the lambda using the follwing command:

aws lambda invoke --invocation-type Event --function-name tarius-process-data-devl-us-east-1 --region us-east-1 --payload '{ "key1": "value1","key2":"value2" }' --cli-binary-format raw-in-base64-out response.json --profile <Profile Name>

After 2 failed attempts if the lambda fails for the third time, then the event is pushed into the dead letter queue.

  • To test the CloudWatch Throttling Alarm, upload the products.csv file to the S3 bucket multiple times in quick succession.
aws s3 cp products.csv <s3 bucket uri>/raw-data/ --profile <Profile Name>

Help

📧 Subhamay Bhattacharyya - [[email protected]]

Authors

Contributors names and contact info

Subhamay Bhattacharyya - [[email protected]]

Version History

  • 0.1
    • Initial Release

License

This project is licensed under Subhamay Bhattacharyya. All Rights Reserved.

Acknowledgments

0001-tarius-py-tf's People

Contributors

bsubhamay avatar

0001-tarius-py-tf's Issues

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.