

Cross Account Amazon DynamoDB Replication

This repository accompanies the Cross Account Amazon DynamoDB Replication blog post. It contains two AWS Serverless Application Model (AWS SAM) templates.

  1. The first template deploys an AWS Glue job that loads data from the source DynamoDB table into the target DynamoDB table. If you are using the native DynamoDB export feature, do not use this template. Use it only if the source DynamoDB table is smaller than 140 GB and you are using an AWS Glue job for export and import.
  2. The second template deploys an AWS Lambda function that reads from the source DynamoDB stream and replicates the changes to the target Amazon DynamoDB table in a different AWS account.
├── README.MD                        <-- This instructions file
├── InitialLoad                      <-- SAM template for the initial load using AWS Glue for export and import
├── ChangeDataCapture                <-- SAM template for change data capture (CDC)
├── InitialMigrationWithNativeExport <-- Sample Glue code to help you get started, plus a Bash script to change the owner of the objects in the target S3 bucket

General Requirements

  • AWS CLI already configured with Administrator permissions
  • AWS SAM CLI installed
  • Source and target tables created in DynamoDB
  • Target IAM role created with permissions to write to the target DynamoDB table
  • For change data capture, DynamoDB Streams enabled on the source table

Installation Instructions

Clone the repo onto your local development machine using git clone <repo url>.

Initial Load

  1. From the command line, change to the SAM template directory for the initial load:

cd InitialLoad

  2. Run the commands below to deploy the template:

sam build

sam deploy --guided

Follow the prompts in the deploy process to set the stack name, AWS Region and other parameters.

Initial Load Parameter Details

  • TargetDynamoDBTable: Target DynamoDB Table name
  • TargetAccountNumber: Target AWS Account Number
  • TargetRoleName: Target IAM Role name to be assumed by the Glue job
  • TargetRegion: The region for the target DynamoDB table
  • SourceDynamoDBTable: Source DynamoDB Table name
  • WorkerType: The type of predefined worker that is allocated when a job runs. Accepts a value of Standard, G.1X, or G.2X.
  • NumberOfWorkers: The number of workers of a defined workerType that are allocated when a job runs.
  • JobName: Name of the Glue Job

Change Data Capture (CDC)

  1. From the command line, change to the SAM template directory for change data capture:

cd ChangeDataCapture

  2. Run the commands below to deploy the template:

sam build

sam deploy --guided

Follow the prompts in the deploy process to set the stack name, AWS Region and other parameters.

CDC Parameter Details

  • TargetDynamoDBTable: Target DynamoDB Table name
  • TargetAccountNumber: Target AWS Account Number
  • TargetRoleName: Target IAM Role name to be assumed by the Lambda function
  • TargetRegion: The region for the target DynamoDB table
  • MaximumRecordAgeInSeconds: The maximum age (in seconds) of a record in the stream that Lambda sends to your function.
  • SourceTableStreamARN: Source DynamoDB table stream ARN
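As a rough sketch of what the CDC Lambda does with each stream record (this helper is hypothetical, not the repository's actual handler code): REMOVE events map to deletes keyed by the record's Keys, while INSERT and MODIFY events map to puts of the NewImage.

```python
def record_to_operation(record):
    """Map one DynamoDB Streams record to an (action, payload) pair.

    Hypothetical helper: REMOVE events become deletes keyed by the
    record's Keys; INSERT/MODIFY events become puts of the NewImage.
    """
    if record["eventName"] == "REMOVE":
        return ("delete", record["dynamodb"]["Keys"])
    return ("put", record["dynamodb"]["NewImage"])
```

In the deployed function, the payload would be handed to `delete_item` or `put_item` on a boto3 DynamoDB client created with credentials from the assumed target role.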

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.

cross-account-amazon-dynamodb-replication's People

Contributors

ahmedszamzam, amazon-auto

cross-account-amazon-dynamodb-replication's Issues

EventSourceMapping for ChangeDataCapture could not be created in stack

Hey, thanks for the nice replication guide and the SAM templates you prepared.
In the ChangeDataCapture step I just have the issue that the EventSourceMapping could not be created in my CloudFormation stack.

I am constantly getting an error like:

Resource handler returned message: "Invalid request provided: Cannot access stream arn:aws:kinesis:eu-west-1:*****:stream/*****. Please ensure the role can perform the GetRecords, GetShardIterator, DescribeStream, ListShards, and ListStreams Actions on your stream in IAM. (Service: Lambda, Status Code: 400, Request ID: *****, Extended Request ID: null)" (RequestToken: *****, HandlerErrorCode: InvalidRequest)
The following resource(s) failed to create: [ReplayFromStreamFnDDBEvent]. Rollback requested by user.

I used the template as it came from this repository.
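Not a confirmed root cause, but the error message itself lists the stream-read actions the Lambda execution role needs. A policy statement granting them against a Kinesis stream (as in the ARN the error reports) might look like the following sketch; the Resource ARN values are placeholders, and for a DynamoDB stream the equivalent dynamodb: actions (DescribeStream, GetRecords, GetShardIterator, ListStreams) apply instead:

```json
{
  "Effect": "Allow",
  "Action": [
    "kinesis:GetRecords",
    "kinesis:GetShardIterator",
    "kinesis:DescribeStream",
    "kinesis:ListShards",
    "kinesis:ListStreams"
  ],
  "Resource": "arn:aws:kinesis:eu-west-1:ACCOUNT_ID:stream/STREAM_NAME"
}
```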

DynamoDB tables replication blocked with lambda out of memory

We built the CDC solution in our AWS account.
For some frequently updated DynamoDB tables, we found that the target DynamoDB table had no data imported. We checked the Lambda logs, which show the following:
lambda-error

As the logs show, the Lambda function's default 128 MB of memory was used up. We updated the Lambda memory size with the AWS CLI and replication recovered:

aws lambda update-function-configuration \
    --function-name my-function \
    --memory-size 256

Suggested solutions:

  1. Expose the Lambda memory size as a parameter in the SAM template.
  2. Furthermore, it would be better to add an SNS notification for Lambda error logs, so that people can receive an alarm email in time.
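A sketch of how suggestion 1 might look in the SAM template (the resource and parameter names here are illustrative, not the template's actual ones):

```yaml
Parameters:
  FunctionMemorySize:
    Type: Number
    Default: 256

Resources:
  ReplicationFunction:
    Type: AWS::Serverless::Function
    Properties:
      MemorySize: !Ref FunctionMemorySize
```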

ChangeDataCapture base64 encodes binary values

The DynamoDB stream emits its changes in JSON string format, with binary values base64-encoded. However, when performing the PutItem call through boto3, these strings are assumed to be un-encoded and go through an additional round of base64 encoding.

I was able to resolve this by transforming the binary values into their bytes representations before sending them through boto3:

        if event_name == 'REMOVE':
            keys_to_delete = record['dynamodb']['Keys']

            # Binary ('B') values arrive base64-encoded; decode them back into bytes
            for key in keys_to_delete:
                for val_type in keys_to_delete[key]:
                    if val_type == 'B':
                        keys_to_delete[key][val_type] = base64.b64decode(keys_to_delete[key][val_type])

            dynamodb.delete_item(TableName=target_ddb_name, Key=keys_to_delete)
        else:
            item_to_put = record['dynamodb']['NewImage']

            # Binary ('B') values arrive base64-encoded; decode them back into bytes
            for prop in item_to_put:
                for val_type in item_to_put[prop]:
                    if val_type == 'B':
                        item_to_put[prop][val_type] = base64.b64decode(item_to_put[prop][val_type])

            dynamodb.put_item(TableName=target_ddb_name, Item=item_to_put)
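The same decoding step can be written as a self-contained pure helper applicable to either Keys or NewImage (the name and shape are illustrative, assuming items in DynamoDB-JSON form):

```python
import base64

def decode_binary_attrs(item):
    """Return a copy of a DynamoDB-JSON item in which base64-encoded
    'B' attribute values are decoded back into raw bytes; all other
    attribute types pass through unchanged."""
    return {
        name: {
            val_type: base64.b64decode(value) if val_type == "B" else value
            for val_type, value in attr.items()
        }
        for name, attr in item.items()
    }
```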

What's the next step after running `sam deploy --guided`?

Hi,

I'm a newbie here and this is my first time using cross-account-amazon-dynamodb-replication.

I was able to follow the README instructions and successfully ran the following commands for Initial Load on my local linux machine:

sam build
sam deploy --guided

I assumed I deployed the template successfully.

I was wondering what the next step is if I want to replicate data from the source DynamoDB table to the target table.
Can you please advise?

Missing functionality for setting different AWS partitions

Hello,

First of all thanks for creating this great sample! It is super helpful and saved us a lot of development efforts.

I noticed that currently the sample doesn't allow setting the AWS partition as an override parameter. I am based in China and most of our resources are located in aws-cn instead of aws.

I was able to manually update the template file as well as the Lambda function to make our case work, but I can see how adding the ability to override the partition could benefit future users, especially considering that AWS now has at least three partitions: aws, aws-cn, and aws-us-gov, and that list might grow in the future. (Similar to "TargetRoleName" and "TargetRegion", maybe we can have a "TargetPartition".)

Thanks a lot and thanks again for this awesome sample!
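One way the suggestion could work in code (a sketch; a TargetPartition parameter and this helper are hypothetical, not part of the current templates): build the cross-account role ARN from a partition value instead of hard-coding aws.

```python
def target_role_arn(partition, account_id, role_name):
    """Build the target IAM role ARN for a given AWS partition
    ('aws', 'aws-cn', or 'aws-us-gov'). Hypothetical helper
    illustrating a TargetPartition override parameter."""
    return f"arn:{partition}:iam::{account_id}:role/{role_name}"
```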
