awslabs / sensitive-data-protection-on-aws

License: Apache License 2.0

Shell 0.75% Python 41.53% TypeScript 52.71% Dockerfile 0.07% HTML 0.08% SCSS 1.26% JavaScript 3.60%
analytics aws gdpr glue-job pii-detection rds s3 security-audit sensitive-data serverless

sensitive-data-protection-on-aws's Introduction

Sensitive Data Protection Solution on AWS

Secure sensitive data across multiple AWS accounts, including PII.


Introduction

The Sensitive Data Protection on AWS solution allows enterprise customers to create data catalogs, discover, protect, and visualize sensitive data across multiple AWS accounts. The solution eliminates the need for manual tagging to track sensitive data such as Personal Identifiable Information (PII) and classified information.

The solution provides an automated approach to data protection with a self-service web application. You can perform regular or on-demand sensitive data discovery jobs using your own data classification templates. Moreover, you can access metrics such as the total number of sensitive data entries stored in all your AWS accounts, which accounts contain the most sensitive data, and the data source where the sensitive data is located.

For more information about the solution, please refer to our website.

Summary Dashboard
PII Data Identifiers Data Catalog Management

Quick deployment

This project is an AWS Cloud Development Kit (CDK) project written in TypeScript. If you want to use the solution without building the entire project, you can deploy it with the AWS CloudFormation template in about 20 minutes. Follow the Implementation Guide to deploy the solution in your AWS account.

Architecture

The solution uses AWS Glue to acquire data catalogs in the monitored account(s) and invokes Glue jobs to detect sensitive data such as PII. The distributed Glue job runs in each monitored account, and the admin account holds a centralized data catalog of the data stores across AWS accounts.

architecture

  1. The Application Load Balancer distributes the solution's frontend web UI assets hosted in AWS Lambda.
  2. An identity provider handles user authentication.
  3. The AWS Lambda functions are packaged as Docker images and stored in Amazon ECR (Elastic Container Registry).
  4. The backend Lambda function is a target for the Application Load Balancer.
  5. The backend Lambda function invokes AWS Step Functions in monitored accounts for sensitive data detection.
  6. In the AWS Step Functions workflow, the AWS Glue Crawler runs to take inventory of the data sources, and the inventory is stored in the Glue Database as metadata tables.
  7. Step Functions sends Amazon SQS messages to the detection job queue after the Glue job has run.
  8. A Lambda function processes the messages from Amazon SQS.
  9. The detection results are queried with Amazon Athena and saved to a MySQL instance in Amazon RDS.
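
Step 8 above can be sketched as a small Lambda-style handler. This is only an illustration, not the solution's actual function: the `Records` and `body` keys follow the standard SQS event shape that Lambda receives, while the function names and the `job_id` and `state` message fields are hypothetical.

```python
import json
from typing import Any, Dict, List


def extract_detection_messages(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Parse the JSON body of every SQS record in a Lambda event."""
    messages = []
    for record in event.get("Records", []):
        try:
            messages.append(json.loads(record["body"]))
        except (KeyError, TypeError, json.JSONDecodeError):
            # Skip malformed records instead of failing the whole batch.
            continue
    return messages


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, int]:
    """Hypothetical Lambda entry point: report how many messages were parsed."""
    messages = extract_detection_messages(event)
    return {"processed": len(messages)}


# A fake event with one valid and one malformed record (field names are assumptions).
sample_event = {
    "Records": [
        {"body": json.dumps({"job_id": 1, "state": "SUCCEEDED"})},
        {"body": "not json"},
    ]
}
print(handler(sample_event))  # {'processed': 1}
```

A real handler would act on each parsed message, for example by triggering the query described in step 9, rather than only counting them.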

License

Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. Licensed under the Apache License Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at

http://www.apache.org/licenses/

or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied. See the License for the specific language governing permissions and limitations under the License.

sensitive-data-protection-on-aws's People

Contributors

530051970, amazon-auto, aws-cloudfront-extension-bot, chenhaiyun, dependabot[bot], icykallen, ninglu, nowfox, o0oooo, rrxie, sussii, yanbasic

sensitive-data-protection-on-aws's Issues

ALB assigned public subnets instead of private ones

When the admin stack is deployed with an internal ALB, it chooses public subnets instead of private ones.

Reproduction Steps

  1. Add public and private subnets as parameters to the admin stack
  2. Choose AlbInternetFacing - No
  3. Open the internal ALB -> network mappings
  4. See public subnets

What did you expect to happen?

The network mapping for the ALB should contain only private subnets

What actually happened?

Public subnets are assigned

Environment

The latest version of the Admin stack (Version 1.0.0-4e8a780)
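
The expected behavior can be stated as a small selection rule. The sketch below is not taken from the admin stack template; the `Subnet` type, the `select_alb_subnets` function, and the subnet IDs are hypothetical and only illustrate what the report expects when AlbInternetFacing is set to No.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Subnet:
    subnet_id: str
    is_public: bool


def select_alb_subnets(subnets: List[Subnet], internet_facing: bool) -> List[str]:
    """Return public subnets for an internet-facing ALB, private ones otherwise."""
    return [s.subnet_id for s in subnets if s.is_public == internet_facing]


# Placeholder subnet IDs, used only for this illustration.
subnets = [
    Subnet("subnet-public-a", True),
    Subnet("subnet-private-a", False),
    Subnet("subnet-private-b", False),
]
print(select_alb_subnets(subnets, internet_facing=False))
# ['subnet-private-a', 'subnet-private-b']
```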

Other

Fresh installation returns errors on first login

❓ General Issue

The Question

Hello team,

We deployed the admin stack via CloudFormation, and it successfully deployed all required resources.
When we try to log in via the portal URL, we get random errors without any context, which look like this:
[image: error screenshot]

The Portal Lambda function is OK
The Portal configuration Lambda is OK
However, we found errors in the SDPS-Admin-APIAPIFunction719F975A-G3Vu0hXVstFp Lambda function, which we suspect cause these UI errors:


2023-09-07 12:00:07,165 [ERROR] exception_handler.py exception_handler() L36
Traceback (most recent call last):
  File "/opt/python/requests/models.py", line 971, in json
    return complexjson.loads(self.text, **kwargs)
  File "/var/runtime/simplejson/__init__.py", line 525, in loads
    return _default_decoder.decode(s)
  File "/var/runtime/simplejson/decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "/var/runtime/simplejson/decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/python/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/opt/python/starlette/middleware/base.py", line 106, in __call__
    response = await self.dispatch_func(request, call_next)
  File "/var/task/main.py", line 87, in validate
    validate_result = __validate_token(token, jwt_claims)
  File "/var/task/main.py", line 119, in __validate_token
    return __online_validate(token, jwt_claims)
  File "/var/task/main.py", line 163, in __online_validate
    json_response = requests.post(url, data, headers=headers).json()
  File "/opt/python/requests/models.py", line 975, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Nothing in the logs explains which input data leads to this error.

We tried to add an AWS account via the agent stack, but we can't see it in the UI.
According to the CloudWatch logs, there are no errors in the other components.

Is this a known issue?
How can we troubleshoot this?
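
One way to make this kind of failure easier to diagnose is to check the reply before parsing it. The traceback shows `.json()` failing at `line 1 column 1 (char 0)`, which means the response body was empty or not JSON. The sketch below is not the project's code: `parse_idp_response` and `TokenValidationError` are hypothetical names, and it assumes the caller passes in the status code and body text of the token validation response.

```python
import json
from typing import Any, Dict


class TokenValidationError(Exception):
    """Raised when the token validation reply cannot be used."""


def parse_idp_response(status_code: int, body: str) -> Dict[str, Any]:
    """Return the parsed JSON object, or raise an error that says what came back."""
    snippet = body[:200]  # keep log lines short
    if status_code != 200:
        raise TokenValidationError(
            f"identity provider returned HTTP {status_code}: {snippet!r}"
        )
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        raise TokenValidationError(
            f"identity provider returned a non-JSON body: {snippet!r}"
        ) from exc
    if not isinstance(payload, dict):
        raise TokenValidationError(f"unexpected JSON payload: {snippet!r}")
    return payload


# An empty body, as the traceback suggests, now yields a readable message.
try:
    parse_idp_response(200, "")
except TokenValidationError as err:
    print(err)
```

With the `requests` library, the two arguments would come from `response.status_code` and `response.text`, so the log would show the status and the raw body instead of only a decode error.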

Environment

AWS organization
IT stack deployed to the delegated admin account
Admin stack deployed to the same delegated admin account
Agent stack deployed to another AWS account
We use the latest version of the stack

Other information

We enabled DEBUG logging for the API Lambda, and it complains about the token:

2023-09-07 13:07:27,690 [DEBUG] main.py validate() L86 token not in list
