finraos / herd-mdl
Herd-MDL, a turnkey managed data lake in the cloud. See https://finraos.github.io/herd-mdl/ for more information.
License: Apache License 2.0
As Herd Admin I want to run MDLT on any MDL environment to ensure that MDL stack is fully functional
Acceptance Criteria
Technical Notes
As a Herd OSS data publisher I want to upload a sample data file and make it available for users
See https://github.com/FINRAOS/herd/wiki/Publishing-Sample-Data for some details
Acceptance Criteria
As Herd-MDL User I want all endpoints to use a certificate from a trusted authority so I can reduce the likelihood of any challenges with end users connecting to endpoints with self-signed certificates
Pre-requisite - user specifies they want to create stack with certificate and authentication; user supplies appropriately wildcarded certificate from Trusted Authority; user supplies certificate private key
Acceptance Criteria
Note - MLiy team to verify in integrated demo environment
Describe the bug
When I execute a CFT, I get an S3 conflicting conditional operation error. This started happening recently.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
A CloudFormation template to spin up w/o any error
Details and/or logs
I have spun this up in my account many times. Lately it always fails with this error.
Additional context
As Herd PO I want to reduce AWS costs without making functional compromise
Later we might need to take additional cost-saving steps but initially let's see what we can do without compromising anything.
Acceptance Criteria
Technical notes:
As a Herd-MDL Administrator I want to view clean, organized logs in the CloudWatch console.
In #17 we introduced Lambdas that clean up certain AWS resources when a stack is deleted. Each of these Lambdas writes to its own log group, which makes the CloudWatch console look noisy.
Acceptance Criteria
...during/after Authorization re-factor
Need to have one user that is read-only in both Herd and BDSQL
As a Herd-MDL OSS contributor I want to build an MDL stack from Herd-MDL code I forked and modified so I can dev/build/test contributions
NOTE - Only applies to Herd-MDL code (CFTs and scripts) -- does not need to work with customized Metastor and BDSQL code. These are handled in other stories
Acceptance Criteria
Technical Notes
Hi team,
Following the instructions at https://finraos.github.io/herd-mdl/docs.html, we are trying to install FINRA Herd-MDL using the "Basic Install" YAML file (installMDL.yaml).
However, we are hitting the issue shown in the attached screenshot, even though we have PowerUser privileges.
Any suggestions would be greatly appreciated. If this is not the right forum, could you please point me to the right one?
Thanks in advance.
As Metastore developer, I want to obtain Metastore artifacts from Sonatype
Currently artifacts are only available in GitHub
Acceptance Criteria
As an MDL admin I want to perform an upgrade to Herd in an existing MDL stack
This is part of a series of stories that will introduce and incrementally build out the capability to maintain and manage MDL instances.
Acceptance Criteria
Upgrade capability exists that will replace existing Herd version with new version
Persistent state information in Herd is untouched - Herd registration data in RDS, Herd configuration data in RDS and app server configs
Herd smoke test runs successfully
Note - the following aspects are not included. These will be addressed in future stories:
Not integrated with Herd-MDL code
Not tested
Just experimental, working
Steps
Describe the bug
While following the steps for "Basic Install" I get the following error.
Embedded stack arn:aws:cloudformation:us-east-1:############:stack/herd-mdl-MdlStack-XOBVK3U2F2DE/########-####-####-####-############ was not successfully created: The following resource(s) failed to create: [PrerequisitesSecondaryStack].
I am using a personal account as root access.
To Reproduce
Steps to reproduce the behavior:
Embedded stack arn:aws:cloudformation:us-east-1:############:stack/herd-mdl-MdlStack-XOBVK3U2F2DE/########-####-####-####-############ was not successfully created: The following resource(s) failed to create: [PrerequisitesSecondaryStack].
Expected behavior
I expect the CloudFormation template to complete successfully with a 'CREATE_COMPLETE' message.
Details and/or logs
Additional context
Just realized that I changed the DeployComponent to Herd. I modified the steps above.
Observed in 1.2 stack
Steps to reproduce
Defective behavior
<?xml version="1.1" encoding="UTF-8" standalone="yes"?>
<errorInformation>
  <statusCode>500</statusCode>
  <statusDescription>Internal Server Error</statusDescription>
  <message>Failed to delete keys/key versions with prefix "reg-demo/exchange/source/txt/demo-data/schm-v0/data-v0/transaction-date=2018-09-20/" from bucket "REDACTED". Reason: One or more objects could not be deleted (Service: null; Status Code: 200; Error Code: null; Request ID: 4B4BD720C0A1E854; S3 Extended Request ID: JqmMvj+wp4HpmWnXzMoUQbY3ymeEkGOCGZSWJQmxhZvv8uz9bwE5kadhIJyi7n4j4qx7silZ53o=)</message>
  <messageDetails>
    <message>One or more objects could not be deleted (Service: null; Status Code: 200; Error Code: null; Request ID: 4B4BD720C0A1E854; S3 Extended Request ID: JqmMvj+wp4HpmWnXzMoUQbY3ymeEkGOCGZSWJQmxhZvv8uz9bwE5kadhIJyi7n4j4qx7silZ53o=)</message>
  </messageDetails>
</errorInformation>
Desired behavior
Observed in MDL 1.2 release
Steps to reproduce
Defective Behavior
Desired Behavior
As Herd Product Owner I want Herd-MDL to use the AWS ElasticSearch service for Herd indexed search filters
Currently Herd-MDL stands up EC2s, installs ElasticSearch on these EC2s, and sets up security groups that allow Herd to talk to ElasticSearch. Instead we want Herd-MDL to stand up an AWS ElasticSearch domain.
Acceptance Criteria
As Basic User I only want to access publicly available services and data
This is currently working but we need to add negative cases to testing
Acceptance Criteria
Observed in MDL:
Herd fails to launch EMR clusters in regions other than the default (us-east-1)
Steps to reproduce
Install MDL in us-west-2 (or any other region)
Defective Behavior
Metastor (nested) stack fails to create because cluster creation fails. The underlying issue is that while trying to launch an EMR cluster to process Herd objects it looks for subnets in the us-east-1 region which are not available.
Desired Behavior
EMR cluster-create 'action' looks for subnets in the region where the stack is deployed.
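A minimal sketch of one way to avoid assuming us-east-1: derive the region from the stack's own ARN and hand it to whatever code looks up subnets. This is not Herd's implementation; the function name `region_from_arn` and the sample ARN are hypothetical, and the only fact relied on is the standard ARN layout `arn:partition:service:region:account-id:resource`.

```python
def region_from_arn(arn: str) -> str:
    """Return the region field of an AWS ARN.

    ARNs have the form arn:partition:service:region:account-id:resource.
    """
    parts = arn.split(":", 5)
    if len(parts) < 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    region = parts[3]
    if not region:
        # Some global services leave the region field empty.
        raise ValueError(f"ARN has no region component: {arn!r}")
    return region


# Hypothetical stack ARN for a deployment in us-west-2.
stack_arn = "arn:aws:cloudformation:us-west-2:123456789012:stack/herd-mdl/abc"
print(region_from_arn(stack_arn))
```

The resulting region string could then be passed to the subnet lookup instead of a fixed default.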
Describe the bug
It appears that the mdl.yml file contains a duplicate line. When I try to open the YAML file in CloudFormation Designer, I get the following error.
4/8/2019, 1:15:38 PM - Cannot render the template because of an error.: YAMLException: duplicated mapping key at line 308, column 119: ... Environment, /S3/URL/Shepherd]] ^
Lines 302 and 308 look to be identical. Once line 302 is removed the error no longer appears.
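A duplicate like this can be caught before opening the Designer. The following is a rough, standard-library-only sketch that flags repeated keys within the same block mapping; it is a heuristic, not a YAML parser. It assumes space indentation and plain unquoted keys, and it ignores flow mappings (`{...}`) and block scalars. The function name and the sample template are made up for illustration.

```python
import re

# Matches a plain block-mapping key such as "Name:" or "Name: value".
KEY_RE = re.compile(r"^([A-Za-z0-9_./-]+):(?:\s|$)")


def find_duplicate_keys(text):
    """Return (key, first_line, duplicate_line) for repeated block-mapping keys."""
    duplicates = []
    stack = []  # currently open mappings, each as [indent, {key: line_number}]
    for lineno, raw in enumerate(text.splitlines(), start=1):
        body = raw.lstrip(" ")
        if not body or body.startswith("#"):
            continue
        indent = len(raw) - len(body)
        # Close mappings nested deeper than this line.
        while stack and stack[-1][0] > indent:
            stack.pop()
        starts_item = body.startswith("- ")
        if starts_item:
            # A list item begins a fresh mapping after the dash.
            body = body[2:].lstrip(" ")
            indent = len(raw) - len(body)
        match = KEY_RE.match(body)
        if not match:
            continue
        if starts_item or not stack or stack[-1][0] != indent:
            stack.append([indent, {}])
        seen = stack[-1][1]
        key = match.group(1)
        if key in seen:
            duplicates.append((key, seen[key], lineno))
        else:
            seen[key] = lineno
    return duplicates


# Hypothetical template fragment with a repeated output name.
sample = "\n".join([
    "Outputs:",
    "  BucketName:",
    "    Value: a",
    "  BucketUrl:",
    "    Value: b",
    "  BucketName:",
    "    Value: c",
])
print(find_duplicate_keys(sample))
```

For real templates a full YAML parser with duplicate-key checking is more reliable; this scan is only meant as a quick pre-check.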
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The designer should open w/o any error. Should also display the diagram.
Details and/or logs
Additional context
Using https://github.com/FINRAOS/herd-mdl/releases/tag/mdl-v1.4.0
As a Data Publisher I want to indicate the SME for a Data Entity so users know who to contact with questions
Acceptance Criteria
Technical notes
As a Herd-MDL admin, I want to be able to pick up a new Herd version
Currently this would involve building a whole stack. This story is to upgrade Herd in place.
Acceptance Criteria
Technical note
As OSS Herd-MDL user I want clean, understandable CFTs
These items were taken from the engineering wish list
Acceptance Criteria
CloudFormation installation does not complete when 'EnableSSLAndAuth' is enabled. Error 500 occurs during configureAndStartHerd.sh in the 'enableSSLAndAuth' section (lines 182~189).
The log output says: Please update "herd.notification.user.namespace.authorization.change.message.definitions" configuration entry.
This contradicts the description of that field in the code configuration (ConfigurationValue.java): "There is no default value which will cause no messages to be sent."
CloudFormation output - HerdEC2Stack
Timestamp | Logical ID | Status | Status Reason |
---|---|---|---|
2019-07-29 10:13:19 UTC-0600 | eval-MdlStack-1T46P4XH491ZZ-HerdEC2Stack-19E38V7YEU6OA | CREATE_FAILED | The following resource(s) failed to create: [HerdWaitCondition]. |
2019-07-29 10:13:18 UTC-0600 | HerdWaitCondition | CREATE_FAILED | WaitCondition timed out. Received 0 conditions when expecting 1 |
Herd codedeploy
[2019-07-29 16:05:20.869] [d-19OME41WN][stdout]07/29/2019 16:05:20 *** ERROR *** curl --request POST --header 'Content-Type: application/json' --data '
{
"userNamespaceAuthorizationKey": {
"userId": "mdl_user",
"namespace": "MDL"
},
"namespacePermissions": [
"READ",
"WRITE"
]
}
' https://AWSALBevalHerdeval-1761683017.us-east-1.elb.amazonaws.com/herd-app/rest/userNamespaceAuthorizations --insecure has failed with error 500
Herd application logs
Jul-29-2019 16:05:20.805 [ajp-nio-8009-exec-2] DEBUG org.finra.herd.ui.RequestLoggingFilter.logRequest userId=admin_user - HTTP Request [uri=/herd-app/rest/userNamespaceAuthorizations;method=POST;client=10.0.10.189;session=E1E4FB1B9FEEF0BCE3E0A27064AEE840;payload=
{
"userNamespaceAuthorizationKey": {
"userId": "mdl_user",
"namespace": "MDL"
},
"namespacePermissions": [
"READ",
"WRITE"
]
}
]
Jul-29-2019 16:05:20.843 [ajp-nio-8009-exec-2] ERROR finra.herd.service.helper.HerdErrorInformationExceptionHandler.logError userId=admin_user - A general error occurred.
java.lang.IllegalStateException: Notification message destination must be specified. Please update "herd.notification.user.namespace.authorization.change.message.definitions" configuration entry.
at org.finra.herd.service.helper.notification.AbstractNotificationMessageBuilder.buildNotificationMessages(AbstractNotificationMessageBuilder.java:146) ~[herd-service-0.81.0.jar:?]
at org.finra.herd.service.helper.notification.NotificationMessageManager.buildNotificationMessages(NotificationMessageManager.java:100) ~[herd-service-0.81.0.jar:?]
at org.finra.herd.service.impl.MessageNotificationEventServiceImpl.processUserNamespaceAuthorizationChangeNotificationEvent(MessageNotificationEventServiceImpl.java:98) ~[herd-service-0.81.0.jar:?]
at org.finra.herd.service.impl.MessageNotificationEventServiceImpl$$FastClassBySpringCGLIB$$bb1fb8f9.invoke(<generated>) ~[herd-service-0.81.0.jar:?]
a
This is for Demo users and MDLT functional tests.
Acceptance Criteria
As an MDL Developer I want to make sure all non-CFT resources are cleaned up when stack is deleted
By default, deleting the stack tears down anything explicitly created in CFT, but it does not clean up resources created by other scripts that run during setup. This is okay if those resources go away with the CFT resources (e.g., anything on the EC2 instances), but not if they live in a persistent location like Parameter Store or Credstash.
Acceptance Criteria
Describe the bug
Running the uploader jar on a freshly created 1.4 no-auth stack. The uploader fails when trying to get temporary credentials from STS. See the exact error below.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Uploader succeeds, registers BData in Herd and places file in S3
Details and/or logs
Exact failure
Dec-22-2018 23:01:49.765 [main] WARN herd.tools.common.databridge.AutoRefreshCredentialProvider.getAwsCredential - Error retrieving new credentials. Message: Failed to get business object data upload credential, Status Code: 500, Status Description: Internal Server Error, Response Message: 1 validation error detected: Value '{{UPLOAD_ARN}}' at 'roleArn' failed to satisfy constraint: Member must have length greater than or equal to 20 (Service: AWSSecurityTokenService; Status Code: 400; Error Code: ValidationError;
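The value `{{UPLOAD_ARN}}` in the error suggests a template placeholder that was never substituted. As a hedged illustration, a small check like the one below could scan configuration text for leftover `{{NAME}}` tokens before the stack is considered ready. The function name and the sample configuration lines are hypothetical; only the `{{UPLOAD_ARN}}` token comes from the log above.

```python
import re

# Matches template placeholders of the form {{NAME}}.
PLACEHOLDER_RE = re.compile(r"\{\{\s*([A-Za-z0-9_]+)\s*\}\}")


def unresolved_placeholders(text):
    """Return the sorted names of {{NAME}} placeholders still present in text."""
    return sorted(set(PLACEHOLDER_RE.findall(text)))


# Hypothetical configuration snippet containing one unsubstituted value.
sample = "upload.role.arn={{UPLOAD_ARN}}\ns3.bucket=my-bucket\n"
print(unresolved_placeholders(sample))
```

An empty result means no placeholders of this form remain in the scanned text.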
Hey there! I noticed some possible problems in some code in this repo. A quick summary of a few of them is below, but let me know if you're interested in seeing a full report or talking about cloud security in general.
severity: serious
filename: ./mdl/src/main/cft/mdlCreateIAMRoles.yml
line number(s): [58, 116]
resource(s):
IAM role should not allow * resource with PassRole action on its permissions policy
severity: warning
filename: ./mdl/src/main/cft/mdlCreateIAMRoles.yml
line number(s): [27, 58, 116]
resource(s):
IAM role should not allow * resource on its permissions policy
severity: warning
filename: ./mdl/src/main/cft/mdlCreateIAMRoles.yml
line number(s): [27, 58, 116]
resource(s):
Resource found with an explicit name, this disallows updates that require replacement of this resource
severity: warning
filename: ./mdl/src/main/cft/mdlHerdRds.yml
line number(s): [153]
resource(s):
Resource found with an explicit name, this disallows updates that require replacement of this resource
severity: warning
filename: ./mdl/src/main/cft/mdlCreateNsAuthSyncUtil.yml
line number(s): [80]
resource(s):
Resource found with an explicit name, this disallows updates that require replacement of this resource
severity: warning
filename: ./mdl/src/main/cft/mdlMetastor.yml
line number(s): [141]
resource(s):
Resource found with an explicit name, this disallows updates that require replacement of this resource
severity: warning
filename: ./mdl/src/main/cft/mdlCreateKeyPair.yml
line number(s): [58]
resource(s):
IAM role should not allow * resource on its permissions policy
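To make the first finding concrete, here is a small sketch of how a policy document could be checked for an Allow statement that grants `iam:PassRole` on a `*` resource. It is not the scanner that produced the report above. The function name and the sample policy are invented for illustration, and the check covers only literal `iam:PassRole`, `iam:*`, and `*` actions, not other wildcard patterns.

```python
def wildcard_passrole_statements(policy):
    """Return indexes of Allow statements granting PassRole on resource "*"."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    findings = []
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        if isinstance(actions, str):
            actions = [actions]
        if isinstance(resources, str):
            resources = [resources]
        # IAM action names are case-insensitive.
        lowered = {action.lower() for action in actions}
        grants_passrole = bool(lowered & {"iam:passrole", "iam:*", "*"})
        if grants_passrole and "*" in resources:
            findings.append(index)
    return findings


# Hypothetical policy: the second statement is the risky one.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Allow", "Action": ["iam:PassRole"], "Resource": "*"},
    ],
}
print(wildcard_passrole_statements(sample_policy))
```

Scoping the `Resource` of such a statement to specific role ARNs is the usual way to address this class of finding.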
Observed in 1.2
Old MySQL version for Metastore caused failure during stack build
Trying to bring up the HERD-MDL Stack via the cloud formation script.
The stack fails to launch. The root cause is that BdsqlWaitCondition times out; it appears that it never receives a signal.
I can successfully launch:
PreRequisites, Herd, and MetaStore. When I attempt to launch BDSQL it appears to successfully create all the components as shown on the following screen shot.
I have traced this to line 330 of /herd-mdl/mdl/src/main/cft/mdlBdsql.yml where it appears that the handle is passed to the config step.
It's not clear what this is trying to achieve, however.
It is an argument to the script: /bootstrap/configurePresto.sh
Any suggestions?
As a Herd-MDL developer I want to use SecureString in CFT instead of using SSM from scripts
Currently all of our secrets have to be handled in scripts using SSM services. Now we can create and retrieve secrets directly from CFT, which simplifies the code significantly.
Acceptance Criteria
As a Herd-UI user I want to be able to link directly to a Herd-UI page so I can bookmark and/or send links.
Currently both scenarios listed in the acceptance criteria return a 404. We need them to work as documented here:
Acceptance Criteria
Technical Notes:
<IfModule mod_rewrite.c>
RewriteEngine on
# -- send bots/spiders/crawlers 404 page not found
RewriteCond %{HTTP_USER_AGENT} (bot|spider|crawler|search|find|walker) [NC]
RewriteRule .* - [R=404,L]
# -- HTTP to HTTPS
RewriteCond %{HTTP:X-Forwarded-Proto} ^http$
RewriteRule .* https://%{HTTP_HOST}%{REQUEST_URI} [R,L]
## custom rule for herdui
# Rewrite routes to index.html unless they are for specific files
# Don't rewrite files or directories
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
RewriteRule ^ - [L]
# If the requested resource doesn't exist, use index.html
RewriteRule ^ /index.html
</IfModule>
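The order of the rules above matters, so it may help to see the same decisions written as a plain function. This is only a model of the logic for reasoning and testing, not something Apache runs. The function name, the host name, and the sample paths are hypothetical, and `existing_paths` stands in for the files and directories under the document root.

```python
import re

# Same user-agent pattern as the first RewriteCond, case-insensitive ([NC]).
BOT_RE = re.compile(r"bot|spider|crawler|search|find|walker", re.IGNORECASE)


def route(path, host, user_agent, forwarded_proto, existing_paths):
    """Return (action, target), applying the rewrite rules in order."""
    if BOT_RE.search(user_agent):
        return ("status", "404")  # bots and crawlers get 404
    if forwarded_proto == "http":
        return ("redirect", f"https://{host}{path}")  # HTTP to HTTPS
    if path in existing_paths:
        return ("serve", path)  # real files and directories pass through
    return ("serve", "/index.html")  # everything else goes to index.html


# Hypothetical document root contents and deep link.
existing = {"/index.html", "/main.js"}
print(route("/data-entities/foo", "herd.example.com", "Mozilla/5.0", "https", existing))
```

A bookmarked deep link therefore falls through to `/index.html`, which is the behavior the story asks for.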
Hi Team,
I am facing another issue while creating the web server with the Basic Install CloudFormation template. Please see the attached screenshot.
I have made the modifications from our earlier discussion in #84:
I have set MetastorDBClass to - db.m5.large
Image Id - ami-0ad42f4f66f6c1cc9 - (ap-south-1 region)
Any suggestions please. Thanks in advance.