aws-s3-component's People

Contributors

denyshld, dvd-z, emptyinfinity, if0s, jhorbulyk, kirill-levitskiy, olegosh, phachem, shulkaolka, stas-fomenko, uaarsen, zubairov

aws-s3-component's Issues

Region for Bucket needs to be defined

Following changes to AWS documented here: https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingBucket.html#access-bucket-intro

Buckets created in Regions launched after March 20, 2019 are not reachable via the https://bucket.s3.amazonaws.com naming scheme

This bug has been identified by a client in the us-east-1 region, which appears to require calling the API using a special case: "If your bucket is in the US East (N. Virginia) Region, you must use the http://s3.amazonaws.com/bucket endpoint."

From the component code, I am unsure whether the URL we are calling is the "virtual-hosted-style URL" or the "path-style URL".

This needs to be investigated to determine whether the component has a fault or needs an amendment.
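
For reference, virtual-hosted-style requests look like https://<bucket>.s3.<region>.amazonaws.com/<key>, while path-style requests look like https://s3.<region>.amazonaws.com/<bucket>/<key>. A minimal sketch of how the addressing style can be controlled in the AWS SDK for JavaScript (v2) during the investigation (the region value is only an example, not the component's actual configuration):

    const AWS = require('aws-sdk');

    // By default the SDK uses virtual-hosted-style addressing;
    // s3ForcePathStyle switches it to path-style requests.
    const s3 = new AWS.S3({
      region: 'us-east-1',      // illustrative region
      s3ForcePathStyle: true,   // force path-style URLs
    });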

Improve error handling of Write action

STR:

  1. Create flow: step1: Simple trigger
  2. AWS S3: Write action: input a valid Bucket Name and folder

AR:
Currently, the Write action throws the following error when no attachment is found from the previous step:
Error! Cannot convert undefined or null to object

ER:
A user-friendly error should be thrown instead, e.g. "No attachment found from the previous step".
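
A minimal sketch of such a guard at the top of the action, assuming the standard elastic.io message shape where attachments live on msg.attachments (the names here are illustrative, not the component's actual code):

    // Hypothetical guard in the Write action's process function.
    async function process(msg, cfg) {
      const attachments = msg.attachments || {};
      if (Object.keys(attachments).length === 0) {
        // Fail fast with a readable message instead of
        // "Cannot convert undefined or null to object".
        throw new Error('No attachment found from the previous step');
      }
      // ...the existing upload logic would continue here
    }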

aws-s3. Update component to new sailor version / use new logger functionality / dockerize

DoD
Development:

  • Use the new logger
  • The component should be dockerised: add the "buildType": "docker" line to component.json (see the excerpt after this list)
  • The sailor version is changed from -.-.- to -.-.-
  • CHANGELOG.md is updated

QA:

  • Smoke test is finished
  • Regression testing is finished (optional)
  • Support is asked to deploy to the production stage (app.elastic.io) and/or another required stage
  • Component is actually deployed to the production stage (app.elastic.io) and/or another required stage
  • Smoke test after deploying to the production stage
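
For reference, a minimal excerpt of how that flag sits in component.json (the surrounding field is illustrative; all other fields are omitted):

    {
      "title": "AWS S3",
      "buildType": "docker"
    }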

Add a Trigger: Get New and Updated S3 Objects (aka Files) Polling With Object Moving

OIH Guideline

Not defined; this is a modification of the polling-by-timestamp case.

Config Fields

  • Bucket (Dynamic Dropdown: All S3 Buckets that the configured credential can write to)
  • Processed Folder Name (string)

Behavior

Code should:

  1. Learn the keys of all S3 Objects in the bucket that are not in the folder whose name is defined by the Processed Folder Name (i.e. the key does not start with /<processedFolderName>/)
  2. For each key, download the S3 Object and write it to Steward and then emit a message (one per Object) with the Steward URL, the S3 key and the S3 Metadata for the object.
  3. Move the Object into the folder. (The filename should stay the same, but /<processedFolderName>/ should be added to the beginning of the file name.) In S3 there is no native move operation, so this must be performed by doing a copy operation and then a delete operation on the original object (see the sketch after this list).
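
A rough sketch of the move step using the AWS SDK for JavaScript (v2); the helper name and the surrounding trigger plumbing are assumptions, not the component's actual implementation:

    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // S3 has no native "move", so copy the object into the processed
    // folder and then delete the original. Keys containing special
    // characters would need URL-encoding in CopySource.
    async function moveToProcessedFolder(bucket, key, processedFolderName) {
      const targetKey = `${processedFolderName}/${key}`;
      await s3.copyObject({
        Bucket: bucket,
        CopySource: `${bucket}/${key}`,
        Key: targetKey,
      }).promise();
      await s3.deleteObject({ Bucket: bucket, Key: key }).promise();
      return targetKey;
    }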

Output data

  • Steward URL
  • S3 Key
  • S3 Metadata

Gotchas

  • A paging mechanism may be needed for large numbers of changed objects.
  • If the trigger fails partway through polling, it should resume at the point where it failed.

Not in scope (Limitations that can be done in future iterations)

  • Support for objects that are so large, they can not fit in the memory of the component's docker container.
  • Objects that are larger than the max size allowed by Steward
  • Object compression (with zip, gzip, etc)
  • Object encryption

Add a Trigger: Get New and Updated S3 Objects (aka Files) Polling By Timestamp

OIH Guideline

Config Fields

  • Start Time
  • End Time
  • Bucket (Dynamic Dropdown: All S3 Buckets that the configured credential can read)

Behavior

Code should:

  1. Learn the keys of all S3 Objects in the bucket whose last modified timestamp falls within the polling window (see the sketch after this list)
  2. For each key, download the S3 Object and write it to Steward and then emit a message (one per Object) with the Steward URL, the S3 key and the S3 Metadata for the object.
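
A minimal sketch of the listing step with the AWS SDK for JavaScript (v2); the helper name is illustrative, and paging is omitted here (see the Gotchas below):

    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // Keep only the keys whose LastModified falls inside the polling window.
    async function keysModifiedBetween(bucket, startTime, endTime) {
      const { Contents } = await s3.listObjectsV2({ Bucket: bucket }).promise();
      return Contents
        .filter((o) => o.LastModified >= startTime && o.LastModified <= endTime)
        .map((o) => o.Key);
    }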

Output data

  • Steward URL
  • S3 Key
  • S3 Metadata

Gotchas

  • A paging mechanism may be needed for large numbers of changed objects.
  • If the trigger fails partway through polling, it should resume at the point where it failed.

Not in scope (Limitations that can be done in future iterations)

  • Support for objects that are so large, they can not fit in the memory of the component's docker container.
  • Objects that are larger than the max size allowed by Steward
  • Object compression (with zip, gzip, etc)
  • Object encryption

Don't create attachments that are too big

An error should be thrown before trying to create an attachment if the attachment would exceed the maximum allowed attachment size. The maximum allowed attachment size should be read from an environment variable.

Before writing any file from S3 to Steward, the size of the file should be checked. If it is above the threshold, an error should be emitted and no data transferred to Steward.
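
One way this check could look, assuming the AWS SDK for JavaScript (v2) and an environment variable holding the limit (the variable and helper names are illustrative):

    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // Hypothetical guard: reject files larger than the configured maximum
    // before any data is streamed to Steward.
    async function assertWithinAttachmentLimit(bucket, key) {
      const maxSize = Number(process.env.MAX_FILE_SIZE); // bytes, assumed env var
      const head = await s3.headObject({ Bucket: bucket, Key: key }).promise();
      if (head.ContentLength > maxSize) {
        throw new Error(`File ${key} is ${head.ContentLength} bytes, which exceeds the maximum attachment size of ${maxSize} bytes`);
      }
    }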

Invalid response for a non-deleted file in the Delete file action

Description: currently, the action returns the input of the filename field; according to the OIH Descriptions of standardized actions or triggers, it should return {} when an object was not deleted.

Preconditions: use creds from Bitwarden.

STR:

  1. Create flow: step1 - simple trigger
  2. step2 - AWS S3: Delete file action
    input:
    Bucket Name and Folder: "lloyds-dev" (valid input)
    filename: "wefxwfcecf" (invalid input)
  3. Retrieve sample

AR:
sample retrieved
output:
{ "filename": "wefxwfcecf" }

ER: {}
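
A sketch of the expected behaviour, assuming the AWS SDK for JavaScript (v2); because deleteObject succeeds even for missing keys, existence is checked first (illustrative only, not the component's code):

    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // Emit the filename only when an object was actually deleted,
    // otherwise emit an empty object as the OIH guideline requires.
    async function deleteFile(bucket, filename) {
      try {
        await s3.headObject({ Bucket: bucket, Key: filename }).promise();
      } catch (err) {
        if (err.code === 'NotFound') return {}; // nothing to delete
        throw err;
      }
      await s3.deleteObject({ Bucket: bucket, Key: filename }).promise();
      return { filename };
    }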

Add an action: Upsert S3 Object (aka File) By Key (aka ID/Filename)

OIH Guideline

Config Fields

None Required

Input Metadata

  • Should Be Dynamic
  • One input is the bucket name (enum, string, required). This input should be an enum where the values are dynamically generated based on the buckets that can be written to given the AWS access key in the credentials.
  • One input is the key name (string, required)
  • One input is the Steward URL to upload (string, required). Note: It is not possible to access attachments from the component immediately prior to another component, and attachment URLs can not be selected using the point-and-click link. This is something that will be inconvenient at first, but we will be able to overcome it as the mapper is improved.

Behavior

Some throwaway code that shows how to do this in JS is sketched after the list below.
Code should:

  1. Read the data from the Steward URL.
  2. Write the data to S3 using the bucket + key provided. If the key already exists, allow AWS to overwrite the object.
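
A throwaway sketch of those two steps, assuming the AWS SDK for JavaScript (v2) and Node's built-in https module (the helper name is illustrative; setting S3 metadata is out of scope):

    const https = require('https');
    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // Stream the attachment from its Steward URL straight into S3.
    // If the key already exists, S3 simply overwrites the object.
    function upsertObject(stewardUrl, bucket, key) {
      return new Promise((resolve, reject) => {
        https.get(stewardUrl, (res) => {
          s3.upload({ Bucket: bucket, Key: key, Body: res })
            .promise()
            .then(resolve, reject);
        }).on('error', reject);
      });
    }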

Output data

Emit an object whose body contains all of the Amazon headers in the response.

Gotchas

Not in scope (Limitations that can be done in future iterations)

  • Setting Object Metadata in S3
  • Support for objects that are so large, they can not fit in the memory of the component's docker container.
  • Objects more than 5 GB in size (so that they can always be uploaded in a single operation)
  • Support for Storage Classes
  • Support for Object Tagging
  • Object compression (with zip, gzip, etc)
  • Object encryption

Add an action: Lookup S3 Object (aka File) (at most one) By Bucket + Key (aka ID/Filename)

OIH Guideline: Lookup Object (at most 1) by ID

Config Fields

  • Allow ID to be omitted (dropdown/checkbox: yes/no)
  • Allow zero results (dropdown/checkbox: yes/no)

Input Metadata

  • Should Be Dynamic
  • One input is the bucket name (enum, string). This input should be an enum where the values are dynamically generated based on the buckets that can be read from given the AWS access key in the credentials.
  • One input is the key name (string)

Behavior

Code should:

  • Read the S3 Object from S3 that may exist at the bucket + key name provided and write it to Steward.
  • Handle the behaviors defined by Allow ID to be omitted and Allow zero results (see the sketch after this list)
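
A sketch of the lookup semantics with the AWS SDK for JavaScript (v2); the flag handling and helper name are assumptions, and writing the result to Steward is omitted:

    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // Fetch at most one object; when the key is absent, either return an
    // empty result or fail, depending on the "Allow zero results" setting.
    async function lookupObject(bucket, key, allowZeroResults) {
      try {
        const obj = await s3.getObject({ Bucket: bucket, Key: key }).promise();
        return { body: obj.Body, metadata: obj.Metadata };
      } catch (err) {
        if (err.code === 'NoSuchKey' && allowZeroResults) return {};
        throw err;
      }
    }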

Output data

  • Steward URL
  • S3 Key
  • S3 Metadata

Gotchas

  • Ensure that keys containing special characters are encoded properly.

Not in scope (Limitations that can be done in future iterations)

  • Support for objects that are so large, they can not fit in the memory of the component's docker container.
  • Objects more than 5 GB in size (so that they can always be retrieved in a single operation)
  • Object compression (with zip, gzip, etc)
  • Object encryption

Verify that the AWS S3 component can handle files of 1GB with default memory config

Other Feature Request

Description

We should check (on a staging environment) if the component can handle files of up to 1GB in size without the need to increase the allocated component memory beyond 250 MB.

Definition of Done

One of:

  • The component can manipulate files of 1 GB without needing to alter the component code or the available memory (only change the max file size env var.)
  • We make the needed code changes so that the component can manipulate files of 1 GB without needing to increase the available component memory
  • We identify a reason why files of 1 GB can not be manipulated without increasing the component memory.

Rename configuration fields in Lookup from table action

Description:
Currently, the Lookup from table action has config fields with incorrect names; they should be renamed:
From this table -> From this column
To this table -> To this column

Where to rename:

  • component.json
  • Readme.md

Revitalize this component

  • Make sure that this component adheres to the common definition of done.
  • Make sure that passwords in credentials are valid.
  • Use the latest versions of dependencies.
  • Mark actions that aren't sensible as deprecated.
  • Include relevant information in the Readme
  • Start component completeness matrix
  • Make sure S3 Credentials are in BitWarden

Action: Get filenames returns only first 1000 file names

Description: The Get filenames action returns only the first 1000 file names, so if the specified bucket contains more than 1000 files, the action will never retrieve anything beyond the first 1000 files without additional logic in the flow.

Precondition:
Creds in bitwarden - AWS PS Key
Use CrossFTP to open the service and check the files

STR:

  1. Create flow: Simple trigger
  2. Step2: aws-s3-component: action: Get filenames
    Input:
    Default Bucket Name: lloyds-dev
  3. Start flow

AR:
Get filenames returns only first 1000 file names

ER:

  1. The action should provide the possibility to choose an Emit type:
  • Emit 1000
  • Emit All
  • If Emit 1000 is selected and the bucket contains 2000 files: 1st flow run - 1000 filenames emitted, 2nd flow run - 1000 filenames emitted, 3rd flow run - nothing emitted (it should be checked in the SDK whether this is possible to implement)
  • If Emit All is selected and the bucket contains 2000 files: all filenames are retrieved during the first flow run (see the sketch at the end of this issue)

NOTE1:
Depending on the outcome of the SDK investigation, the ER may be changed to match the SDK's capabilities.

NOTE2:
https://stackoverflow.com/questions/12853476/amazon-s3-returns-only-1000-entries-for-one-bucket-and-all-for-another-bucket-u
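
For the Emit All case, a minimal sketch of the continuation-token pattern in the AWS SDK for JavaScript (v2) that works around the 1000-entry page limit (the helper name is illustrative; emit and snapshot handling are omitted):

    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // Follow the continuation token until the listing is exhausted.
    // "Emit 1000" would instead stop after one page and persist the
    // token for the next flow run.
    async function listAllKeys(bucket) {
      const keys = [];
      let token;
      do {
        const page = await s3.listObjectsV2({
          Bucket: bucket,
          ContinuationToken: token,
        }).promise();
        page.Contents.forEach((o) => keys.push(o.Key));
        token = page.NextContinuationToken;
      } while (token);
      return keys;
    }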

Revitalize component unit tests

Internal Issue/Enhancement

Description

Currently there is only one unit test; the actions and triggers are not covered.

Definition of Done

Unit tests exist for all actions and triggers.
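
A sketch of what one such test could look like, assuming mocha with chai and sinon and stubbing the AWS SDK v2 client; the action path, export, and calling convention are assumptions about the component's layout, not its actual code:

    const sinon = require('sinon');
    const { expect } = require('chai');
    const AWS = require('aws-sdk');
    // Hypothetical path/export; adjust to the component's real layout.
    const getFileNames = require('../lib/actions/getFileNames');

    describe('Get filenames action', () => {
      afterEach(() => sinon.restore());

      it('asks S3 for the configured bucket', async () => {
        const stub = sinon.stub(AWS.S3.prototype, 'listObjectsV2').returns({
          promise: () => Promise.resolve({ Contents: [{ Key: 'a.csv' }, { Key: 'b.csv' }] }),
        });
        const emitter = { emit: sinon.spy(), logger: { info: () => {}, error: () => {} } };
        await getFileNames.process.call(emitter, { body: {} }, { bucketName: 'my-bucket' });
        expect(stub.firstCall.args[0].Bucket).to.equal('my-bucket');
      });
    });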

Improve error handling of Get filenames action

Description: there is no correct error message in the case of invalid input in the bucketName field.

Preconditions: find creds in Bitwarden

STR:

  1. Create flow: step1: Simple trigger
  2. step2: AWS S3: Get filenames action
    Input:
    {
    "bucketName": "/Invalid_bucket_name_and_folder"
    }
  3. Retrieve sample

AR:
Error! Component did not send any data during timeout.

ER: something like "Invalid Bucket Name and folder has been specified"
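
A sketch of how the action could translate the underlying SDK failure into a readable message, assuming the AWS SDK for JavaScript (v2); the helper name and error mapping are illustrative:

    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // Map low-level S3 errors to a message the user can act on instead
    // of letting the component time out silently.
    async function listFileNames(bucketName) {
      try {
        const { Contents } = await s3.listObjectsV2({ Bucket: bucketName }).promise();
        return Contents.map((o) => o.Key);
      } catch (err) {
        if (err.code === 'NoSuchBucket' || err.code === 'InvalidBucketName') {
          throw new Error(`Invalid Bucket Name and folder has been specified: "${bucketName}"`);
        }
        throw err;
      }
    }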

Component doesn't work at the current version of platform

The component can be successfully deployed to the eio-platform, but the current version of the eio-platform doesn't support the CSVWriteHeader field type in an input configuration.

component.json:

        "csv": {
          "viewClass": "CSVWriteHeader",
          "label": "CSV Header",
          "placeholder": "Copy and paste here first (header) line of your CSV file here",
          "required": true
        }
