aws-s3-component's People

Contributors

denyshld, dvd-z, emptyinfinity, if0s, jhorbulyk, kirill-levitskiy, olegosh, phachem, shulkaolka, stas-fomenko, uaarsen, zubairov

aws-s3-component's Issues

Region for Bucket needs to be defined

Following changes to AWS documented here: https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingBucket.html#access-bucket-intro

Buckets created in Regions launched after March 20, 2019 are not reachable via the https://bucket.s3.amazonaws.com naming scheme

This bug has been identified by a client in the us-east-1 region, which appears to require calling the API using a special case: "If your bucket is in the US East (N. Virginia) Region, you must use the http://s3.amazonaws.com/bucket endpoint."

From the component code, I am unsure whether the URL we are calling is the "virtual-hosted-style URL" or the "path-style URL".

This needs to be investigated to determine whether the component has a fault or needs an amendment.
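
For reference, virtual-hosted-style requests look like https://<bucket>.s3.<region>.amazonaws.com/<key>, while path-style requests look like https://s3.<region>.amazonaws.com/<bucket>/<key>. A minimal sketch of how the addressing style can be controlled in the AWS SDK for JavaScript (v2) during the investigation (the region value is only an example, not the component's actual configuration):

    const AWS = require('aws-sdk');

    // By default the SDK uses virtual-hosted-style addressing;
    // s3ForcePathStyle switches it to path-style requests.
    const s3 = new AWS.S3({
      region: 'us-east-1',      // illustrative region
      s3ForcePathStyle: true,   // force path-style URLs
    });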

Improve error handling of Write action

STR:

  1. Create flow: step1: Simple trigger
  2. AWS S3: Write action: input a valid Bucket Name and folder

AR:
Currently, the Write action throws the following error when no attachment is found from the previous step:
Error! Cannot convert undefined or null to object

ER:
A user-friendly error should be thrown instead, e.g. "No attachment found from the previous step".
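
A minimal sketch of such a guard at the top of the action, assuming the standard elastic.io message shape where attachments live on msg.attachments (the names here are illustrative, not the component's actual code):

    // Hypothetical guard in the Write action's process function.
    async function process(msg, cfg) {
      const attachments = msg.attachments || {};
      if (Object.keys(attachments).length === 0) {
        // Fail fast with a readable message instead of
        // "Cannot convert undefined or null to object".
        throw new Error('No attachment found from the previous step');
      }
      // ...the existing upload logic would continue here
    }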

aws-s3. Update component to new sailor version / use new logger functionality / dockerize

DoD
Development:

  • Use the new logger
  • The component should be dockerised: add the "buildType": "docker" line to component.json (see the excerpt after this list)
  • The sailor version is changed from -.-.- to -.-.-
  • CHANGELOG.md is updated

QA:

  • Smoke test is finished
  • Regression testing is finished (optional)
  • Support is asked to deploy to the production stage (app.elastic.io) and/or another required stage
  • Component is actually deployed to the production stage (app.elastic.io) and/or another required stage
  • Smoke test after deploying to the production stage
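
For reference, a minimal excerpt of how that flag sits in component.json (the surrounding field is illustrative; all other fields are omitted):

    {
      "title": "AWS S3",
      "buildType": "docker"
    }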

Add a Trigger: Get New and Updated S3 Objects (aka Files) Polling With Object Moving

OIH Guideline

Not defined; this is a modification of the polling-by-timestamp case.

Config Fields

  • Bucket (Dynamic Dropdown: All S3 Buckets that the configured credential can write to)
  • Processed Folder Name (string)

Behavior

Code should:

  1. Learn the keys of all S3 Objects in the bucket that are not in the folder whose name is defined by the Processed Folder Name (i.e. the key does not start with /<processedFolderName>/)
  2. For each key, download the S3 Object and write it to Steward and then emit a message (one per Object) with the Steward URL, the S3 key and the S3 Metadata for the object.
  3. Move the Object into the folder. (The filename should stay the same, but /<processedFolderName>/ should be added to the beginning of the file name.) In S3 there is no native move operation, so this must be performed by doing a copy operation and then a delete operation on the original object (see the sketch after this list).
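
A rough sketch of the move step using the AWS SDK for JavaScript (v2); the helper name and the surrounding trigger plumbing are assumptions, not the component's actual implementation:

    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // S3 has no native "move", so copy the object into the processed
    // folder and then delete the original. Keys containing special
    // characters would need URL-encoding in CopySource.
    async function moveToProcessedFolder(bucket, key, processedFolderName) {
      const targetKey = `${processedFolderName}/${key}`;
      await s3.copyObject({
        Bucket: bucket,
        CopySource: `${bucket}/${key}`,
        Key: targetKey,
      }).promise();
      await s3.deleteObject({ Bucket: bucket, Key: key }).promise();
      return targetKey;
    }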

Output data

  • Steward URL
  • S3 Key
  • S3 Metadata

Gotchas

  • A paging mechanism may be needed for large numbers of changed objects.
  • If the trigger fails partway through polling, it should resume at the point where it failed.

Not in scope (Limitations that can be done in future iterations)

  • Support for objects that are so large, they can not fit in the memory of the component's docker container.
  • Objects that are larger than the max size allowed by Steward
  • Object compression (with zip, gzip, etc)
  • Object encryption

Add a Trigger: Get New and Updated S3 Objects (aka Files) Polling By Timestamp

OIH Guideline

Config Fields

  • Start Time
  • End Time
  • Bucket (Dynamic Dropdown: All S3 Buckets that the configured credential can read)

Behavior

Code should:

  1. Learn the keys of all S3 Objects in the bucket whose last modified timestamp falls within the polling window (see the sketch after this list)
  2. For each key, download the S3 Object and write it to Steward and then emit a message (one per Object) with the Steward URL, the S3 key and the S3 Metadata for the object.
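
A minimal sketch of the listing step with the AWS SDK for JavaScript (v2); the helper name is illustrative, and paging is omitted here (see the Gotchas below):

    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // Keep only the keys whose LastModified falls inside the polling window.
    async function keysModifiedBetween(bucket, startTime, endTime) {
      const { Contents } = await s3.listObjectsV2({ Bucket: bucket }).promise();
      return Contents
        .filter((o) => o.LastModified >= startTime && o.LastModified <= endTime)
        .map((o) => o.Key);
    }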

Output data

  • Steward URL
  • S3 Key
  • S3 Metadata

Gotchas

  • A paging mechanism may be needed for large numbers of changed objects.
  • If the trigger fails partway through polling, it should resume at the point where it failed.

Not in scope (Limitations that can be done in future iterations)

  • Support for objects that are so large, they can not fit in the memory of the component's docker container.
  • Objects that are larger than the max size allowed by Steward
  • Object compression (with zip, gzip, etc)
  • Object encryption

Don't create attachments that are too big

An error should be thrown before trying to create an attachment if the attachment would exceed the maximum allowed attachment size. The maximum allowed attachment size should be read from an environment variable.

Before writing any file from S3 to Steward, the size of the file should be checked. If it is above the threshold, an error should be emitted and no data transferred to Steward.
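
One way this check could look, assuming the AWS SDK for JavaScript (v2) and an environment variable holding the limit (the variable and helper names are illustrative):

    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // Hypothetical guard: reject files larger than the configured maximum
    // before any data is streamed to Steward.
    async function assertWithinAttachmentLimit(bucket, key) {
      const maxSize = Number(process.env.MAX_FILE_SIZE); // bytes, assumed env var
      const head = await s3.headObject({ Bucket: bucket, Key: key }).promise();
      if (head.ContentLength > maxSize) {
        throw new Error(`File ${key} is ${head.ContentLength} bytes, which exceeds the maximum attachment size of ${maxSize} bytes`);
      }
    }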

Invalid response for a non-deleted file in the Delete file action

Description: currently, the action returns the input of the filename field; according to the OIH Descriptions of standardized actions or triggers, it should return {} when an object was not deleted.

Preconditions: use creds from Bitwarden.

STR:

  1. Create flow: step1 - simple trigger
  2. step2 - AWS S3: Delete file action
    input:
    Bucket Name and Folder: "lloyds-dev" (valid input)
    filename: "wefxwfcecf" (invalid input)
  3. Retrieve sample

AR:
sample retrieved
output:
{ "filename": "wefxwfcecf" }

ER: {}
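
A sketch of the expected behaviour, assuming the AWS SDK for JavaScript (v2); because deleteObject succeeds even for missing keys, existence is checked first (illustrative only, not the component's code):

    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // Emit the filename only when an object was actually deleted,
    // otherwise emit an empty object as the OIH guideline requires.
    async function deleteFile(bucket, filename) {
      try {
        await s3.headObject({ Bucket: bucket, Key: filename }).promise();
      } catch (err) {
        if (err.code === 'NotFound') return {}; // nothing to delete
        throw err;
      }
      await s3.deleteObject({ Bucket: bucket, Key: filename }).promise();
      return { filename };
    }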

Add an action: Upsert S3 Object (aka File) By Key (aka ID/Filename)

OIH Guideline

Config Fields

None Required

Input Metadata

  • Should Be Dynamic
  • One input is the bucket name (enum, string, required). This input should be an enum where the values are dynamically generated based on the buckets that can be written to given the AWS access key in the credentials.
  • One input is the key name (string, required)
  • One input is the Steward URL to upload (string, required). Note: It is not possible to access attachments from the component immediately prior to another component, and attachment URLs can not be selected using the point-and-click link. This is something that will be inconvenient at first, but we will be able to overcome it as the mapper is improved.

Behavior

Some throwaway code that shows how to do this in JS is sketched after the list below.
Code should:

  1. Read the data from the Steward URL.
  2. Write the data to S3 using the bucket + key provided. If the key already exists, allow AWS to overwrite the object.
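
A throwaway sketch of those two steps, assuming the AWS SDK for JavaScript (v2) and Node's built-in https module (the helper name is illustrative; setting S3 metadata is out of scope):

    const https = require('https');
    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // Stream the attachment from its Steward URL straight into S3.
    // If the key already exists, S3 simply overwrites the object.
    function upsertObject(stewardUrl, bucket, key) {
      return new Promise((resolve, reject) => {
        https.get(stewardUrl, (res) => {
          s3.upload({ Bucket: bucket, Key: key, Body: res })
            .promise()
            .then(resolve, reject);
        }).on('error', reject);
      });
    }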

Output data

Emit an object whose body contains all of the Amazon headers in the response.

Gotchas

Not in scope (Limitations that can be done in future iterations)

  • Setting Object Metadata in S3
  • Support for objects that are so large, they can not fit in the memory of the component's docker container.
  • Objects more than 5 GB in size (so that they can always be uploaded in a single operation)
  • Support for Storage Classes
  • Support for Object Tagging
  • Object compression (with zip, gzip, etc)
  • Object encryption

Add an action: Lookup S3 Object (aka File) (at most one) By Bucket + Key (aka ID/Filename)

OIH Guideline: Lookup Object (at most 1) by ID

Config Fields

  • Allow ID to be omitted (dropdown/checkbox: yes/no)
  • Allow zero results (dropdown/checkbox: yes/no)

Input Metadata

  • Should Be Dynamic
  • One input is the bucket name (enum, string). This input should be an enum where the values are dynamically generated based on the buckets that can be read from given the AWS access key in the credentials.
  • One input is the key name (string)

Behavior

Code should:

  • Read the S3 Object from S3 that may exist at the bucket + key name provided and write it to Steward.
  • Handle the behaviors defined by Allow ID to be omitted and Allow zero results (see the sketch after this list)
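
A sketch of the lookup semantics with the AWS SDK for JavaScript (v2); the flag handling and helper name are assumptions, and writing the result to Steward is omitted:

    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // Fetch at most one object; when the key is absent, either return an
    // empty result or fail, depending on the "Allow zero results" setting.
    async function lookupObject(bucket, key, allowZeroResults) {
      try {
        const obj = await s3.getObject({ Bucket: bucket, Key: key }).promise();
        return { body: obj.Body, metadata: obj.Metadata };
      } catch (err) {
        if (err.code === 'NoSuchKey' && allowZeroResults) return {};
        throw err;
      }
    }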

Output data

  • Steward URL
  • S3 Key
  • S3 Metadata

Gotchas

  • Ensure that keys containing special characters are encoded properly.

Not in scope (Limitations that can be done in future iterations)

  • Support for objects that are so large, they can not fit in the memory of the component's docker container.
  • Objects more than 5 GB in size (so that they can always be retrieved in a single operation)
  • Object compression (with zip, gzip, etc)
  • Object encryption

Verify that the AWS S3 component can handle files of 1GB with default memory config

Other Feature Request

Description

We should check (on a staging environment) if the component can handle files of up to 1GB in size without the need to increase the allocated component memory beyond 250 MB.

Definition of Done

One of:

  • The component can manipulate files of 1 GB without needing to alter the component code or the available memory (only change the max file size env var.)
  • We make the needed code changes so that the component can manipulate files of 1 GB without needing to increase the available component memory
  • We identify a reason why files of 1 GB can not be manipulated without increasing the component memory.

Rename configuration fields in Lookup from table action

Description:
Currently, the Lookup from table action has config fields with incorrect names; they should be renamed:
From this table -> From this column
To this table -> To this column

Where to rename:

  • component.json
  • Readme.md

Revitalize this component

  • Make sure that this component adheres to the common definition of done.
  • Make sure that passwords in credentials are valid.
  • Use the latest versions of dependencies.
  • Mark actions that aren't sensible as deprecated.
  • Include relevant information in the Readme
  • Start component completeness matrix
  • Make sure S3 Credentials are in BitWarden

Action: Get filenames returns only first 1000 file names

Description: The Get filenames action returns only the first 1000 file names, so if the specified bucket contains more than 1000 files, the action will never retrieve anything beyond the first 1000 files without additional logic in the flow.

Precondition:
Creds in bitwarden - AWS PS Key
Use CrossFTP to open the service and check the files

STR:

  1. Create flow: Simple trigger
  2. Step2: aws-s3-component: action: Get filenames
    Input:
    Default Bucket Name: lloyds-dev
  3. Start flow

AR:
Get filenames returns only first 1000 file names

ER:

  1. The action should provide the possibility to choose an Emit type:
  • Emit 1000
  • Emit All
  • If Emit 1000 is selected and the bucket contains 2000 files: 1st flow run - 1000 filenames emitted, 2nd flow run - 1000 filenames emitted, 3rd flow run - nothing emitted (it should be checked in the SDK whether this is possible to implement)
  • If Emit All is selected and the bucket contains 2000 files: all filenames are retrieved during the first flow run (see the sketch at the end of this issue)

NOTE1:
Depending on the outcome of the SDK investigation, the ER may be changed to match the SDK's capabilities.

NOTE2:
https://stackoverflow.com/questions/12853476/amazon-s3-returns-only-1000-entries-for-one-bucket-and-all-for-another-bucket-u
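
For the Emit All case, a minimal sketch of the continuation-token pattern in the AWS SDK for JavaScript (v2) that works around the 1000-entry page limit (the helper name is illustrative; emit and snapshot handling are omitted):

    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // Follow the continuation token until the listing is exhausted.
    // "Emit 1000" would instead stop after one page and persist the
    // token for the next flow run.
    async function listAllKeys(bucket) {
      const keys = [];
      let token;
      do {
        const page = await s3.listObjectsV2({
          Bucket: bucket,
          ContinuationToken: token,
        }).promise();
        page.Contents.forEach((o) => keys.push(o.Key));
        token = page.NextContinuationToken;
      } while (token);
      return keys;
    }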

Revitalize component unit tests

Internal Issue/Enhancement

Description

Currently there is only one unit test; the actions and triggers are not covered.

Definition of Done

Unit tests exist for all actions and triggers.
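
A sketch of what one such test could look like, assuming mocha with chai and sinon and stubbing the AWS SDK v2 client; the action path, export, and calling convention are assumptions about the component's layout, not its actual code:

    const sinon = require('sinon');
    const { expect } = require('chai');
    const AWS = require('aws-sdk');
    // Hypothetical path/export; adjust to the component's real layout.
    const getFileNames = require('../lib/actions/getFileNames');

    describe('Get filenames action', () => {
      afterEach(() => sinon.restore());

      it('asks S3 for the configured bucket', async () => {
        const stub = sinon.stub(AWS.S3.prototype, 'listObjectsV2').returns({
          promise: () => Promise.resolve({ Contents: [{ Key: 'a.csv' }, { Key: 'b.csv' }] }),
        });
        const emitter = { emit: sinon.spy(), logger: { info: () => {}, error: () => {} } };
        await getFileNames.process.call(emitter, { body: {} }, { bucketName: 'my-bucket' });
        expect(stub.firstCall.args[0].Bucket).to.equal('my-bucket');
      });
    });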

Improve error handling of Get filenames action

Description: there is no correct error message in the case of invalid input in the bucketName field.

Preconditions: find creds in Bitwarden

STR:

  1. Create flow: step1: Simple trigger
  2. step2: AWS S3: Get filenames action
    Input:
    {
    "bucketName": "/Invalid_bucket_name_and_folder"
    }
  3. Retrieve sample

AR:
Error! Component did not send any data during timeout.

ER: something like "Invalid Bucket Name and folder has been specified"
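
A sketch of how the action could translate the underlying SDK failure into a readable message, assuming the AWS SDK for JavaScript (v2); the helper name and error mapping are illustrative:

    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    // Map low-level S3 errors to a message the user can act on instead
    // of letting the component time out silently.
    async function listFileNames(bucketName) {
      try {
        const { Contents } = await s3.listObjectsV2({ Bucket: bucketName }).promise();
        return Contents.map((o) => o.Key);
      } catch (err) {
        if (err.code === 'NoSuchBucket' || err.code === 'InvalidBucketName') {
          throw new Error(`Invalid Bucket Name and folder has been specified: "${bucketName}"`);
        }
        throw err;
      }
    }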

Component doesn't work at the current version of platform

The component can be successfully deployed to the eio-platform, but the current version of the eio-platform doesn't support the CSVWriteHeader field type in an input configuration.

component.json:

        "csv": {
          "viewClass": "CSVWriteHeader",
          "label": "CSV Header",
          "placeholder": "Copy and paste here first (header) line of your CSV file here",
          "required": true
        }
