Comments (7)
I'll start by looking into how configuration works. Then I can estimate the time I need to commit.
from invenio-s3.
After looking into how invenio-s3 and invenio-files-rest are actually implemented, I've come to the conclusion that this is not that easy to implement. I see two goals to achieve:
- it should be possible to configure one S3 endpoint per defined S3 location
- there should be a default configuration for S3 locations without a configuration of their own
The problem I see is that there is no obvious relation between a location and the files configuration at all. That's surely not a problem if the location is just a directory somewhere in the file system, but to define an S3 endpoint you need at least the URL of the S3 server and a secret. The storage factory creates the storage class from the fileurl alone.
I think that should be refactored by passing at least the name of the location, so there is a key to distinguish the S3 configuration properties. An S3 endpoint URL per location would then be configured by the property invenio_s3.config.S3_ENDPOINT_URL.location_name (with the fallback invenio_s3.config.S3_ENDPOINT_URL if there is just one S3 endpoint). The other configuration options would be similar.
from invenio-s3.
@ppanero @wgresshoff what about moving the endpoint to the location URI? (I probably thought about this at some point)
Having something like s3://myserver.com/b1, the only open question I have would be the default base URL, i.e. if one uses AWS S3 you don't actually need to specify the URL of the server; it's already set internally by the boto3 library.
I don't really like the idea of adding a configuration variable to solve this because it'll add complexity to location creation, which we want to avoid. Plus, I see this as the kind of thing that I would forget and then wonder for a day or two why my files are not in the right place.
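To make the open question about the endpoint-in-URI idea concrete, here is a minimal sketch using only the standard library. The helper name `split_location_uri` is hypothetical and not part of invenio-s3; it only shows that the first component of an `s3://` URI cannot tell a server from a bucket by itself.

```python
from urllib.parse import urlsplit


def split_location_uri(uri):
    """Split an s3:// URI into (first component, remaining path).

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(uri)
    if parts.scheme != "s3":
        raise ValueError(f"not an s3 URI: {uri!r}")
    return parts.netloc, parts.path.lstrip("/")


# Endpoint embedded in the URI, as proposed above.
with_server = split_location_uri("s3://myserver.com/b1")

# Plain AWS-style URI: the first component is the bucket, not a server.
without_server = split_location_uri("s3://b1")
```

Both calls return a non-empty first component, so some extra rule or setting would still be needed to decide whether it names a server or a bucket.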
from invenio-s3.
@egabancho @ppanero I need to clarify my idea: I would like to add the location name to the parameter list of the storage factory, so the storage knows which configuration to use. The location itself would not be changed. The base URL without credentials won't help (there could be different credentials for the same server URL with different storage prices).
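A rough sketch of what passing the location name to the factory could mean. Everything here is an assumption for illustration: `S3Settings`, `SETTINGS_BY_LOCATION`, the location names, and the `storage_factory` signature are hypothetical and do not reflect the actual invenio-files-rest factory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class S3Settings:
    endpoint_url: str
    access_key_id: str
    secret_access_key: str


# Two hypothetical locations on the same server with different credentials,
# which is the case a URL alone cannot distinguish.
SETTINGS_BY_LOCATION = {
    "standard": S3Settings("https://s3.example.org", "key-a", "secret-a"),
    "archive": S3Settings("https://s3.example.org", "key-b", "secret-b"),
}


def storage_factory(fileurl, location_name):
    """Return the parameters a storage class would be built from.

    Hypothetical signature: the location name is the lookup key.
    """
    settings = SETTINGS_BY_LOCATION[location_name]
    return {
        "fileurl": fileurl,
        "endpoint_url": settings.endpoint_url,
        "access_key_id": settings.access_key_id,
        "secret_access_key": settings.secret_access_key,
    }


standard = storage_factory("s3://b1/file.bin", "standard")
archive = storage_factory("s3://b1/file.bin", "archive")
```

The two results share an endpoint URL but carry different credentials, so only the location name tells them apart.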
from invenio-s3.
You are definitely right, URL without credentials ... not gonna work.
You probably already discussed this with @ppanero, but could you put here an example of what this configuration would look like? It'd really help me understand what you have in mind.
from invenio-s3.
@egabancho we did not really sketch how it would look at that point. I thought of having something like for ES, where you can specify multiple hosts, each with its credentials.
However, I think we would end up with the same issue @wgresshoff is mentioning, and would need to change the files factory.
from invenio-s3.
Sorry, I needed some time, but finally...
OK, the example: if there are two locations defined, say with the names amazon_aws and cephfs, the configuration would look like this:
S3_ENDPOINT_URL.amazon_aws = https://amazon.com
S3_ENDPOINT_URL.cephfs = https://ceph.com
S3_ACCESS_KEY_ID.amazon_aws = xyz
S3_ACCESS_KEY_ID.cephfs = abc
S3_SECRET_ACCESS_KEY.amazon_aws = sdsdsdsds
S3_SECRET_ACCESS_KEY.cephfs = abcabdafah
And finally there might be some default configuration (which would surely lead to some nice errors if forgotten) as fallback:
S3_ENDPOINT_URL = https://default.com
S3_ACCESS_KEY_ID = ghz
S3_SECRET_ACCESS_KEY = lkhjsdafkjhfdskjh
So everywhere the config is consulted, the location_name must be known. This leads to some more code in invenio-s3, but a function to read the configuration is rather simple to implement (and only needed in invenio-s3).
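A minimal sketch of such a read function, assuming the dotted key scheme from the example above and a plain dict standing in for the application config. The function name `s3_config` is hypothetical, and the values are the placeholders from this thread.

```python
# Placeholder values copied from the example in this thread.
CONFIG = {
    "S3_ENDPOINT_URL.amazon_aws": "https://amazon.com",
    "S3_ENDPOINT_URL.cephfs": "https://ceph.com",
    "S3_ACCESS_KEY_ID.amazon_aws": "xyz",
    "S3_ACCESS_KEY_ID.cephfs": "abc",
    "S3_ENDPOINT_URL": "https://default.com",
    "S3_ACCESS_KEY_ID": "ghz",
}


def s3_config(config, name, location_name=None):
    """Read a setting for a location, falling back to the default key."""
    if location_name is not None:
        key = f"{name}.{location_name}"
        if key in config:
            return config[key]
    # Raises KeyError when neither a per-location nor a default value exists.
    return config[name]


ceph_url = s3_config(CONFIG, "S3_ENDPOINT_URL", "cephfs")
fallback_url = s3_config(CONFIG, "S3_ENDPOINT_URL", "unknown_location")
```

Letting the missing-default case raise `KeyError` is one possible design choice; it makes a forgotten fallback fail loudly at lookup time.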
from invenio-s3.