lendup / fs2-blobstore
Minimal, idiomatic, stream-based Scala interface for key/value store implementations
License: Apache License 2.0
This test will fail:
it should "extend a path with no key correctly" in {
val path = Path("some-bucket") / "key"
path must be(Path("some-bucket", "key", None, false, None))
}
with: "some-bucket//key" was not equal to "some-bucket/key"
The extension code doesn't check for an empty initial key, so you end up with a double '/' in the resulting path.
I'm happy to put together a PR to fix this. Either it can check for a blank key string when extending, or I could change Path.key to Option[String] to make the empty case more explicit.
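A minimal sketch of the blank-key check, using a simplified Path model (assumed shape; the real Path carries more fields than shown here):

```scala
// Simplified Path (assumed shape): skip the '/' separator when the current
// key is empty, so Path("some-bucket") / "key" renders as "some-bucket/key",
// not "some-bucket//key".
final case class Path(root: String, key: String = "") {
  def /(segment: String): Path =
    if (key.isEmpty) copy(key = segment)
    else copy(key = s"$key/$segment")

  override def toString: String =
    if (key.isEmpty) root else s"$root/$key"
}
```

The alternative, `key: Option[String]`, would make the empty case impossible to overlook at the type level instead of relying on a runtime check.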
My company is currently still stuck on Scala 2.11, and is likely to skip 2.12 entirely. I need access to the TransferManager version of the S3 blobstore, but there is no release on Central for 0.3.x. Can you please publish this version?
I'm willing to send this in as a PR. How do you want it? One Travis build that runs the current behavior plus the S3 tests?
An alternative approach would be to move the integration tests to src/e2e so they can be invoked separately more easily, and have two Travis builds, one for test and one for e2e. The downside to this approach is that it makes coverage collection more complex.
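For reference, one way the split might look in sbt is the built-in IntegrationTest configuration (this is a sketch using sbt's default src/it layout rather than src/e2e, and the scalatest coordinates are illustrative):

```scala
// build.sbt sketch (assumed, not this project's actual build): wire
// integration tests into their own configuration so `sbt test` and
// `sbt it:test` can run as separate Travis jobs.
lazy val root = (project in file("."))
  .configs(IntegrationTest)
  .settings(
    Defaults.itSettings, // picks up sources from src/it/scala
    libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.8" % "it,test"
  )
```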
Even if it is good practice to provide a size when calling put - especially for S3Store, since Amazon's S3 client will buffer everything in memory before uploading the data - S3Store.put should support paths with no size.
Sample code to reproduce the issue here.
Codecov has not been updated in the last few PRs; we need to investigate.
S3Store invokes TransferManager#upload, whose documentation says:
* When uploading data from a stream, callers <b>must</b> supply the size of
* data in the stream through the content length field in the
* <code>ObjectMetadata</code> parameter.
* If no content length is specified for the input
* stream, then TransferManager will attempt to buffer all the stream
* contents in memory and upload the data as a traditional, single part
* upload. Because the entire stream contents must be buffered in memory,
* this can be very expensive, and should be avoided whenever possible.
When you upload too much this way, you get a java.lang.OutOfMemoryError: Java heap space.
It is possible to do multipart uploads with fixed memory; for reference, this is what Alpakka does for akka-streams:
"Use the low-level API when you [..] do not know the size of the upload data in advance"
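The fixed-memory approach boils down to splitting the incoming bytes into bounded parts and issuing one UploadPart call per part (wrapped in InitiateMultipartUpload / CompleteMultipartUpload). A minimal sketch of just the chunking half, with no AWS calls:

```scala
// Sketch: group an arbitrarily large byte source into fixed-size parts, so
// each upload step only ever holds `partSize` bytes in memory. Each
// Array[Byte] would back one UploadPartRequest; note that S3 requires every
// part except the last to be at least 5 MiB.
def toParts(bytes: Iterator[Byte], partSize: Int): Iterator[Array[Byte]] =
  bytes.grouped(partSize).map(_.toArray)
```

Because only one part is materialized at a time, heap usage stays proportional to the part size rather than the object size.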
This will help downstream folks avoid sbt eviction errors
The SftpStore is a bit clunky at the moment: the apply method takes an F[ChannelSftp], but closes more resources than it acquires - it acquires a ChannelSftp, but closes the session as well.
I think closing the session is fine, but then we should take an F[Session] and just manage the channels internally. As it stands it's impossible to perform concurrent operations on the store, since one channel can handle only a single write at a time, and the store only works on one channel. And if you try to make multiple stores for multiple channels on the client side, each store closes the shared session.
Maybe make something like a pool of ChannelSftp, where put would acquire a channel from the pool and use it. Since it's blocking IO, making the pool unbounded makes sense.
Any thoughts on this?
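A rough shape of that pool, as a plain sketch (hypothetical names, not the library's API; the type parameter A stands in for ChannelSftp):

```scala
import java.util.concurrent.ConcurrentLinkedQueue

// Sketch of an unbounded object pool: hand out an idle instance if one
// exists, otherwise create a new one; released instances are queued for
// reuse. With blocking IO there is little point capping the pool below the
// number of threads that would otherwise block waiting for a channel.
final class UnboundedPool[A](create: () => A) {
  private val idle = new ConcurrentLinkedQueue[A]()

  def acquire(): A = Option(idle.poll()).getOrElse(create())

  def release(a: A): Unit = { idle.add(a); () }
}
```

put would then acquire() a channel, perform the write, and release() it - one channel per in-flight operation instead of one per store, and the session is only closed when the store itself is.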
A changelog in the project would be really helpful for making sure that updates are safe - either via GitHub release tags or a file in the repo.
Although there is an explicit 5GB limit on put, the S3 docs don't mention anything about a limit for downloads. I haven't tested this, but it's possible that there is a size limit on get as well.
We mitigated the size limit issue on uploads by using TransferManager. We should look into using the TransferManager.download(...) method when calling get, and whether that is even necessary.
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html#download-com.amazonaws.services.s3.model.GetObjectRequest-java.io.File-
https://aws.amazon.com/blogs/developer/aws-sdk-for-java-2-0-developer-preview/
The new AWS SDK supports CompletableFuture APIs and a truly non-blocking backend. It would be great to eventually get support for that here.
I can't find the 0.3.x releases of the project.
Tried with:
"com.lendup.fs2-blobstore" %% "s3" % "0.3.0" // "0.3.+" // "0.3.0-SNAPSHOT"
No luck with any of them.
I browsed the Maven repository and found there is no 0.3.0 version yet. However, I could find a SNAPSHOT on Sonatype. Is this intended? Do I need to add the Sonatype snapshots repository to use the 0.3.0 version?
I really need it, because I am having trouble uploading some big files to S3, and the latest release states this is now possible.
Thanks in advance for the help.
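For what it's worth, depending on a -SNAPSHOT does require the Sonatype snapshots resolver in addition to the dependency line; a sketch (the coordinates below are copied from the attempt above, not confirmed as published):

```scala
// build.sbt sketch: snapshot artifacts live in the Sonatype snapshots
// repository, which sbt does not search by default.
resolvers += "sonatype-snapshots" at "https://oss.sonatype.org/content/repositories/snapshots"
libraryDependencies += "com.lendup.fs2-blobstore" %% "s3" % "0.3.0-SNAPSHOT"
```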
We published the first artifact from the new repo to Maven central yesterday. Can we archive this repository and point to the new one?
https://github.com/fs2-blobstore/fs2-blobstore
https://search.maven.org/search?q=g:com.github.fs2-blobstore
On fs2 1.1, cats-effect 2.0, etc.
A milestone release for now, since the dependencies are milestones, but it would be good to get a newer cross-built release up just to allow cross compiling.