xebia-functional / fetch

Simple & Efficient data access for Scala and Scala.js

Home Page: https://xebia-functional.github.io/fetch/

License: Apache License 2.0

Scala 86.05% HTML 2.89% JavaScript 2.64% SCSS 8.42%

scala scala-js cats monix data data-fetching parallelism concurrency sequencing monads

fetch's Issues

Improve `DataSource#fetch` signature

It should receive a NonEmptyList, or maybe a NonEmptySet, to better encode the invariants in the types.

Clarify in the docs that the cache is used throughout the whole execution of a fetch

Spotted in the gitter channel

The big thing for me was that it wasn't clear that the cache could be reused. It talks about the state monad and the cache being immutable which caused me to make some bad assumptions, even after diving into the code.

This should be more explicit and clear to not confuse people.

The current published documentation is out of sync and does not match the API as it stands in the latest release. It also points to an old cats version where several combinators and imports seem to have changed.

Provide `MonadError[M, Throwable]` instances for `Id` and `Eval`

Pass state to the data sources in a type-safe way

Currently, DataSource#fetch looks like this:

trait DataSource[Identity, Result] {
  def fetch(ids: NonEmptyList[Identity]): Query[Map[Identity, Result]]
}

The fetch method of a data source just receives a non-empty list of identities. If we want to inject some state into our data sources (for example an HTTP client for the data sources that make HTTP calls, a connection pool for the data sources that query a database, and so on) we must use the mechanisms that Scala gives us (implicits et al.), since it's not directly supported by the library.

It may make sense to support passing state in a type-safe way to the data sources and provide the concrete values when running a Fetch, much like Haxl does. I'm not sure what it would look like yet, but when running a fetch we'd have to provide an additional value with the injected state. Can the type system make sure that we are providing the state for every data source used inside a fetch? Should this be supported by the library?
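One way the idea above could look, as a minimal sketch only. It uses plain Scala with no library dependencies; `EnvDataSource`, `FakeHttpClient`, `UserSource` and `run` are hypothetical names invented for illustration and are not part of the Fetch API. The data source declares the environment type it needs, and the compiler rejects a run that supplies a value of any other type.

```scala
// Hypothetical sketch, not the Fetch API: a data source whose fetch
// needs an environment of type Env, supplied only when it is run.
object EnvSketch {
  // Env is the injected state (an HTTP client, a connection pool, ...).
  trait EnvDataSource[Env, Identity, Result] {
    def name: String
    def fetch(ids: List[Identity]): Env => Map[Identity, Result]
  }

  // Stand-in for a real client; purely illustrative.
  final case class FakeHttpClient(baseUrl: String)

  object UserSource extends EnvDataSource[FakeHttpClient, Int, String] {
    val name = "users"
    def fetch(ids: List[Int]): FakeHttpClient => Map[Int, String] =
      client => ids.map(id => id -> s"${client.baseUrl}/users/$id").toMap
  }

  // Running requires a value of exactly the Env type the source declares.
  def run[Env, I, R](ds: EnvDataSource[Env, I, R], ids: List[I])(env: Env): Map[I, R] =
    ds.fetch(ids)(env)

  val result: Map[Int, String] =
    run(UserSource, List(1, 2))(FakeHttpClient("http://example.test"))
}
```

This only covers a single data source. Checking that the state for every data source used inside a composed fetch is provided, which is the open question above, would need more machinery than this sketch shows.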

Give users the ability to configure maximum batch sizes

Provide examples of usage with other libraries in the documentation

It'd be interesting to write a few tutorials about using Fetch with libraries for reading data (from databases, HTTP services and beyond) that fit well with Fetch. A few come to mind, feel free to add more:

Doobie for DB access (transacting to Eval makes it really easy to integrate)
GitHub4s for accessing the GitHub API

Updated to Monix 2.2.3

PR already in sbt catalyst extras 47degrees/sbt-org-policies#13

Update to latest cats and monix

Now that cats 0.7.2 and the final monix 2.0.0 have been released, please consider upgrading the dependencies and publishing a stable version. I tried it myself but got stuck on all the tailRecM stuff in the new cats Monad.

Scalaz integration

FetchMonadError[Task] and other implicits
Applicative and Monad instances for Fetch
Applicative instance for Task with a non-sequential ap

Make it possible to inject a prepopulated cache into `Fetch.run`

SOE with deep stack.

A user has privately reported the following exception when running fetch over a large workflow.
The issue seems to be in the inspection step.

 java.lang.StackOverflowError
                at scala.Tuple2.productElement(Tuple2.scala:20)
                at scala.util.hashing.MurmurHash3.productHash(MurmurHash3.scala:64)
                at scala.util.hashing.MurmurHash3$.productHash(MurmurHash3.scala:211)
                at scala.runtime.ScalaRunTime$._hashCode(ScalaRunTime.scala:168)
                at scala.Tuple2.hashCode(Tuple2.scala:20)
                at scala.runtime.ScalaRunTime$.hash(ScalaRunTime.scala:206)
                at scala.collection.immutable.HashMap.elemHashCode(HashMap.scala:80)
                at scala.collection.immutable.HashMap.computeHash(HashMap.scala:89)
                at scala.collection.immutable.HashMap.get(HashMap.scala:54)
                at fetch.InMemoryCache.get(cache.scala:41)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:380)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:377)
                at cats.free.FreeTopExt$.modify(freeinspect.scala:51)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:385)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:377)
                at cats.free.FreeTopExt$.modify(freeinspect.scala:51)
                at cats.free.FreeTopExt$.modify(freeinspect.scala:52)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:386)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:377)
                at cats.free.FreeTopExt$.modify(freeinspect.scala:51)
                at cats.free.FreeTopExt$.modify(freeinspect.scala:52)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:386)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:377)
                at cats.free.FreeTopExt$.modify(freeinspect.scala:51)
                at cats.free.FreeTopExt$.modify(freeinspect.scala:52)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:386)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:377)
                at cats.free.FreeTopExt$.modify(freeinspect.scala:51)
                at cats.free.FreeTopExt$.modify(freeinspect.scala:52)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:386)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:377)
                at cats.free.FreeTopExt$.modify(freeinspect.scala:51)
                at cats.free.FreeTopExt$.modify(freeinspect.scala:52)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:386)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:377)
                at cats.free.FreeTopExt$.modify(freeinspect.scala:51)
                at cats.free.FreeTopExt$.modify(freeinspect.scala:52)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:386)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:377)
                at cats.free.FreeTopExt$.modify(freeinspect.scala:51)
                at cats.free.FreeTopExt$.modify(freeinspect.scala:52)
                at fetch.FetchInterpreters$$anon$4.apply(interpreters.scala:386)
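
The repeating `apply` / `modify` frames in the trace above suggest plain recursion over a deeply nested structure. A common general remedy for that pattern is trampolining; the following is a minimal sketch using the standard library's `scala.util.control.TailCalls`. It is not the fix applied in the library, and `Tree`, `Leaf`, `Node`, `depthNaive`, `depthSafe` and `deepTree` are hypothetical names used only to illustrate the technique.

```scala
import scala.util.control.TailCalls._

object TrampolineSketch {
  // Hypothetical nested structure standing in for a deep tree of operations.
  sealed trait Tree
  case object Leaf extends Tree
  final case class Node(child: Tree) extends Tree

  // Naive recursion: one JVM frame per level, so very deep trees can overflow.
  def depthNaive(t: Tree): Int = t match {
    case Leaf        => 0
    case Node(child) => 1 + depthNaive(child)
  }

  // Trampolined: each step returns a TailRec value that `result` evaluates
  // in a loop, keeping pending work on the heap instead of the call stack.
  def depthSafe(t: Tree): TailRec[Int] = t match {
    case Leaf        => done(0)
    case Node(child) => tailcall(depthSafe(child)).map(_ + 1)
  }

  // Builds a tree of the given depth iteratively.
  def deepTree(levels: Int): Tree =
    (1 to levels).foldLeft(Leaf: Tree)((acc, _) => Node(acc))
}
```

Calling `TrampolineSketch.depthSafe(tree).result` runs the computation; the naive version is kept only for contrast.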

Add Peter to contributors list

Add License text and license headers

Apache License v2.

We should use the SBT plugin for headers so we ensure every file in the project has the copyright notice.

Create PDF documentation

PDF documentation would be useful when there's no internet connection.

I've created some scripts which work well enough for cats, dogs and fetch, as shown below:
https://github.com/frgomes/debian-bin/blob/master/bash_30pdf.sh
https://github.com/frgomes/debian-bin/blob/master/bash_30httrack.sh
https://github.com/frgomes/debian-bin/blob/master/bash_31makepdf_cats.sh
https://github.com/frgomes/debian-bin/blob/master/bash_31makepdf_dogs.sh
https://github.com/frgomes/debian-bin/blob/master/bash_31makepdf_fetch.sh

I'm not suggesting you guys adopt these scripts.
However, anyone willing to do that may eventually find some pointers here.

Release new stable version?

It's been 72 commits since the last one.

Parallel Joins

After the amazing work by @peterneyens in #84, we've acquired some tech debt that we might consider addressing, based on the conversation in gitter:
https://gitter.im/typelevel/cats?at=5828ac28e097df7575a76077

Adelbert suggested implementing it with Free#fold.

Publish version for Scala 2.12

Don't require passing an Option to an asynchronous query callback

Since we can infer from the invoked callback/errback whether the fetch succeeded or not, we shouldn't require the callback type to be Option[Identity] => Unit; Identity => Unit should suffice.

Support the same Scala versions as cats

Identities with optional results

The current semantics of Fetch are that identities must have a result. If a result is missing, the fetch execution short-circuits and fails. Maybe we should support the notion of optional fetches, much like Clump does with the .optional method.
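
The difference between the two semantics can be sketched in plain Scala. This is an illustration under stated assumptions, not the Fetch or Clump API: `fetchAll`, `required` and `optional` are hypothetical helpers, and `Either[String, _]` stands in for whatever failure type a real implementation would use.

```scala
object OptionalSketch {
  // Hypothetical lookup: a missing identity is simply absent from the map.
  def fetchAll(ids: List[Int]): Map[Int, String] =
    Map(1 -> "one", 2 -> "two").filter { case (k, _) => ids.contains(k) }

  // Current semantics as described above: a missing result fails the fetch.
  def required(id: Int): Either[String, String] =
    fetchAll(List(id)).get(id).toRight(s"missing identity: $id")

  // Optional semantics: a missing result becomes None instead of a failure.
  def optional(id: Int): Either[String, Option[String]] =
    Right(fetchAll(List(id)).get(id))
}
```

With the optional variant the caller decides what absence means, at the cost of an `Option` in the result type.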

Make sure that the split batches are run concurrently

Right now, when a data source configures a maximum batch size and a batched request to that data source is split into multiple batches, the queries are run sequentially. We want to perform all the batch requests to the data source at the same time, not sequentially.
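
The desired behaviour can be sketched with standard-library `Future`s. This is a sketch under assumptions, not the library's interpreter: `queryBatch` and `fetchInBatches` are hypothetical, and the global execution context is used only to keep the example short.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object BatchSketch {
  // Hypothetical batched query; a real data source would hit a DB or service.
  def queryBatch(ids: List[Int]): Future[Map[Int, String]] =
    Future(ids.map(id => id -> s"result-$id").toMap)

  def fetchInBatches(ids: List[Int], maxBatchSize: Int): Future[Map[Int, String]] = {
    // Every batch is started here, before any of them is waited on.
    val batches: List[Future[Map[Int, String]]] =
      ids.grouped(maxBatchSize).map(queryBatch).toList
    // Future.sequence only collects the already-running Futures.
    Future.sequence(batches).map(_.foldLeft(Map.empty[Int, String])(_ ++ _))
  }

  // Blocking is for demonstration only.
  def demo(): Map[Int, String] =
    Await.result(fetchInBatches(List(1, 2, 3, 4, 5), maxBatchSize = 2), 5.seconds)
}
```

The sequential behaviour described above corresponds to chaining the batches with `flatMap`, where each batch starts only after the previous one completes.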

Scaladoc for API reference

Contributing documentation

Improve `DataSource.fetchMulti` signature so implementors don't need to coerce

Setup Continuous Integration

@fedefernandez could you activate the project in Travis-CI please?

Acknowledgments

I'm not sure I'm correct, but this lib seems to be inspired by similar projects like Clump and Stitch. Could you please add an "Acknowledgments" section to the documentation mentioning it?

BTW, the generic approach for any monad and the typeclasses for sources are very nice! 👏

Thanks! :)

Improve and document the environment information gathering

We need to store accurate information about the execution plan of a fetch: which operations were performed (fetch one, fetch many, concurrent fetch), whether the data was served from the cache or not, the rounds of execution (steps in the computation that read data) performed, and so on. We do this to some extent now, but it is not very useful in its current state. We should document what you can do with the environment and improve the docs about diagnosing fetch failures, which improved reporting will make easier. This will also ease the implementation of #11.

Don't depend on the cache contents for simplifying joined trees

Ensure that the interpreter is stack-safe

Migrate group id com.47deg

We need to migrate new releases to the group id com.47deg

Make README.md a mirror of the documentation

The current README just points to the docs with a link, but users already on GitHub couldn't care less about jumping through extra hoops. Also, the file won't be indexed by Scaladex or crawlers that use the README as the de facto standard for docs. Ideally, after tut processing, the file copied inside the Jekyll site would also replace the README.md at the root, and an index would be auto-generated and placed atop the file to jump to sections.

Make `Query#async` accept a `Either[FetchError, A] => Unit`

To make the callback function a bit more "ergonomic". It is the same type as the one used to model callbacks in freestyle-async or fs2.
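
A minimal sketch of what a single `Either`-based callback could look like, using only the standard library. `SketchError` is a stand-in for the library's `FetchError`, and `async` here is a hypothetical helper rather than the real `Query#async`; the result is bridged to a `Future` purely for illustration.

```scala
import scala.concurrent.{Future, Promise}

object CallbackSketch {
  // Stand-in for the library's error type; hypothetical here.
  final case class SketchError(message: String) extends Exception(message)

  // One callback covers both outcomes instead of a callback/errback pair.
  type Callback[A] = Either[SketchError, A] => Unit

  // Hypothetical async constructor taking a single Either callback.
  def async[A](register: Callback[A] => Unit): Future[A] = {
    val p = Promise[A]()
    // Left becomes a failed Future, Right a successful one.
    register(result => p.complete(result.toTry))
    p.future
  }

  val ok: Future[Int]     = async[Int](cb => cb(Right(42)))
  val failed: Future[Int] = async[Int](cb => cb(Left(SketchError("boom"))))
}
```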

Provide facilities for visualizing fetch executions

Track test coverage

Documentation microsite

Make the applicative and Cartesian instances for Fetch be automatically concurrent

Don't cast when we can use the type system better

Due to my lack of familiarity with Scala's type system I've used casts in a few places. I'm not sure which ones we can avoid, but it'd be desirable to use the type system better instead of casting and losing type safety to a certain degree.

Use Simulacrum for implementing typeclasses and their syntax

See Simulacrum.

More flexible error handling

Currently, for running a fetch to a target monad M[_], we need an instance of MonadError[M, Throwable]. Should we leave the error type open to user-defined types instead of hard-coding Throwable?

Enforce users to declare a name for every DataSource

Since it turns out that there is no way to get a unique name with the current class name approach, we should require DataSource implementations to define the name method from now on.
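
A small sketch of why a class-derived name is not enough and what an abstract `name` buys. `NamedSource`, `TableSource` and `cacheKey` are hypothetical names for illustration, not the library's definitions.

```scala
object NameSketch {
  trait NamedSource[Identity, Result] {
    // Abstract on purpose: every implementation must pick its own name.
    def name: String
    def fetch(ids: List[Identity]): Map[Identity, Result]
  }

  // Two instances of the same class: the class name cannot tell them apart.
  final class TableSource(table: String) extends NamedSource[Int, String] {
    val name: String = s"table:$table"
    def fetch(ids: List[Int]): Map[Int, String] =
      ids.map(id => id -> s"$table#$id").toMap
  }

  val users  = new TableSource("users")
  val orders = new TableSource("orders")

  // A hypothetical cache key: the source name paired with the identity.
  def cacheKey[I, R](ds: NamedSource[I, R], id: I): (String, I) = (ds.name, id)
}
```

Here `users` and `orders` share a class, so any key derived from the class name would collide for the same identity, while the explicit names keep the keys distinct.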

Add automatic formatting to the project

Update to cats 0.9.0

This would be nice because it is currently blocking us.

License

Support data sources that can only be queried asynchronously

Since currently data sources assume you can return a computation that synchronously gives you the result, the usefulness of the library is very limited in Scala.js. It would be desirable to support both synchronous and asynchronous data sources, maybe using monix-eval's Task type instead of Eval in data source's fetch methods? Task is more general than Eval, although we'd lose the ability to run fetches synchronously.
