xsc / claro
Powerful Data Access for Clojure
Home Page: http://xsc.github.io/claro/
License: MIT License
The proposals in #3 and #4 both hint at postprocessing steps for the engine. While an engine does implement IFn, it will no longer implement the Engine protocol if a function wrapper is put around it. So we might want to create wrap-pre-processor and wrap-post-processor functions that produce valid engines again.
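One way such a wrapper could preserve engine-ness is a record that implements both the protocol and IFn. A minimal sketch, where Engine, -run and wrap-post-processor are illustrative stand-ins for claro's actual engine protocol and the proposed function name:

```clojure
;; Hypothetical sketch only: `Engine`/`-run` stand in for claro's real
;; engine protocol; `wrap-post-processor` is the proposed function name.
(defprotocol Engine
  (-run [engine value] "Run resolution on `value`."))

;; A wrapper that post-processes results while still satisfying the
;; (assumed) Engine protocol *and* remaining callable like a function.
(defrecord PostProcessedEngine [engine f]
  Engine
  (-run [_ value]
    (f (-run engine value)))
  clojure.lang.IFn
  (invoke [this value]
    (-run this value)))

(defn wrap-post-processor
  [engine f]
  (->PostProcessedEngine engine f))
```

The same shape would work for wrap-pre-processor, applying f to the input before delegating.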
Instead of limiting the number of batches that can be resolved during one run we could introduce a per-resolvable-batch cost function and base a limit on it. This addresses the problem that batch count (or tree depth) has only limited usefulness due to every batch being treated equally, no matter the batch size, transformation complexity and expected latency.
For example, resolvables that just wrap other resolvables without side-effects do not contribute in any significant way to overall resolution time, so they could be treated as zero-cost operations.
I thus think we should introduce a Cost protocol, by default returning 1 for BatchedResolvable batches and the respective batch size for non-batched resolvables. This, by default, limits the number of I/O operations. Additionally, I'd add a PureResolvable marker protocol that gets assigned a cost value of 0.
This allows users to protect against too complex queries in a more fine-grained way. It might also be possible to use this information for static analysis purposes, assuming an additional schema layer on top of claro.
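The defaulting rules above could be sketched as follows; PureResolvable is the proposed marker, and BatchedResolvable here is a local stand-in for claro's protocol of the same name:

```clojure
;; Sketch of the proposed Cost protocol. `PureResolvable` and the
;; `BatchedResolvable` stand-in below are illustrative, not claro's API.
(defprotocol PureResolvable
  "Marker for resolvables that do no real work.")

(defprotocol BatchedResolvable
  "Local stand-in marker for claro's batched resolvables.")

(defprotocol Cost
  (cost [resolvable batch] "Cost contribution of resolving `batch`."))

(extend-protocol Cost
  Object
  (cost [resolvable batch]
    (cond
      ;; side-effect-free wrappers are free
      (satisfies? PureResolvable resolvable)    0
      ;; one I/O operation per batch
      (satisfies? BatchedResolvable resolvable) 1
      ;; otherwise: one I/O operation per element
      :else                                     (count batch))))
```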
Sometimes, we already know part of the resolution result without doing any work, e.g. IDs or some child resolvables:
(defrecord Person [id]
  data/Resolvable
  ...
  data/Transform
  (transform [_ result]
    (assoc result :friends (->Friends id))))
Since :friends only depends on the already known id, we don't really need to do any Person I/O if we want to apply the following projection:
{:friends [{:name projection/leaf}]}
We could introduce a PartialResult protocol, allowing us to expose such data:
(defrecord Person [id]
  data/PartialResult
  (partial-result [_]
    {:id id, :friends (->Friends id)})
  data/Resolvable
  ...
  data/Transform
  (transform [this result]
    (merge result (data/partial-result this))))
While it's possible to do some automatic merging of partial-result into the return value of transform, I fear that this might create some surprises if done implicitly. However, if we limit PartialResult to map values (which might be a reasonable thing to do), we can derive some very simple semantics à la: "If transform returns a non-nil value, merge the partial result into it."
Another point of note is the fact that, if a Person does not exist, just using the partial result will not expose that fact (although the friend list will be empty). But I guess that if users are made aware of this caveat it shouldn't be a significant problem.
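The merge rule above can be sketched directly. PartialResult here is a local stand-in for the proposed protocol, and finalize-transform is a hypothetical helper name:

```clojure
;; Illustrative semantics sketch; claro's real Transform/PartialResult
;; protocols may differ.
(defprotocol PartialResult
  (partial-result [this] "Statically known part of the result."))

(defn finalize-transform
  "If `transform` returned a non-nil value, merge the partial result
  into it; otherwise leave it untouched (i.e. nil)."
  [resolvable transformed]
  (if (some? transformed)
    (merge transformed (partial-result resolvable))
    transformed))
```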
Applying a projection to a seq whose elements are wrapped by one of claro's composition functions does not have the desired result, e.g.:
(claro.engine/run!!
  (claro.projection/apply
    [(claro.data.ops/then
       (reify claro.data/Resolvable
         (resolve! [_ _]
           {:a 1}))
       identity)]
    [{:a claro.projection/leaf}]))
Instead of [{:a 1}], this produces:
java.lang.IllegalArgumentException: projection template is a map but value is not.
template: {:a <leaf>}
value: :claro.data.tree.utils/this-should-not-happen
More of a question than an issue, really, but I would like to be able to specify that resolution should never fail due to the cost of a batch. It seems that :max-cost does not support any way to specify this (although maybe one of the Java infinities would work?).
Also, it's possible that wanting to do this is wrong in the first place. I would be interested to know more about how the cost concept is supposed to be used. In my case, the number of Resolvables is related to database rows in an x·n fashion (say x = 3 resolvables produced per database row, and n database rows), but since n is variable I could never fix a static :max-cost without counting the rows first.
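If the limit really has to scale with the data, one workaround could be to compute it per query instead of fixing it statically. A trivial sketch (names are illustrative, not part of claro):

```clojure
;; With x resolvables per row and n rows, total cost is x * n; a
;; headroom factor leaves room for auxiliary resolvables.
(defn dynamic-max-cost
  [resolvables-per-row row-count headroom-factor]
  (* resolvables-per-row row-count headroom-factor))
```

This still requires counting (or estimating) the rows first, which is exactly the inconvenience described above.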
The union projection, internally, creates a seq of the partial projections it wants to eventually merge. This means that the initial value will appear multiple times within the tree, causing the Mutation constraint of "there can only be one per resolution" to fail.
This could be fixed by allowing multiple identical mutations to appear within the tree (which makes the aforementioned constraint kinda useless) or by implementing a special union projection tree node.
(The latter might also address some performance concerns I'm having about the union projection.)
Hi Yannick,
I was having a quick look through Claro, and wondered if you have plans to open up the caching strategy (currently you're using an in-memory, transient hash-map I believe).
If I have a couple of JVMs fetching data it would be useful if they could both share the work they're doing via something like Redis or Memcached. Obviously, Clojurescript users in a browser environment won't be hitting up Redis, but in a backend service this could be quite useful.
Maybe a protocol so I can implement my own external caching strategy on top of Redis?
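Such a protocol could be quite small. A hypothetical sketch (none of these names are claro's API), with an atom-backed reference implementation matching the current in-memory behaviour; a Redis implementation would satisfy the same protocol:

```clojure
;; Hypothetical cache protocol: an engine could be handed any
;; implementation, including one backed by Redis or Memcached.
(defprotocol ResolutionCache
  (lookup [cache resolvable] "Return a cached result, or nil.")
  (store! [cache resolvable result] "Persist a resolved result."))

;; In-memory reference implementation backed by an atom.
(defrecord AtomCache [state]
  ResolutionCache
  (lookup [_ resolvable] (get @state resolvable))
  (store! [_ resolvable result] (swap! state assoc resolvable result)))
```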
Thanks for open sourcing Claro!
P.S. I guess with Onyx mentioning you in their docs you may get a few questions like mine.
case does class-based dispatch before resolution. A better name might thus be case-resolvable, freeing case for class-based dispatch after resolution. (Which, incidentally, is the behaviour of conditional fragments in GraphQL.)
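Post-resolution dispatch would look at the class of the resolved value rather than the resolvable. A minimal sketch of that dispatch logic (illustrative names only, not claro's API):

```clojure
;; Pick a projection template based on the class of an already
;; resolved value, with an :else fallback (GraphQL-fragment style).
(defn dispatch-by-result-class
  [resolved-value class->template]
  (or (get class->template (class resolved-value))
      (get class->template :else)))

;; Example result types.
(defrecord Dog [name])
(defrecord Cat [name])
```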
An engine middleware could use e.g. resilience4j to add circuit breaking to claro. This might require exposing which resolvables use which datasource, so that circuit breaking isn't limited to a per-resolvable basis (which could be the default, though).
As for all middlewares that require extra dependencies, I'd prefer a separate repository over integrating it into this one.
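The general shape of such a middleware could look like this; it assumes middlewares wrap a resolver fn of shape (fn [env batch] ...), and breaker-open? is a placeholder for e.g. a resilience4j circuit-breaker check:

```clojure
;; Sketch only: `breaker-open?` stands in for whatever circuit-breaker
;; library is used, keyed here by the batch's resolvable class.
(defn wrap-circuit-breaker
  [resolver breaker-open?]
  (fn [env batch]
    (let [batch-class (class (first batch))]
      (if (breaker-open? batch-class)
        (throw (ex-info "circuit open for datasource"
                        {:batch-class batch-class}))
        (resolver env batch)))))
```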
Claro should allow functionality akin to GraphQL's (proposed?) defer, stream and live directives [1], i.e. return incomplete results that get completed asynchronously. Ideally, this is offered by engine middlewares, but there are some things to consider.
Access to the full Engine
To let a middleware run resolution, it needs access to the full engine, i.e. one that also includes all middlewares on top of the current one. This could be achieved by exposing a dynamic binding or lookup function (e.g. claro.engine/current) to the resolver, or by (conditionally) injecting the engine into the environment using a well-known key (e.g. :claro/engine).
Dedicated Value Types + Projection
Resolvable parts have to be "marked" as deferred or to-stream, e.g. by wrapping them in dedicated defer/stream records. This can also be elegantly done using projections:
{:id projection/leaf
:name projection/leaf
:friends (projection/defer [{:name projection/leaf}])}
For stream resolvables it's probably necessary for them to implement explicit streaming functionality.
Push Mechanism
A callback mechanism has to be used to deliver the results of asynchronous resolution. Possible parameters for such a function could be:
Race Conditions
Deferred values can be nested, so one has to be careful to only push nested results once the upper level has been finalised.
Batching
Deferred values of the same class should be resolved in batches if possible (i.e. if they implement BatchedResolvable), or individually if not.
Keeping these points in mind, deferred resolution is most likely a multi-stage process:
(deferredID, resolvable). This requires wrapping the full engine, though, not only the resolver part. The result has to be inspected and further actions have to be initiated.
While it's already possible to use e.g. Manifold's d/catch to react to errors, it might make sense to introduce an error value that can be produced by Resolvables.
(defrecord ProfilePictures [id]
  data/Resolvable
  (resolve! [_ {{:keys [user-id]} :auth}]
    (d/future
      (if (not= user-id id)
        (data/error "cannot access profile pictures of other users.")
        ...))))
This way, we could return a partial tree with error leaves (and let the client handle them) or have a postprocessing step that collects all errors and exposes them at the top-level or within the affected subtrees.
(We might need to make projections error-aware, though, so they don't complain about error/leaf values when expecting a nested one.)
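The top-level error-collection step mentioned above could be a simple tree walk. A sketch, assuming errors are represented as maps containing a :claro/error key (a placeholder for whatever data/error actually produces):

```clojure
;; Postprocessing sketch: walk a resolved tree and collect error leaves.
;; `:claro/error` is an assumed marker key, not claro's real one.
(require '[clojure.walk :as walk])

(defn error-leaf? [x]
  (and (map? x) (contains? x :claro/error)))

(defn collect-errors
  "Return all error leaves found anywhere in `tree`."
  [tree]
  (let [errors (atom [])]
    (walk/postwalk
      (fn [x]
        (when (error-leaf? x)
          (swap! errors conj x))
        x)
      tree)
    @errors))
```

The same walk could also attach the path of each error, which would make exposing them per affected subtree straightforward.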
What implementation of deferred values shall be used in ClojureScript?
Possible Candidates:
Common, implementation-independent parts of claro.engine.runtime should be extracted into claro.runtime, with claro.engine instantiating Clojure/ClojureScript engines using reader conditionals.
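The platform split could then live in a single .cljc constructor. A purely illustrative sketch (claro.runtime/engine and the :deferred-impl keys are assumed names, not existing API):

```clojure
;; Illustrative .cljc sketch: claro.engine picks a platform-appropriate
;; deferred implementation, delegating the shared resolution loop to
;; claro.runtime. All names and option keys here are assumptions.
(defn build-engine []
  #?(:clj  (claro.runtime/engine {:deferred-impl :manifold})
     :cljs (claro.runtime/engine {:deferred-impl :core.async})))
```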