
cashay's Introduction

Hi there 👋

I'm Matt. I build things to help people work together.

Here are a few of my favorite things to geek out on:

  • 🤖 Machine Learning / Computer Vision / LLMs
  • 🌎 Distributed/Decentralized systems like WebRTC & bittorrent. Not crypto.
  • 📝 Collaborative editing (OTs, CRDTs, everything in between)
  • 🧑‍💻 Single Page Apps (React, GraphQL)


cashay's People

Contributors

dustinfarris, helfer, jordanh, mattkrick, rattrayalex, thatneat, theobat


cashay's Issues

Write example recipe for SSR

This is something I wish I had asked you about when we were laptop-to-laptop last week.

A universal app has two stores, one for the SSR and one for the client (which can be delivered to the client from the SSR). Following the pattern from Meatier and Action, these stores are created in:

  • src/server/createSSR.js
  • src/client/client.js

For testing purposes, our application is currently pulling in the Cashay singleton from src/client/client.js into its universal components. This does Bad Things™ during a production build (npm run build:server).

What's the recommended pattern here?

  1. Should Cashay be eliminated from universal routes (ick) as it's really a client cache?
  2. Should the Cashay singleton be moved into src/universal/... and should it import the appropriate store (exported by src/server/createSSR.js or src/client/client.js) depending on which context it's running in? (see the sketch after this list)
  3. Something else?
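A rough sketch of option 2, assuming a hypothetical src/universal/cashay.js module and that both createSSR.js and client.js export their store (the Cashay constructor & options shown are illustrative, not the actual API):

// src/universal/cashay.js (hypothetical)
import {Cashay} from 'cashay';

// pick the store based on where the bundle is running
const store = typeof window === 'undefined'
  ? require('../server/createSSR').store // SSR store
  : require('../client/client').store;   // client store

export const cashay = new Cashay({store}); // constructor options are illustrative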

pending queries

Going along with #7, query pruning should include a union of the store + pending queries.
This needs more thought.

As a first idea, we could treat a fetch like an optimistic update. When a fetch request is made, we optimistically update the store with what we expect back. We can gather this info from the queryString.schema once we add scalars (and non-entity objects) to it.

If the fetch fails, we roll back the store & then that same query can try again.

Cache mutation strings on activeComponents + arg keys (currently just activeComponents)

Currently, to be more performant, if you pass in more arg keys the second time you call mutate, it won't handle arg definitions for the new ones. The solution is to pass in all args, even the ones that don't have a value (eg options = {variables: {foo: 1, bar: undefined}}). This works, and it's marginally more performant and less memory intensive. By caching on the unique combination of activeComponents + arg keys, it makes for a nicer API, at the expense of a little performance.

If performance becomes a problem, we could add a bypass option, but I can't imagine this is the reason why someone doesn't hit 60fps.

query aggregation

Relay has its own language, parser, printer, and containers to handle query aggregation.

If I cache every queryString that comes through in a pending bucket like #8, I could reduce the required payload, but I'd still get a few queries that just request a field or 2.

To keep it out of the react ecosystem, I could do 2 things:

  • have a batchDelay that takes an int, meaning that only 1 query is sent every x ms. This could be stored on the Cashay instance or in the query options (sketched after this list).
  • have a rollCall that takes a number or array of component names that is set as a query option. Every child component would have a rollCallResponse flag. When every component is accounted for, the query is sent to the server. This could eliminate a ton of boilerplate, but requires a parent to know about all the children under it & is prone to user error since that number would have to be changed whenever a component was added. walking may be the only robust solution here to ensure a single http call. But, that's only important for http stuff, websockets don't need aggregation.
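A rough sketch of the batchDelay idea (the transport.send call and its batch payload are assumptions, not Cashay's actual transport API):

// collect queryStrings for batchDelay ms, then send them in a single request
const createBatcher = (transport, batchDelay = 50) => {
  let pending = [];
  let timer = null;

  const flush = () => {
    const batch = pending;
    pending = [];
    timer = null;
    transport.send(batch); // assumed API: one HTTP call for the whole batch
  };

  return (queryString, variables) => {
    pending.push({queryString, variables});
    if (!timer) timer = setTimeout(flush, batchDelay);
  };
};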

denormalize?

Assuming the general way to grab data is through a mapStateToProps, I wonder if it's better to denormalize the result so folks can get the data just like they asked for it.

Denorming would be a performance hit, since components will refresh when anything inside what was denormed changes. I'd also have to bring in something like reselect since it's an expensive op.

I don't think it's unreasonable to have devs access a nested item directly. It'd probably make their code a lot cleaner than Relay's.

Again, the trick is pagination.

If a query asks for the first 10 after cursor xyz, should the dev request the whole array?

For window pagination, they'd only want 10, so I'd have to write a function that only returns an array of <= 10 items.

For infinite scrolling pagination, they'd want to pull the whole array & append the new things.

This might just be left up to the dev to decide.
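If we did go the denormalize-in-mapStateToProps route, a sketch of that memoization using reselect (the denormalize function & module path here are hypothetical):

import {createSelector} from 'reselect';
import {denormalize} from './denormalize'; // hypothetical module

// only recompute the denormalized result when the cashay branch of the store changes
const selectCashayData = state => state.cashay.data;

const makeDenormalizedQuerySelector = (queryAST, variables) => createSelector(
  [selectCashayData],
  cashayData => denormalize(cashayData, queryAST, variables)
);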

Establish prior art

Start chats with @gyzerok. Even though https://github.com/gyzerok/adrenaline has a different approach, the end goal is more or less the same.

Read relay source code wrt how data is stored: https://github.com/facebook/relay/tree/master/src/store

Stay up to date on http://hueypetersen.com/, it's the best blog wrt this stuff... even if @eyston ignores my emails 😭

Chat with @arunoda regarding future plans for https://github.com/kadirahq/lokka. The store is different, but the API is very similar.

Feel free to share & discuss any issues regarding GraphQL client caches here.

RFC: GraphQL Error handling

There are 3 groups of errors we can get when we talk to the GraphQL server:

  1. Fetch error: The url isn't found, or the GraphQL endpoint requires an authToken & the authToken is bad, etc. Basically, can't even call graphql.
  2. Parse/validation error: The query string was malformed, or the wrong/incomplete variables were sent to the server, etc.
  3. User-defined field error: The user tried to access a field but didn't have the right authorization, or there was a database error, etc. For example, they tried to execute createUser but the user already existed.

Cashay needs to handle all 3 types, but also allow for a variety of back-ends. For all 3 types, all errors should eventually end up in the redux store, which will always be an object to allow for a wide range of user-defined errors. Doing so gets rid of the serialization complexity of Error types. For example:

store.getState().cashay.error = {
  _error: "Failed Login",
  password: "Incorrect Password"
};

For type 1, Cashay will look at the HTTP status code (probably 400, 401, or 403) & create a default string. Eg

store.getState().cashay.error = {
  httpError: "401",
  message: "Not authenticated"
};

For type 2 & 3, the error comes from GraphQL, so it will return an errors array. Since type 2 cannot be controlled by the back-end developer (eg not thrown within a resolve function), it'll be a true errors array. Otherwise, the resolve function could throw an error. To determine if the error is from GraphQL or user-defined, I like to do something simple like errors[0].indexOf('{"_error"') === -1. If it is user-defined, we set the entire cashayDataState.error to the parsed error message. Otherwise, we set it to: {errors: []} and the user can do whatever they want with those. Of course, this is just the default behavior. If they want to do something different, they can write a custom errorHandler in HTTPTransport.
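A sketch of that default behavior (illustrative only, not the actual implementation; errors is whatever the GraphQL response carried in its errors array):

const defaultErrorHandler = errors => {
  const first = typeof errors[0] === 'string' ? errors[0] : errors[0].message;
  if (first.indexOf('{"_error"') !== -1) {
    // user-defined error thrown from a resolve function as a JSON string
    return JSON.parse(first);
  }
  // parse/validation errors the back-end dev can't control
  return {errors};
};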

Generated pagination queries do not include existing variable definitions

// Initial query sent
const flightsQuery = `
  query (
    $fromDate: String!
    $toDate: String!
    $first: Int
  ) {
    flights(
      fromDate: $fromDate
      toDate: $toDate
      first: $first
    ) {
      id
      cursor
      status
    }
  }
`

// Load more button handler
const loadMore = () => setVariables(currentVariables => ({
  ...currentVariables,
  first: currentVariables.first + 10,
}))

// Follow-up query (generated): note that $fromDate and $toDate are still used, but their variable definitions were dropped
{
  flights(
    first: 20, 
    fromDate: $fromDate, 
    toDate: $toDate, 
    after: "sk884-osl-arn-20160608"
  ) {
    id
    cursor
    status
  }
}

Support Directives

Looks really easy, but I've just never used them, so I don't know how to tear them apart.

do we want to set item / memory limits on the cache?

For example, if I'm using cashay in a mobile-web app, I may not want the cache to grow and grow forever.

Do we want some kind of LRU expiry scheme?

Do we want to allow developers to set an expiry time on a query / item?
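If we went the LRU route, a minimal sketch of the bookkeeping (illustrative only; a Map iterates in insertion order, so deleting & re-setting a key on read keeps recency):

class LRUCache {
  constructor(maxItems = 500) {
    this.maxItems = maxItems;
    this.map = new Map();
  }
  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key);
    this.map.set(key, value); // mark as most recently used
    return value;
  }
  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxItems) {
      // evict the least recently used entry (first key in iteration order)
      this.map.delete(this.map.keys().next().value);
    }
  }
}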

Roadmap to Alpha

What is in alpha:

  • GraphQL schema
  • API
    • cashay.query
    • cashay.add
    • cashay.update
    • cashay.delete
    • cashay.subscribe
  • State.cashay
    • Entities
    • Results
    • Denormalized queries
  • Cashay singleton
    • Dependencies on Entity[Collection] & Dependency on those objects
    • Default transport that can be overridden by method options

What's not in alpha

  • Pagination support (although the store will be able to hold this correctly)
  • Cashay Compose (multiple graphql endpoints in a single query)

Support file uploads

I've never uploaded a file via graphQL, so if someone else wants to take a crack at this, have at it. Otherwise, it'll be low priority until the need comes up in a sprint for work.

Webpack client schema loader

During development it would be very nice if one could load the client schema using a webpack loader:

const clientSchema = require('schema!./schema');

Step #1: Normalize: the spec

The end result of normalization should be 2 objects: result (what relay calls rootCallMap) and entities. An entity is defined as an object that contains an id (#3 may eliminate this constraint in the future, as well as request an id when available). This is similar to normalizr.

Each entity represents a concrete GraphQL Type (excluding abstract types like Interface and Union).
Each entity stores data dependent on the arguments allowed. Those arguments fall into 2 categories:

  • paginationArgs: before, after, first, last (skip and limit are reserved as they may come later). This may be modified by passing in a paginationWords object in the options that maps the word used to the meaning, making pagination possible for any 3rd party graphQL endpoint. For example, if a 3rd party graphQL endpoint is in spanish:
options.paginationWords = {
  before: 'antes',
  after: 'despues',
  first: 'primero',
  last: 'ultimo'
}
  • regularArgs: everything else

If the TypeDef (from the introspected schema) allows for arguments, the entity must allow for any combination by using a sorted, stringified regularArgs as a unique identifying object key.
If the TypeDef does not include pagination keywords, each object key will have a value that is the scalar, object, array, or id reference to the containing object.
If the TypeDef does include pagination keywords, then the object key will contain up to 2 of the 3 array names front, back, full.

For example, assume a Post has 1 author, a language-dependent title and many comments.
A query allPosts takes an optional published boolean argument. The field title takes a language argument. The comments field handles a deleted argument as well as backwards and forwards pagination, and therefore accepts any of the paginationWords. Merging normalized responses may yield the example shown below.

The benefits:

  • Works with ANY graphQL endpoint (no relay-specific endpoint required)
  • Can request less data
  • Faster lookup times
  • Immutable structure with invalidations triggered only when the entity Type changes
normalizedData = {
  entities: {
    Post: {
      '0': {
        id: '0', // a field without args can return a scalar
        author: '3', // if the TypeDef exists in `entities`, a scalar is interpreted as a lookup
        date: { // a field without args can also return an object
          date: 'Feb sizzecond de 2016',
          time: '10:22 PM'
        },
        title: { // a field with args returns an object with every arg combination tried
          "{language: 'english'}": 'How to dance like a tool',
          "{language: 'spanish'}": 'Como bailar bien pendejo'
        },
        comments: { // a field with regularArgs and paginationArgs: first layer is regularArgs
          "{deleted: true}": { // the 2nd layer is an array based on whether `first` or `last` is used
            front: ['10','11'], // here is the result of a query calling {first: 2, after: null}
            back: ['12'], // here a query called for {last: 1, before: null}
            // if the 2 arrays merge, eg {last: 2, before: '12'} the front and back are removed in favor of "full"
            // full: ['10','11','12'] 
          }
        }

      }
    },
    Author: {
      '3': {
        name: 'Paul Jones'
      }
    },
    Comment: {
      '10': {
        id: '10',
        content: 'Great!',
        __cursor: { // a list of cursors is stored on the item
          "{deleted: true}": '10', // each type + arg combo must have a shared cursor
          "{deleted: true, orderBy: 'timestamp'}": '12376663' // cursor from other query (not shown)
        }
      },
    }
  },
  result: {
    allPosts: {
      "": ['0', '1', '2'], // no args given
      "{published: true}": ['0'],
      "{published: false}": ['1', '2']
    }
  }
}

Consider making `cashay.mutation` return a promise

It's possible that we may want to trigger a side effect after a mutation returns.
Normally, I'd suggest listening for the difference in the redux state tree, but a mutation doesn't guarantee a change to the redux state. So, there's no real way to know if the server call ever returned. However, it does guarantee a server call. So, if cashay.mutate returns a promise, we can do fun things like redirect, or write to localStorage, etc.

This flexibility gets dangerous since folks would be able to write anti-patterns (eg query after a mutation completes instead of in the mSTP) but that's a documentation problem.

If we accept that a promise has to be returned, should a GraphQL error be a promise rejection?

Side note: the promise will probably have to resolve before calling dispatch since the dispatch will likely re-render the component it was called from.
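What that might look like for the caller (the mutate signature & the resolved value are assumptions for illustration):

cashay.mutate('createPost', {variables: {title: 'Hello'}})
  .then(({data, error}) => {
    if (error) return; // or the promise rejects, depending on the question above
    // side effects that should only run after the server responds:
    localStorage.setItem('lastPostId', data.createPost.id);
    router.push(`/post/${data.createPost.id}`); // `router` is hypothetical
  });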

throw error if `queryString` includes a cursor

Cashay uses cursor-based pagination _and assumes that you start from the beginning or the end of an array of documents._ This has a ton of benefits:

  • We can detect BOF, EOF, which is equivalent to hasNextPage, hasPreviousPage.
  • It's a best practice
    • there are times you might need an array starting at a certain index. For example, you search for a string in a slack chat. The result is an array that has 10 chat lines above & 10 chat lines below. The way to accomplish this would be to perform the search on the server & return array.slice(n - 10, n + 10).
  • it makes mutationHandlers much more efficient. That's because a paginated array in a query is a subset of the array stored in the redux state. For example, you may have a query that wants the first 3 docs, but your local state has the first 10 docs. So, when your mutationHandler receives [1,2,3] and mutates it (eg deletes one doc and adds one, so the result is [1,3,4]), I can simply splice out the old & splice in the new. This is very difficult if the array doesn't start at the beginning or end (that's because I can't determine the start & end index based solely on the length of the array in the query).

For these reasons, when initializing a query, if before or after are present, an error should be thrown. Forcing best practices: it's a good thing!
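A rough sketch of that check (names are illustrative): walk the parsed queryString and throw if any field supplies a reserved cursor arg.

import {visit} from 'graphql';

const assertNoCursorArgs = (queryAST, paginationWords = {before: 'before', after: 'after'}) => {
  const reserved = new Set([paginationWords.before, paginationWords.after]);
  visit(queryAST, {
    Argument(node) {
      if (reserved.has(node.name.value)) {
        throw new Error(`Cashay queries must not supply "${node.name.value}"; paginate from the front or back of the array instead.`);
      }
    }
  });
};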

Cashay

Hi!

When moving Action over to Cashay, we noticed updateSchema.js never exits. We're using rethinkdbdash and it (annoyingly) creates a connection pool we must drain before it will allow the node process to exit. Could we have updateSchema.js call a callback when it's finished with its work? A la:

From:

$ ./node_modules/.bin/babel-node node_modules/cashay/updateSchema.js src/server/graphql/rootSchema.js build/graphqlSchema.json

To:

$ ./node_modules/.bin/babel-node node_modules/cashay/updateSchema.js src/server/database/rethinkExit.js src/server/graphql/rootSchema.js build/graphqlSchema.json 

Or similar?
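For what it's worth, the exit module could be as small as this (the path & callback shape are assumptions; rethinkdbdash exposes the pool via getPoolMaster()):

// src/server/database/rethinkExit.js (hypothetical)
const r = require('./rethinkDriver'); // wherever the shared rethinkdbdash instance lives

// updateSchema.js would call this once it has written the schema JSON
module.exports = function onComplete(done) {
  r.getPoolMaster().drain().then(done);
};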

Denormalized responses

Currently, deps & denormalized responses are stored in the Cashay singleton instead of the state store. This has the following benefits:

  • Can persist the store in local storage: denormalized responses use a variable object (memory reference) as a key, and stringifying that to local storage would serialize the key into something useless, meaning the reference would be dead. Keeping them out of the store avoids that.
  • denormalized data is a calculation. It is derived from state, it isn't state. So, fundamentally, it probably doesn't belong there.
  • fast! no unnecessary dispatches

It has the following cons:

  • time traveling (or undoing) a mutation will cause a query to point to a denormalized structure that isn't correct. that's because both the denormalized Response & the normalized data are shifted +1. The time travel would cause a -1 to the normalized data, but not the denormalized Response. I can't for the life of me think of a real world scenario where this would be bad. BUT, if one is found, it'd be easy enough to write a plugin that also saves a history of denormalizedResponses.

Step #1: Normalize: game plan

Normalize a GraphQL response

  • Inputs:
    • queryAST (parsed query string)
    • clientSchema (introspected schema)
    • variableValues
    • paginationWords
    • GraphQL response
  • Outputs:
    • normalizedResponse

Merge normalizedResponse into store

  • Inputs:
    • normalizedResponse
    • store.getState().cashay.data
  • Outputs:
    • store.getState().cashay.data
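As plain function signatures, the two steps might look roughly like this (skeletons only, not the real implementation):

const normalizeResponse = (graphQLResponse, {queryAST, clientSchema, variableValues, paginationWords}) => {
  const normalizedResponse = {entities: {}, result: {}};
  // walk graphQLResponse guided by queryAST + clientSchema, filling in entities & result
  return normalizedResponse;
};

const mergeStore = (cashayDataState, normalizedResponse) => {
  // shallow merge shown; the real merge is deep & has to respect front/back/full pagination arrays
  return {
    entities: {...cashayDataState.entities, ...normalizedResponse.entities},
    result: {...cashayDataState.result, ...normalizedResponse.result}
  };
};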

Implement `NON_NULL` and `INTERFACE` schema kinds

I'm currently evaluating a few alternatives to Relay and I wanted to see if I could drop cashay in and get a feel for it. Right now it doesn't handle NON_NULL and INTERFACE kinds properly.

I'm talking about this function:

const visit = (subState, reqAST, subSchema, context) => {

From the GraphQL docs about NON_NULL:

// http://graphql.org/docs/api-reference-type-system/#graphqlnonnull
A non-null is a kind of type marker, a wrapping type which points to
another type. Non-null types enforce that their values are never null
and can ensure an error is raised if this ever occurs during a
request. It is useful for fields which you can make a strong guarantee
on non-nullability, for example usually the id field of a database row
will never be null.

And here's the thing for INTERFACE:

When a field can return one of a heterogeneous set of types, an
Interface type is used to describe what types are possible, what fields
are in common across all types, as well as a function to determine 
which type is actually used when the field is resolved.

Step #2: Mutations: Spec

query takes care of the R in CRUD. For the rest, we need mutations.
For that, cashay will have 3 methods: add, update, delete.

cashay.add(mutationString, options)
Given a standard graphQL request string, it's pretty easy to add to the client cache. Just look up the Type and put it in that bin. The difficult part is how to update queries that reference the object.
For example, grabbing the 10 most recent Posts. If I add a new post, I may want to prepend it to the array if I have infinite scrolling, or I may want to ignore it if I use pages with a fixed count. If I sort by something like reputation, I may even want to inject it in the middle (currently impossible with relay). Additionally, I may have something like a sidebar of "top 5 posts by rep" and a main area of "top 5 most recent posts". So, I'll need an option that can programmatically place an item based on the current array and the item.

options.affectArray = {
  "": () => 0,
  "{orderBy: 'reputation'}": (array, item) => {
    const placeBefore = array.findIndex(obj => obj.reputation < item.reputation);
    // splice mutates & returns the removed items, so splice a copy & return the copy
    const nextArray = array.slice();
    nextArray.splice(placeBefore, 0, item);
    return nextArray;
  }
}

options.affectObject = {
  "": (currentObject, item) => currentObject.rep > item.rep ? currentObject : item,
}

This solves the same problem as relay's getConfigs.rangeBehaviors, except we derive the Type info from the clientSchema and a graph data structure isn't required. This makes the assumption that (Type, args) => <Array>. If I had hardcoded queries like get5RecentPosts and get10ReputablePosts, which don't have args and both return a List of Posts, then I'd be forced to apply the same function to each. This is OK since hardcoding the limit and orderBy is not a best practice. As a workaround, such queries could add another argument to each. The arguments in affectArray don't need to include all args, but it must be a subset to call the callback.

It'd be possible to handle objects, too, but the user-defined callback would have to return the single object instead of an array (eg getMostPopularPost). Since it's possible an object and an array could share arguments, this change in the API warrants its own option to keep things very clean and unambiguous.

cashay.update(mutationsQuery, options)
Updates are easy. The mutation lookup in the clientSchema tells us the collection, and the id is provided by the user.

cashay.delete(mutationsQuery, options)
Deleting things is the hardest, because you can delete an item & all of its dependents/associations (what relay calls NODE_DELETE) or a single item & leave the associations (RANGE_DELETE). Relay's example of RANGE_DELETE is to remove a tag from an item. Unless I'm missing something, this would be considered an update given my current normalization schema (#13). If I wanted to remove a tag from a todo, I'd just write a removeTagMutation and call that, which would return the item without the tag (and I could minimize that query before sending it).
If I wanted to remove a tag, which would trigger a removal from multiple todos, then I would delete Tag:id, and then traverse the tree removing any reference to Tag:id. Optionally, if the clientSchema shows that Tag:id is mandatory, then I could remove the parent, too, which would trigger a recursive check & delete.
Afterwards, probably within a setImmediate, I'd need to garbage collect the orphaned leaf nodes.

Finally, removing an item from an array that allows length of 0 is fine. But removing an item from a query that returns a single object (eg getMostPopularPost) would not be OK to leave blank. So, just like add, I'll need something like:

options.affectObject = {
  "": (objectStore) => Object.values(objectStore).reduce((reduction, item) => item.rep > reduction.rep ? item : reduction)
}

OPTIMISTIC UI
Again, this will be broken out into 3 different actions.
To add something, all we need is item and either an array + index, or object.

Creating the item:
Ideally, the entire item is sent via a variable, but this isn't always the case, since creating the item might require data that is only on the server. Therefore, I think the easiest thing for add is to have an options.optimisticItem. This could take either the object itself, or a function to create the object. Since it'll only run once per batch of variables, it'll be simpler to accept an object (or an executed function that returns an object). For example:

options.optimisticItem = {
  title: 'Hello',
  id: '123',
}; 

Instead of handling this in a global redux middleware, each CashayTransport will have its own optimistic handler that assigns a transactionID.

// execute the optimistic update
next(Object.assign({}, action, {meta: {optimistic: {type: BEGIN, id: transactionID}}}));
const socket = socketCluster.connect(socketOptions);
socket.emit('graphql', payload, error => {
  next({
    type: type + (error ? _ERROR : _SUCCESS),
    error,
    payload,
    meta: {
      synced: true,
      optimistic: error ? {type: REVERT, id: transactionID} : {type: COMMIT, id: transactionID}
    }
  });
});

Dependency handling

No matter how you slice it, you'll always use about O(2n) space for a client cache. If queries were calculated on every dispatch, it'd be n + denormalized queries subscribed to from redux. If queries are cached/memoized, it'd be n + denormalized queries subscribed to from redux + stale queries + dependencies. The goal is to minimize the memory footprint, so first minimize stale queries and then dependencies.

Dependencies could be at the state level (ie what redux does), the collection level (eg Posts), the entity level (eg Post.123), or the property level (Post.123.updatedAt). If it's not at the property level, it's a heuristic & you might be invalidating when not necessary. However, the property level would take up more memory & more computations to maintain the dependencies. So, I opt for the entity level as a nice compromise. Queries are only invalidated when the objects they care about are invalidated.

There are 2 types of dependencies, normalized and denormalized. They are easily understood as an M:N relationship:

// 1 response depends on many normal objects
// 1 normal object is depended on by many responses

// dependency granularity is on the entity level, not the prop level (eg if dependencyKey1 doesn't use updatedAt
// and updatedAt gets changed, then dependencyKey1 is still invalidated)

//option #1: like a sql DB, then run dbIsh.filter to get what you need
const dbIsh = [
  {normalizedLocation: 'Post.123', denormalizedResponse: dependencyKey1},
  {normalizedLocation: 'Post.123', denormalizedResponse: dependencyKey2},
  {normalizedLocation: 'Post.124', denormalizedResponse: dependencyKey1},
];

//option #2: 2 different objs
const normalizedDeps = {
  dependencyKey1: Set('Post.123', 'Post.124'),
  dependencyKey2: Set('Post.123')
}

const denormalizedDeps = {
  Post: {
    123: Set(dependencyKey1, dependencyKey2),
    124: Set(dependencyKey1)
  }
};

//option #3 store denormalizedDeps, calc normalizedDeps on the fly

//option #4 store normalizedDeps, calc denormalizedDeps on the fly

denormalizedDeps are used in the async _queryServer method. When new data comes back, it needs to be normalized & then that new normalized data needs to be compared to the old, possibly stale, normalized data in the state tree. If the new data is different, then the list of dependencyKeys needs to be added to a flush set.

normalizedDeps are needed because it's possible that the updated response no longer depends on a part of the normalized state. An example might be when a mutation adds a new high score. The getTop5HighScores response now no longer depends on the 5th object because it got bumped. Unfortunately, determining what happens is really hard because it's in multiple user-defined functions & the denormalized query is mutated (a new deep copy isn't created, for performance). So, the new, mutated object is normalized & deps are created. Then, the old normalizedDeps are compared to the new deps. For items that only exist on the old object, we go into denormalizedDeps and remove them. For items that only exist on the new object, we go into denormalizedDeps and add them.

Option 1 looks decent because it minimizes the amount of computation and memory, although the native filter function is dead slow & any filter has to run at O(n).

Option 2 trades memory for computation speed since it won't ever have to run a filter. JavaScript makes no guarantees about object lookup performance, but it's usually between O(1) and O(log(n)), so it sure beats a filter.

Option 3 is like option 2 except the normalized deps are calculated on the fly & then GC'd away by the JS engine. Unfortunately, calculating normalized deps requires walking the entire state tree, which gets really expensive, although if mutations are infrequent, not terrible.

Option 4 is like option 2 except denormalized deps are calculated on the fly. That means walking every normalized dep. The normalized dep tree should be much smaller than the state tree, but then the question arises of how to garbage collect old query + variable combos.
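For option #2, the bookkeeping described above might look roughly like this (names are illustrative): after a mutation, diff the old & new normalized deps for one denormalized response and patch denormalizedDeps accordingly.

const updateDeps = (dependencyKey, oldDeps, newDeps, denormalizedDeps) => {
  // oldDeps & newDeps are Sets of 'Type.id' strings
  for (const loc of oldDeps) {
    if (!newDeps.has(loc)) {
      const [type, id] = loc.split('.');
      denormalizedDeps[type][id].delete(dependencyKey); // the response no longer depends on it
    }
  }
  for (const loc of newDeps) {
    if (!oldDeps.has(loc)) {
      const [type, id] = loc.split('.');
      denormalizedDeps[type] = denormalizedDeps[type] || {};
      denormalizedDeps[type][id] = denormalizedDeps[type][id] || new Set();
      denormalizedDeps[type][id].add(dependencyKey);
    }
  }
};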

turn state to getState() in mutationHandler

The redux store isn't easily navigable for humans. For example, if a field has no args, you get the value. If it does have args, the values are stored in an object where the keys are JSON-serialized strings of the variables. If it's an array with pagination, it's even trickier (front, back, full, no BOF, EOF).

But since this is such a rare param to use in mutationHandler (eg next best from local store), it doesn't make sense to walk the whole freaking state if they're not going to use it. For example, if I have a query getTop5Posts that was mutated by a mutation that deleted the top post, then I'd want to find the next best local post to put in the state. If karma has no args, then I'd just do state.Posts.forEach(post => post.karma > highScore..., but that doesn't work if karma has args.

Creating a human-readable state is a function like humanReadableState = (state, schema, variables).
Ideally someone can just call getState() to get a human-readable state, so we'll pass in a factory function that has state, schema, and variables in the closure, and they won't have to pass anything in.
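A sketch of that factory (humanReadableState is the function described above; the rest of the names are illustrative):

const makeGetState = (state, schema, variables) => {
  let cached;
  return () => {
    // only walk the state tree if the mutationHandler actually asks for it
    if (!cached) cached = humanReadableState(state, schema, variables);
    return cached;
  };
};

// the result is passed into the mutationHandler as `getState`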

Handle errors from mutations

Currently, errors are handled when they come back from queries, but not from mutations.
We need to change that!

RFC: option to only run certain queries in SSR

We could add something like cashay.query(queryString, {runOnSSR: false}) which would allow users to only run certain queries when rendering the HTML on the server.

My question is, would this ever be useful? Personally, I want to run every query on the server except those queries that are behind a login wall, but SSR doesn't see those anyways, so it's of no concern.

If you can think of a use case, let me know!

investigate webworkers

Investigate whether it makes sense to use a webworker on a per-reducer basis (cashay) or whether it'd be better to just instruct the dev to build their entire store in a webworker.

pagination & pendingData

hasNextPage is really useful, but I don't like how the relay spec defines it. I think hasNextPage should simply answer if another document is present in the local cache after the provided cursor. By doing so, a user can go to their server & send down 6 documents instead of 5. When cashay receives those 6 documents, it puts them all in the state, but only gives the component the 5 it requested (along with a hasNextPage = true, if that was a desired field). Then, since we know there's at least 1 more local document, the "get more" button is still there in the view layer. When it's clicked, cashay receives a request for 5 documents after cursor 5. Cashay tries to get more documents locally, but only finds 1 document (6). Cashay modifies the request to fetch 4 documents (7,8,9,10) after cursor 6. Then the user's server will return 5 docs, thus repeating the cycle.

The benefit here is that, if the user so chooses, they can get an extra document. By doing so, that document gets returned immediately without a server fetch. They can animate it in while the other documents return, making for a reduced perceived latency.

If the user is a stickler about bytes & really doesn't want to send that extra doc down the wire, they can send an undefined.
Then, Cashay will store that undefined in the array. When the next request comes in, it'll ignore undefineds.

Pending data will probably store those fetches in something like this:

entities: {
  Post: {
    123: {
      // scalar
      updatedAt: null,
      // indices start at cursor index + 1
      // that way, if 2 requests come in with different cursors, we can quickly determine what we got
      // if there are no pagination args, it's an empty 'full' array
      // else, it's an array with COUNT numbers. If a cursor was provided, the cursor is found in the current array
      // and the indices are cursorIdx + 1 + n
      // if it's from the back, they'd be cursorIdx - 1 - n
      comments: {
        front: [8, 9, 10, 11, 12]
      },
      // scalar with argument
      title: {
        "{language: 'spanish'}": null
      }
    }
  }
},
result: {
  getPost: {
    "{language: 'german'}": null
  },
  getAllPosts: {
    "{language: 'korean'}": {
      // if a request for front or back comes in, full will override it
      full: []
    }
  }
}

Store denormalization failing to resolve object connections

// List page query
const flightsQuery = `
  query (
    $fromDate: String!
    $toDate: String!
    $first: Int
  ) {
    flights(
      fromDate: $fromDate
      toDate: $toDate
      first: $first
    ) {
      id
      cursor
      status
    }
  }
`

// Detail page query
const flightQuery = `
  query (
    $id: ID
  ) {
    flight(id: $id) {
      id
      departure {
        id
        airportName
      }
      arrival {
        id
        airportName
      }
    }
  }
`

// Store contents
"entities": {
  "Flight": {
    "dy310-osl-tos-20160608": {
      "id": "dy310-osl-tos-20160608",
      "cursor": "dy310-osl-tos-20160608",
      "flightStatus": null,
      "departure": "DepartureStatus:OSL2016-06-08T07:20:00Z",
      "arrival": "ArrivalStatus:TOS2016-06-08T10:10:00Z"
    },
  },
  "DepartureStatus": {
    "OSL2016-06-08T07:20:00Z": {
      "id": "OSL2016-06-08T07:20:00Z",
      "airportName": "Oslo"
    }
  },
  "ArrivalStatus": {
    "TOS2016-06-08T10:10:00Z": {
      "id": "TOS2016-06-08T10:10:00Z",
      "airportName": "Tromsø"
    }
  }
}

Attempting to read `data.flight.arrival.airportName` returns `null`

Use introspection instead of or in addition to a schema file

Instead of reading a file, it'd be handy if you could provide a graphql endpoint URL instead.

import { introspectionQuery, buildClientSchema } from 'graphql/utilities';

fetch(graphqlURL, {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({query: introspectionQuery})
})
  .then(response => response.json())
  .then(({data}) => buildClientSchema(data))
  .then(schema => { /* do what you do now with the schema */ });

what do you think?

Plans for colocation?

First just would like to say this project looks great! Just wondering if there are plans to make colocation possible - this is a huge feature of Relay. And if so, where might be a good starting point to look into support?

Make support for multi-part queries easier

Use case:

  1. cashay.mutation('createFoo')
  2. cashay.query(getFoo)
  3. cashay.mutation('createBarForFooResult')
  4. cashay.query(getBarForFoo)

Neither query needs to call the server until foo is returned from the first mutation. So, if the query includes a required variable & that is not present, bail without an error.
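One way to express that bail-out rule (a sketch; the real check would live wherever Cashay validates variables before hitting the transport):

// returns false if the queryString declares a non-null variable that has no value yet
const hasRequiredVariables = (queryAST, variables = {}) => {
  const [operation] = queryAST.definitions;
  return (operation.variableDefinitions || []).every(def => {
    const required = def.type.kind === 'NonNullType';
    const name = def.variable.name.value;
    return !required || variables[name] !== undefined;
  });
};

// if (!hasRequiredVariables(queryAST, options.variables)) return; // no server call, no error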

Consider a `setTransport` method on the singleton

The singleton is usually created where the store is.
That singleton may need authentication for its transport, eg

headers: {
  'Content-Type': 'application/json',
  Authorization: `Bearer ${authToken}`
}

The user may not have that authToken yet, as they'll have to log in to get the token.
When they log in, they'll get a token, but the transport won't have it, so we'll have to replace the old transport with the new.
Today, we can do that like cashay.transport = new MyTransport().
That's not bad, but it seems kinda hacky since that property isn't explicit in the API.
At the cost of making the API bigger, I could add a method,
eg cashay.setTransport(new MyTransport()).

Side note: Internally, the mutation's refetch call will need to call this.getTransport() instead of this.transport so the stale one can be GC'd away.

Thoughts? @jordanh @simenbrekken
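A sketch of the two methods on the class (trivial, but it makes the API explicit):

class Cashay {
  setTransport(transport) {
    this.transport = transport; // swap in the authenticated transport after login
  }
  getTransport() {
    // mutations & refetches look the transport up lazily so the stale one can be GC'd
    return this.transport;
  }
}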

Add tests for the Cashay class

right now we're kinda flying blind. All the supporting modules are pretty well covered, but we gotta write some tests for the query/mutate methods & all the dependency tracking.

Add a HTTPTransport class

API:

import {HTTPTransport} from 'cashay' // allows for tree shaking

// the HTTP constructor is identical to relay's API
const httpTransport = new HTTPTransport('/foo', init)

array handling: deleting and moving

currently, merging arrays works great for newsfeeds, but not great for things like a kanban. That's because when arrays are merged, they follow a basic set of rules:

  • if the document already exists in the array, overwrite the old doc fields with new ones, maintaining position
  • otherwise, append the new doc to the array

For a kanban, we'd expect different handling. For example:

  • if something is deleted, remove it from the original array
  • if something is moved, move it

Currently, this is manageable by using hash tables {1: {foo: 'bar'}}, but that requires some back-end schema changes. It'd be nicer if it worked with arrays, too.

Attempt writing a webpack plugin to create the clientSchema

The loader is great, I love it, but I'm not in love with it.
I think using a webpack plugin to create the schema might be a little cleaner. That way, instead of having a loader string in your require statement, you'd just have something like
import clientSchema from '../build/clientSchema.json'.
And then in the webpack config, maybe something like

new CashayPlugin({
  path: path.join(root, 'build'), 
  filename: 'assets.json',
  graphql: 'graphql/graphql.js',
  rootSchema: 'server/graphql/schema.js',
  oncomplete:  'server/utils/drainPool.js'
})

pagination

GraphQL has no concept of pagination; to it, it's just a bunch of normal, everyday args.
Cashay should be opinionated, yet DB agnostic.
RethinkDB offers 3 recipes: https://www.rethinkdb.com/docs/cookbook/javascript/#pagination

  • skip and limit (good)
  • slice and limit (better)
  • between and limit (best, uses secondary index)

To achieve a between and limit, we'll need:

  • an array of args, which will be a superset of the orderBy args.
  • a document ID for a cursor (defined as after or before)
  • a limit (defined as first or last)

cashay needs to treat pagination as a child of the query args.
eg 2 queries:

  • query 1 is for 1 food article that isn't expired, ordered by expiration date, skipping the first 3
  • query 2 is for 2 food articles that aren't expired, reverse ordered by expiration date, skipping the last 2
{
  // args = {expired, orderBy, first, last, after, before} the last 4 are removed from the keys 
  // since they are reserved pagination words 
  getSomeFood: {
    // non-pagination args stored as the map KEY
    "{expired: false, orderBy: 'expirationDate'}": {
      // only used if 'after' or 'first' is an arg
      front: [undefined, undefined, undefined, 'cheese123'],
      // only used if 'before' or 'last' is an arg
      back: [undefined, undefined, 'noodles22', 'cheese123'],
      // when front & back converge, they are replaced with a 'full' array & 'hasNextPage' becomes true
      full: [undefined, undefined, undefined, 'cheese123', 'noodles22', undefined, undefined],
    }
  },
  // All food does not require pagination (since it's ALL)
  getAllFood: {
    "{expired: false, orderBy: 'expirationDate'}": {
      full: ['pizza', 'pie', 'lemons', 'cheese123', 'noodles22', 'queso', 'chorizo']
    }
  },
  // A single item is presented as an object, not an array
  getFirstFoodItem: {
    "noArgs": 'pizza',
    "{expired: true}": 'spaghetti'
  },
  getLastFoodItem: {
    "noArgs": 'chorizo'
  }
}

V8 currently starts deoptimizing sparse arrays at 100,000 documents (IIRC), so even a few thousand undefineds is OK.

The final step is convergence from front to back. As seen above, item 4 is cheese123, which is in both arrays. That means we are guaranteed that there are 7 items total. If we know the item count and cursor, we can easily calculate hasNextPage and hasPreviousPage. No need to store it.

We could either set a flag like hasConverged = true or we could create a new full array. I like the new 'full' array idea because otherwise I'd have to create a new array for every fetch, and being immutable, that aint cheap.

for array queries that do not trigger pagination by using the 4 reserved keywords (before, after, first, last), results go directly to full.

...unless I'm missing something huge where this is gonna make a turd, this is a solid 3.5x easier than how relay works.

query pruning

Given a query:

`query {
  getComment(id: "foo123") {
    id,
    body,
    createdAt
  }
}`

and a store that contains all required fields, it should not send a request to the server.

Given the same query & a store that contains the doc but not body, it should only fetch body. The id field should be added locally by the parser as a special exception if an arg has key == 'id'.

This will require adding scalars and non-nodes to the queryString.schema.
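A toy version of the pruning decision, ignoring nesting (illustrative only):

// given the fields a query asks for and the doc already in the store,
// only the missing fields need to go to the server; id is always filled in locally
const missingFields = (requestedFields, storedDoc) =>
  requestedFields.filter(field => field !== 'id' && storedDoc[field] === undefined);

missingFields(['id', 'body', 'createdAt'], {id: 'foo123', createdAt: 1466000000});
// -> ['body']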

Support normalizing non-nodes

Not sure if there's a benefit to normalizing nested objects that don't have an id. There are 2 cases where this could happen:

  1. The nested object has an ID and the query didn't request it. In this case, the dev wouldn't use the nested object without its parent. An updated query that doesn't include the nested obj wouldn't cause any harm.
  2. The nested object doesn't have an ID, which means it doesn't have a relation to any other entity. In that case, no other table could mutate it without going through its parent...

Seems like normalizing non-nodes doesn't provide any value. But, Relay does it & they must have a good reason...

advanced minimization (low priority)

For each nested entity x1, x2, ...xn, the complexity of the request is the sum-product. For example, with getPosts { comments { replies } }, if we got 5 posts and each post had 10 comments and each comment had 1 reply, we'd have 50 comments and 50 replies.

Currently, if Cashay gets the above query & the only thing missing is an updatedAt field on the reply, I'd still have to fetch everything because, since that reply isn't fulfilled, the comment isn't fulfilled, which means the post isn't fulfilled. GraphQL doesn't know how to say "oh, you just need 1? lemme get that".

This could be avoided if each entity has its own query function like getPostsByIds(ids: ['123']).
Currently, to minimize a query, a field is removed if and only if that field is satisfied for every entity in the List.
Cashay could be smart enough to acknowledge that we have the id of everything we need & create its own series of queries.

First, the dev would have to make a series of queries that accept an array of IDs. They could be assisted by a function that takes in the schema, but ultimately, since database resolves are involved, it'd need a human touch.

Next, during the minimization part, we detect if the number of missing docs is <= requestedDocs -1. If so, ditch that part of the minimized query & follow it up with a query just for the id.

Support TTL on a per-field basis

Every prop on the server GraphQL schema should be given a __ttl field that is automatically requested by cashay. When received by the client, an expiresAt field is generated on the parent object and the earliest time trickles up to the parent array, parent query, etc. Expiry runs during the GC cycle, about every 5 mins: expired data is removed from the normalized data, which invalidates the denormalized response and triggers a series of refreshes that only occur during a redux listener call, so things out of view stay deleted. That means no unnecessary data is fetched and we can keep data in a persisted state. It also fails gradually, so it's "progressive".

CC @wenzowski

Change babel strategy (low priority: for HUGE production apps only)

Right now babel is used to write a function call that will create a normalization schema. This is useful for avoiding the large load of the clientSchema, but these function calls can be 10x bigger than the fragments they replace.
A very basic app has a clientSchema that is 1.5Kb gzipped (removed description, etc. fields from the introspectionQuery). Even if a production app were 10x bigger, it'd still be fine to ignore.

HOWEVER, if it's facebook-sized, a 100Kb clientSchema could exist. In this scenario, I could use babel + webpack.
Step 1: for each type in the clientSchema, write it to a .json where the name is type.name.value.
Step 2: In babel, parse the queryString into an AST. Make a list of all the types used. Map those to require statements where each one adds itself to a prop of a types constant.
Step 3: Reference the types constant directly.

This is superior to the current (relay) strategy in a few ways:

  • it eliminates bloat of each babelfied TaggedTemplateString
  • it eventually loads the entire clientSchema onto the client, loading each type only once
  • by loading per-require, if the client needs the whole clientSchema (eg graphiql), it can request all requires and only receive the diff, reducing transmission
  • webpack can handle where to split the requires per page

create reducer factory

Currently, the entire cache is stored in a single cashay reducer. This won't scale since any change to any document causes the entire cashay tree branch to be rewritten.

Instead, it should act similarly to a store: for each key in entities it should create a reducer. That way, if only 1 type changes, we only rewrite 1 branch.

Not sure, but the technique might be employed here: https://github.com/rackt/redux/tree/master/examples/real-world
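Roughly what the factory might produce (the action type & payload shape are assumptions):

import {combineReducers} from 'redux';

// one reducer per entity Type, so a change to Post only produces a new Post branch
const makeEntityReducer = typeName => (state = {}, action) => {
  if (action.type !== 'CASHAY_MERGE' || !action.payload.entities[typeName]) return state;
  return {...state, ...action.payload.entities[typeName]};
};

const makeEntitiesReducer = typeNames => combineReducers(
  typeNames.reduce((reducers, typeName) => {
    reducers[typeName] = makeEntityReducer(typeName);
    return reducers;
  }, {})
);

// const entitiesReducer = makeEntitiesReducer(['Post', 'Author', 'Comment']);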
