vektah / dataloaden

go generate based DataLoader

License: MIT License
go run github.com/vektah/dataloaden UserLoader int ./getting_started/gqlgen-todos/graph/model.User
unable to gofmt: /Users/Luca.Paterlini/Documents/Save24thJuly2020/STUDYFUN/go_study/utils/gqlgen/dataloader/userloader_gen.go:6:1: expected 'IDENT', found 'import'
exit status 2
I'm trying to generate a dataloader that uses a struct type as a key instead of a primitive type. I'm using gqlgen, and all my models are generated automatically. My project structure:
internal/
  graphql/
    generated/
      executor_gen.go
      models_gen.go
      ...
    models/
      loaderkey.go
Contents of loaderkey.go:
package models

type LoaderKey struct {
	ID     int
	Fields []string
}
Command to generate dataloader:
cd internal/graphql/generated
go run github.com/vektah/dataloaden \
AuthorLoader \
*my.site/project/internal/graphql/models.LoaderKey \
*my.site/project/internal/graphql/generated.Author
Dataloaden successfully generates the authorloader_gen.go dataloader, but there is a problem:
// Code generated by github.com/vektah/dataloaden, DO NOT EDIT.
package generated
import (
"sync"
"time"
)
// AuthorLoaderConfig captures the config to create a new AuthorLoader
type AuthorLoaderConfig struct {
	// Fetch is a method that provides the data for the loader
	Fetch func(keys []*LoaderKey) ([]*Author, []error)
	// Wait is how long wait before sending a batch
	Wait time.Duration
	// MaxBatch will limit the maximum number of keys to send in one batch, 0 = not limit
	MaxBatch int
}
<...>
As you can see, the fetch func uses my LoaderKey type, but its package is not imported. How can this be fixed?
The lock is released between Clear and Prime. If another goroutine calls Load or Prime during this window, the primed value will not be used.
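A minimal stand-in (not the generated code itself) reproducing the shape of the problem: Clear and Prime each take the mutex separately, so the pair is not atomic and a concurrent Prime can win the slot in the gap.

```go
package main

import (
	"fmt"
	"sync"
)

// cache mimics the locking shape of the generated loader: Clear and Prime
// each lock and unlock on their own, so Clear-then-Prime is not atomic.
type cache struct {
	mu sync.Mutex
	m  map[int]string
}

func (c *cache) Clear(k int) {
	c.mu.Lock()
	delete(c.m, k)
	c.mu.Unlock()
}

// Prime writes only if the key is absent, returning whether it wrote.
func (c *cache) Prime(k int, v string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, found := c.m[k]; found {
		return false
	}
	c.m[k] = v
	return true
}

func main() {
	c := &cache{m: map[int]string{1: "old"}}
	c.Clear(1)
	c.Prime(1, "other") // simulates another goroutine sneaking in between
	ok := c.Prime(1, "new")
	fmt.Println(ok, c.m[1]) // the later Prime lost the race: false other
}
```

An atomic ClearAndPrime (or holding the lock across both operations) would close the window.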
Panics that happen in the fetch function are not recovered, which means that a panic takes down the entire process. This could be something that I handle myself in every fetch function, but it seems like something that the generated code should handle.
I can think of a few approaches:
I think I'd lean toward the first, but thought I'd start a discussion. I'm happy to open a pull request once we decide on an approach.
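For illustration, the recover-in-a-wrapper approach might look roughly like this (a sketch with hypothetical names, not the generated code): the panic is recovered and surfaced as one error per key instead of crashing the process.

```go
package main

import "fmt"

type User struct{ ID int }

// safeFetch wraps a user-provided fetch function so that a panic inside it
// is recovered and converted into per-key errors.
func safeFetch(fetch func([]int) ([]*User, []error)) func([]int) ([]*User, []error) {
	return func(keys []int) (users []*User, errs []error) {
		defer func() {
			if r := recover(); r != nil {
				// Replace the results with a nil value and an error per key.
				users = make([]*User, len(keys))
				errs = make([]error, len(keys))
				for i := range errs {
					errs[i] = fmt.Errorf("fetch panicked: %v", r)
				}
			}
		}()
		return fetch(keys)
	}
}

func main() {
	f := safeFetch(func(keys []int) ([]*User, []error) {
		panic("boom")
	})
	_, errs := f([]int{1, 2})
	fmt.Println(len(errs), errs[0]) // 2 fetch panicked: boom
}
```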
Hi,
Thanks for the good work.
I was wondering: how would one implement pagination with dataloaden (or dataloaders more generally)?
For instance, with ordersByCustomer we currently have:
// 1:M loader
ldrs.ordersByCustomer = &OrderSliceLoader{
	wait:     wait,
	maxBatch: 100,
	fetch: func(keys []int) ([][]Order, []error) {
		var keySql []string
		for _, key := range keys {
			keySql = append(keySql, strconv.Itoa(key))
		}
		fmt.Printf("SELECT * FROM orders WHERE customer_id IN (%s)\n", strings.Join(keySql, ","))
		time.Sleep(5 * time.Millisecond)
		// ...
		return orders, errors
	},
}
Could ordersByCustomer become something like this:
// 1:M loader
ldrs.ordersByCustomer = &OrderSliceLoader{
	wait:     wait,
	maxBatch: 100,
	fetch: func(keys []int, last int, from Date) ([][]Order, []error) { // <-- HERE
		var keySql []string
		for _, key := range keys {
			keySql = append(keySql, strconv.Itoa(key))
		}
		fmt.Printf("SELECT * FROM orders WHERE customer_id IN (%s) AND created_at > (%s) LIMIT (%d)", strings.Join(keySql, ","), from, last) // <-- HERE
		time.Sleep(5 * time.Millisecond)
		// ...
		return orders, errors
	},
}
Or do I need to hack something together using the context example?
Regards,
David.
I noticed that a lot of examples seem to construct new instances in the handlers. I would prefer to create my dataloaders while initialising the server.
This way I can avoid using context.
Can this be done? Can I embed a single instance of the dataloaders into the resolver struct to be reused across requests, or does it need to be created every time? Will it keep track of which requests requested which IDs?
It would be nice if you could add a flag to change the name of the generated loader, e.g. dataloaden -keys string github.com/somestuff/user.Configuration -name UserConfigurationLoader
I have simple structs:
package models

type Article struct {
	ID      int       `json:"id"`
	Title   string    `json:"title"`
	Authors []*Author `json:"authors"`
}

type Author struct {
	ID       int    `json:"id"`
	FullName string `json:"fullName"`
}
I get an error when trying to generate dataloaders:
go run github.com/vektah/dataloaden AuthorLoader int *git.rn/gm/service-articles/internal/models.Author
unable to gofmt: /home/pioneer/sources/gm/service-articles-go/internal/dataloaders/authorloader_gen.go:6:1: expected 'IDENT', found 'import'
exit status 2
What am I doing wrong?
package generator
import "text/template"
var tpl = template.Must(template.New("generated").
Funcs(template.FuncMap{
"lcFirst": lcFirst,
}).
Parse(`
// Code generated by github.com/vektah/dataloaden, DO NOT EDIT.
package {{.Package}}
import (
"context"
"sync"
"time"
{{if .KeyType.ImportPath}}"{{.KeyType.ImportPath}}"{{end}}
{{if .ValType.ImportPath}}"{{.ValType.ImportPath}}"{{end}}
)
// {{.Name}}Config captures the config to create a new {{.Name}}
type {{.Name}}Config struct {
// Fetch is a method that provides the data for the loader
Fetch func(ctx context.Context, keys []{{.KeyType.String}}) ([]{{.ValType.String}}, []error)
// Wait is how long wait before sending a batch
Wait time.Duration
// MaxBatch will limit the maximum number of keys to send in one batch, 0 = not limit
MaxBatch int
}
// New{{.Name}} creates a new {{.Name}} given a fetch, wait, and maxBatch
func New{{.Name}}(config {{.Name}}Config) *{{.Name}} {
return &{{.Name}}{
fetch: config.Fetch,
wait: config.Wait,
maxBatch: config.MaxBatch,
}
}
// {{.Name}} batches and caches requests
type {{.Name}} struct {
// this method provides the data for the loader
fetch func(ctx context.Context, keys []{{.KeyType.String}}) ([]{{.ValType.String}}, []error)
// how long to done before sending a batch
wait time.Duration
// this will limit the maximum number of keys to send in one batch, 0 = no limit
maxBatch int
// INTERNAL
// lazily created cache
cache map[{{.KeyType.String}}]{{.ValType.String}}
// the current batch. keys will continue to be collected until timeout is hit,
// then everything will be sent to the fetch method and out to the listeners
batch *{{.Name|lcFirst}}Batch
// mutex to prevent races
mu sync.Mutex
}
type {{.Name|lcFirst}}Batch struct {
keys []{{.KeyType}}
data []{{.ValType.String}}
error []error
closing bool
done chan struct{}
}
// Load a {{.ValType.Name}} by key, batching and caching will be applied automatically
func (l *{{.Name}}) Load(ctx context.Context, key {{.KeyType.String}}) ({{.ValType.String}}, error) {
return l.LoadThunk(ctx, key)()
}
// LoadThunk returns a function that when called will block waiting for a {{.ValType.Name}}.
// This method should be used if you want one goroutine to make requests to many
// different data loaders without blocking until the thunk is called.
func (l *{{.Name}}) LoadThunk(ctx context.Context, key {{.KeyType.String}}) func() ({{.ValType.String}}, error) {
l.mu.Lock()
if it, ok := l.cache[key]; ok {
l.mu.Unlock()
return func() ({{.ValType.String}}, error) {
return it, nil
}
}
if l.batch == nil {
l.batch = &{{.Name|lcFirst}}Batch{done: make(chan struct{})}
}
batch := l.batch
pos := batch.keyIndex(ctx, l, key)
l.mu.Unlock()
return func() ({{.ValType.String}}, error) {
<-batch.done
var data {{.ValType.String}}
if pos < len(batch.data) {
data = batch.data[pos]
}
var err error
// its convenient to be able to return a single error for everything
if len(batch.error) == 1 {
err = batch.error[0]
} else if batch.error != nil {
err = batch.error[pos]
}
if err == nil {
l.mu.Lock()
l.unsafeSet(key, data)
l.mu.Unlock()
}
return data, err
}
}
// LoadAll fetches many keys at once. It will be broken into appropriate sized
// sub batches depending on how the loader is configured
func (l *{{.Name}}) LoadAll(ctx context.Context, keys []{{.KeyType}}) ([]{{.ValType.String}}, []error) {
results := make([]func() ({{.ValType.String}}, error), len(keys))
for i, key := range keys {
results[i] = l.LoadThunk(ctx, key)
}
{{.ValType.Name|lcFirst}}s := make([]{{.ValType.String}}, len(keys))
errors := make([]error, len(keys))
for i, thunk := range results {
{{.ValType.Name|lcFirst}}s[i], errors[i] = thunk()
}
return {{.ValType.Name|lcFirst}}s, errors
}
// LoadAllThunk returns a function that when called will block waiting for a {{.ValType.Name}}s.
// This method should be used if you want one goroutine to make requests to many
// different data loaders without blocking until the thunk is called.
func (l *{{.Name}}) LoadAllThunk(ctx context.Context, keys []{{.KeyType}}) (func() ([]{{.ValType.String}}, []error)) {
results := make([]func() ({{.ValType.String}}, error), len(keys))
for i, key := range keys {
results[i] = l.LoadThunk(ctx, key)
}
return func() ([]{{.ValType.String}}, []error) {
{{.ValType.Name|lcFirst}}s := make([]{{.ValType.String}}, len(keys))
errors := make([]error, len(keys))
for i, thunk := range results {
{{.ValType.Name|lcFirst}}s[i], errors[i] = thunk()
}
return {{.ValType.Name|lcFirst}}s, errors
}
}
// Prime the cache with the provided key and value. If the key already exists, no change is made
// and false is returned.
// (To forcefully prime the cache, clear the key first with loader.clear(key).prime(key, value).)
func (l *{{.Name}}) Prime(key {{.KeyType}}, value {{.ValType.String}}) bool {
l.mu.Lock()
var found bool
if _, found = l.cache[key]; !found {
{{- if .ValType.IsPtr }}
// make a copy when writing to the cache, its easy to pass a pointer in from a loop var
// and end up with the whole cache pointing to the same value.
cpy := *value
l.unsafeSet(key, &cpy)
{{- else if .ValType.IsSlice }}
// make a copy when writing to the cache, its easy to pass a pointer in from a loop var
// and end up with the whole cache pointing to the same value.
cpy := make({{.ValType.String}}, len(value))
copy(cpy, value)
l.unsafeSet(key, cpy)
{{- else }}
l.unsafeSet(key, value)
{{- end }}
}
l.mu.Unlock()
return !found
}
// Clear the value at key from the cache, if it exists
func (l *{{.Name}}) Clear(key {{.KeyType}}) {
l.mu.Lock()
delete(l.cache, key)
l.mu.Unlock()
}
func (l *{{.Name}}) unsafeSet(key {{.KeyType}}, value {{.ValType.String}}) {
if l.cache == nil {
l.cache = map[{{.KeyType}}]{{.ValType.String}}{}
}
l.cache[key] = value
}
// keyIndex will return the location of the key in the batch, if its not found
// it will add the key to the batch
func (b *{{.Name|lcFirst}}Batch) keyIndex(ctx context.Context, l *{{.Name}}, key {{.KeyType}}) int {
for i, existingKey := range b.keys {
if key == existingKey {
return i
}
}
pos := len(b.keys)
b.keys = append(b.keys, key)
if pos == 0 {
go b.startTimer(ctx, l)
}
if l.maxBatch != 0 && pos >= l.maxBatch-1 {
if !b.closing {
b.closing = true
l.batch = nil
go b.end(ctx, l)
}
}
return pos
}
func (b *{{.Name|lcFirst}}Batch) startTimer(ctx context.Context, l *{{.Name}}) {
time.Sleep(l.wait)
l.mu.Lock()
// we must have hit a batch limit and are already finalizing this batch
if b.closing {
l.mu.Unlock()
return
}
l.batch = nil
l.mu.Unlock()
b.end(ctx, l)
}
func (b *{{.Name|lcFirst}}Batch) end(ctx context.Context, l *{{.Name}}) {
b.data, b.error = l.fetch(ctx, b.keys)
close(b.done)
}
`))
I'm looking for a pattern or recipe for using dataloaden with gin. Is there an established approach?
Is there a feature to clean up the cache on a timer, or is the only way to clear it to call the Clear func on the loader explicitly? If it's the latter, what do you think about adding timer-based cache clearing?
Hi,
I wanted to double-check whether the order of results returned from the database matters. For example, Postgres doesn't guarantee the order of results for an IN query, and looking through the generated code it looks like it expects the results to match the order of the keys it sends.
If it does, any thoughts on what the best practice should be here? Maybe the fetch function should return a map[key]value instead of a slice?
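Order does matter: the generated thunk reads batch.data[pos] by key position (see the template above). The common fix today is to re-align rows inside the fetch function; a sketch, with an illustrative User type:

```go
package main

import "fmt"

type User struct {
	ID   int
	Name string
}

// alignByKey rebuilds the result slice in key order, since the database may
// return IN (...) rows in any order. Keys with no matching row stay nil.
func alignByKey(keys []int, rows []*User) []*User {
	byID := make(map[int]*User, len(rows))
	for _, u := range rows {
		byID[u.ID] = u
	}
	out := make([]*User, len(keys))
	for i, k := range keys {
		out[i] = byID[k]
	}
	return out
}

func main() {
	rows := []*User{{ID: 3, Name: "c"}, {ID: 1, Name: "a"}} // DB order
	out := alignByKey([]int{1, 2, 3}, rows)
	fmt.Println(out[0].Name, out[1] == nil, out[2].Name) // a true c
}
```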
Thanks!
The goal is to pass the field collection context (GetOperationContext) into the dataloader. In the resolvers we pass the context into the retriever, but it doesn't contain the info we need to get the fields of the query at that level. I tried to change that at the middleware layer, but without success. Do you know if there is a way to pass that context to the dataloaders?
FYI, the error happens when I try to pass the field collection of the nested model into the dataloader: panic: missing operation context
When I was passing the context in the resolver, I expected to have all the info to get the fields via GetOperationContext and pass them to the dataloader.
{
  rootQuery(name: ["Alex"]) {
    name
    email
    tel
    ProductsInfo {
      id
      name
      description
    }
  }
}
go run github.com/vektah/dataloaden ObjectLoader string map[string]interface{}
unable to gofmt: [hidden]/dataloader/someloader_gen.go:125:2: expected expression (and 8 more errors)
exit status 2
Is this supported, or are only slices supported?
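An untested workaround sketch, mirroring the wrapper-type trick reported elsewhere in these issues for time.Time: give the map a named type so the command-line argument contains no brackets or braces. This assumes dataloaden resolves named types through its package loader; the module path in the go:generate line is a placeholder, not a real path.

```go
package main

import "fmt"

// Properties gives the map a plain identifier the generator can be pointed
// at, instead of the literal map[string]interface{} syntax.
type Properties map[string]interface{}

//go:generate go run github.com/vektah/dataloaden ObjectLoader string yourmodule/models.Properties

func main() {
	p := Properties{"a": 1}
	fmt.Println(p["a"]) // the named type behaves exactly like the map
}
```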
Hi, I have some cases where I need the context. For example, I'm using the library to make external requests to other APIs, and each logged-in user has their own access key (which is in the context); there's no way I can get it inside the fetch function.
My users are set using GraphQL directives because some resolvers don't require users to be logged in.
Can we add that?
Is there any potential for separating the main package from the actual package code, so that one could create a custom binary by importing the package instead of having to run go get?
"wait" and "maxBatch" are lowercased which doesn't allow to package _gen files together and have the middleware in another package .
Fields that start with lower case characters are package internal and not exposed, to reference the field from another package it needs to start with an upper case character
I've so far been using graphql in node and have used the facebook dataloader quite a bit. I have some go services that I want to add a graphql api to. Gqlgen seems quite legit and I've read good things about it so far. However the way this dataloader works doesn't feel right. The facebook dataloader in js doesn't have such a weird implementation where you have to wait an arbitrary amount of time to resolve the data. Did you have trouble finding a solution in go that doesn't involve timeouts to know when to resolve the data? I would like to explore this a bit as it's currently preventing me from deciding to use this library. Can you maybe give some info on what you tried and how you decided that timeouts is the best solution? I think it would be good to include this in the readme, as I'm sure I'm not the only one that has concerns about this.
Generate a dataloader with target type time.Time. Example given: //go:generate dataloaden fooLoader int *time.Time
Generating fails with:
❯ go-test go generate ./... 16:31:36
validation failed: packages.Load: /home/vanjiii/dev/src/junk/go-test/fooloader_gen.go:9:2: time redeclared in this block
/home/vanjiii/dev/src/junk/go-test/fooloader_gen.go:7:2: other declaration of time
exit status 1
main.go:10: running "go": exit status 1
The generated file fooloader_gen.go:
// Code generated by github.com/vektah/dataloaden, DO NOT EDIT.
package main
import (
"sync"
"time"
"time"
)
// rest of file...
Expected: the generation completes. A workaround is to wrap time.Time in a local named type:
type Time struct {
	time.Time
}

//go:generate dataloaden fooLoader int *Time
For example, if you query the GitHub GraphQL API with the following, the result is returned taking first and after into account.
{
  viewer {
    repositories(first: 30) {
      nodes {
        issues(first: 30, after: "Y3Vyc29yOnYyOpHOAp96sw==") {
          nodes {
            title
          }
        }
      }
    }
  }
}
Presumably, if I implement a similar API with gqlgen, I should use a dataloader for issues.
However, I can only pass keys to the dataloader.
How should I pass the first, after, etc. information?
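One common workaround, since the loader batches only on its key: fold the pagination arguments into a comparable key struct (names here are illustrative), so each distinct (repository, first, after) combination batches and caches separately.

```go
package main

import "fmt"

// IssuesKey folds pagination arguments into the loader key. It contains only
// basic comparable types, which the generated cache map requires.
type IssuesKey struct {
	RepositoryID int
	First        int
	After        string
}

func main() {
	a := IssuesKey{RepositoryID: 1, First: 30, After: "Y3Vyc29yOnYyOpHOAp96sw=="}
	b := IssuesKey{RepositoryID: 1, First: 30, After: "Y3Vyc29yOnYyOpHOAp96sw=="}
	fmt.Println(a == b) // true: identical keys hit the same cache slot
}
```

The trade-off is that batching only helps when many fields in one request share the exact same pagination arguments.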
This is almost the same issue reported in #30.
dataloaden picks up a package name from the current working directory when generating a file.
But if no packages are found in the working directory, the generated file has an empty package name and is broken.
I got the error below (same as #30) when I tried to generate in a newly created directory.
$ pwd
/path/to/example.com
$ mkdir dataloader; cd dataloader
$ go run github.com/vektah/dataloaden UserLoader uint64 *example.com/model.User
unable to gofmt:/path/to/example.com/dataloader/userloader_gen.go:6:1: expected 'IDENT', found 'import'
exit status 2
It would be very useful to have (maybe generate-time optional) tracing hooks in the generated code. It's very difficult to reason about data loader performance and tracing is the first step to doing it.
I see no recent commits. Is it reliable to use this project in production?
I upgraded dataloaden to the latest version, v0.3.0, and it started showing the following error.
$ dataloaden UserSliceLoader int []*github.com/vektah/dataloaden/example.User
zsh: no matches found: []*github.com/vektah/dataloaden/example.User
I don't know whether this is a problem with my environment, but I did succeed in generating dataloaders for normal structs and other self-defined types.
$ dataloaden UserLoader int github.com/vektah/dataloaden/example.User
For now I avoid this error by defining new types for the slice, slice of pointers, and pointer, like below.
type UserSlice []User
type UserPointerSlice []*User
type UserPointer *User
My environment is MacOS, zsh, go 1.12.5.
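A likely cause worth noting: the "no matches found" message comes from zsh itself, which treats []* as a glob pattern and aborts before dataloaden even runs. Quoting the type argument should pass it through literally (a sketch, using the path from the report above):

```shell
# zsh expands [ ], *, and ? as glob characters; single quotes suppress that,
# so the literal type expression reaches the program unchanged.
TYPE='[]*github.com/vektah/dataloaden/example.User'
echo "$TYPE"
# then: dataloaden UserSliceLoader int "$TYPE"
```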
What is the best approach if I need to pass arguments to my dataloader?
I've seen 2 issues on here where people suggest using anonymous structs but that feels pretty janky for a few reasons.
I could also see a solution of just throwing it into context but that also doesn't feel right because we lose type safety.
Is there a better officially supported approach?
We should be able to change the name of the generated file.
https://github.com/vektah/dataloaden/blob/master/pkg/generator/generator.go#L87
So every single data loader usage that deals with a DB has essentially the same key-to-result mapping going on inside. Since this is exactly the same for every data loader, why not add a flag to generate it as well?
For some members of our team, re-generation removes parts that have to do with config. For example:
// MediaLoaderConfig captures the config to create a new MediaLoader
type MediaLoaderConfig struct {
	// Fetch is a method that provides the data for the loader
	Fetch func(keys []uint64) ([]*graphql.Media, []error)
	// Wait is how long wait before sending a batch
	Wait time.Duration
	// MaxBatch will limit the maximum number of keys to send in one batch, 0 = not limit
	MaxBatch int
}

// NewMediaLoader creates a new MediaLoader given a fetch, wait, and maxBatch
func NewMediaLoader(config MediaLoaderConfig) *MediaLoader {
	return &MediaLoader{
		fetch:    config.Fetch,
		wait:     config.Wait,
		maxBatch: config.MaxBatch,
	}
}
Do you have any idea why this might be happening?
I have these entities:
type Player struct {
	ID        int
	CreatedAt time.Time
	City      City
	CityID    int
	Team      *Team
	TeamID    *int
	Score     int
}
type Player {
  id: ID!
  createdAt: Time!
  City: City!
  Team: Team
  Score: Int!
}
As you can see:
- City is mandatory
- Team (and TeamID in the Go struct) is NOT mandatory, so I'm using a pointer type (*)
What I don't understand is how to use dataloaden (a GraphQL dataloader) with these. Here is the CityById loader, generated with go run github.com/vektah/dataloaden CityByIdLoader int *my_project/entities.City:
CityById: CityByIdLoader{
	maxBatch: 100,
	wait:     1 * time.Millisecond,
	fetch: func(keys []int) ([]*entities.City, []error) {
		cities, err := myRepo.CitiesByKeys(keys)
		// directly ordered in DB based on keys' order: it works
		return cities, []error{err}
	},
},
What about []*int for TeamID? Here's the TeamById loader, generated with go run github.com/vektah/dataloaden TeamByIdLoader *int *my_project/entities.Team:
TeamById: TeamByIdLoader{
	maxBatch: 100,
	wait:     1 * time.Millisecond,
	fetch: func(keys []*int) ([]*entities.Team, []error) {
		teams, err := myRepo.TeamsByKeys(keys)
		// I cannot order in DB because of `NULL` keys, right?
		// How to order these results?
		return teams, []error{err}
	},
},
I don't understand how to re-order teams by keys when I cannot do it in the DB because of NULL values in keys.
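One way out (a sketch; the Team shape follows the entities above, and alignTeams is my name, not generated code) is to do the re-ordering in Go rather than SQL: fetch rows for the non-nil IDs, then align, letting nil keys map to a nil Team.

```go
package main

import "fmt"

type Team struct {
	ID   int
	Name string
}

// alignTeams matches fetched teams back to possibly-nil *int keys.
// A nil key (a player without a team) simply yields a nil *Team.
func alignTeams(keys []*int, rows []*Team) []*Team {
	byID := make(map[int]*Team, len(rows))
	for _, t := range rows {
		byID[t.ID] = t
	}
	out := make([]*Team, len(keys))
	for i, k := range keys {
		if k != nil {
			out[i] = byID[*k]
		}
	}
	return out
}

func main() {
	id := 7
	rows := []*Team{{ID: 7, Name: "reds"}}
	out := alignTeams([]*int{nil, &id}, rows)
	fmt.Println(out[0] == nil, out[1].Name) // true reds
}
```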
I was just wondering why this depends on a v0.0.0 pseudo-version of golang.org/x/tools and not something more recent?