Giter VIP home page Giter VIP logo

Comments (10)

Cito avatar Cito commented on May 18, 2024

My idea would be to port https://github.com/graphql/dataloader to Python, similarly to how GraphQL.js has been ported - it seems Syrus already started to work on this (https://github.com/syrusakbary/aiodataloader). Dataloader has seen a lot of new development recently, so this could be worthwile.

from graphql-core.

jnak avatar jnak commented on May 18, 2024

Nice! I didn't realize Syrus had made another version of Dataloader specifically for asyncio. He seems he's indeed using loop.call_soon to schedule batching on the next loop cycle.

Re the recent developments of Dataloader in JS, the task scheduling in Dataloader v2 uses the same underlying mechanism as in v1. Since Syrus' repo is a straight port from Dataloader JS v1, let's use it as a basis for the purpose of this conversation.

I see 2 issues to use it in a multi-threaded mode:

  1. It is not thread safe because the Dataloader queue is not thread-scoped. Scheduled coroutines could be moved to a different thread, leading to all sort of bugs and security issues. We could fix that by subclassing threading.local (vs object).

  2. It assumes it's run inside a single global event loop. We can remove the assumption that there is a single loop by not caching the value of get_event_loop(), but this will still assume this is run inside an event loop.

Unless we wrap each graphql call into its own event loop (see my snippet), I don't see how we can enable batching in multi-threaded mode. Do you have a sense if that's a crazy idea?

from graphql-core.

Cito avatar Cito commented on May 18, 2024

Currently I don't have enough time to look deeper into this and make a qualified assessment. Maybe you should examine what they are doing in other languages, e.g. graphql/dataloader#30.

from graphql-core.

jnak avatar jnak commented on May 18, 2024

Sounds good. I look into it when I start working on upgrading to v3 (probably not before Q2/Q3 2020. In the meantime, let's keep this issue open in case someone needs this earlier than I do so we can discuss it here.

If someone is interested in working on this, refer to syrusakbary/promise#81 to learn more about Dataloader and thread-safety in Python.

from graphql-core.

silas avatar silas commented on May 18, 2024

Looks like it might be possible to use asgiref's sync_to_async and aiodataloader to do this now.

import asyncio
import os
import time
os.environ['ASGI_THREADS'] = '16'

from aiodataloader import DataLoader
from asgiref.sync import sync_to_async


class TestLoader(DataLoader):
    @sync_to_async
    def batch_load_fn(self, keys):
        t = time.time()
        return [f'{key}-{t}' for key in keys]


async def main():
    loader = TestLoader()
    one1 = loader.load('one')
    one2 = loader.load('one')
    two = loader.load('two')
    print(await one1, await one2)
    print(await two)


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
    loop.close()

from graphql-core.

jnak avatar jnak commented on May 18, 2024

@silas I'm not sure why you need asgiref's sync_to_async here since aiodataloader.DataLoader.batch_load_fn is already async. Also aiodataloader.DataLoader is not thread-safe, so you'll run into concurrency issues if you call / await load from multiple threads.

If you want to make this work in a multi-threaded environment, I recommend that you make aiodataloader.DataLoader thread-safe by using threading.local (see syrusakbary/promise#81 for an example). Then you should be able to safely wrap the graphql call with my run_batched_query snippet above.

from graphql-core.

silas avatar silas commented on May 18, 2024

@jhgg batch_load_fn is a sync function in this case, which aiodataloader.DataLoader doesn't support (as far as I can tell).

You can't call load from the sync code (because it's an async function). You could use asgiref's async_to_sync which seems to hoist the call back up to the async thread (haven't confirmed), which should make it safe.

That said, I never call load from batch_load_fn or any of it's dependents, so it's fine for my use case.

Probably not a general solution, but it seems like it's possible to make it work now.

from graphql-core.

miracle2k avatar miracle2k commented on May 18, 2024

Why isn't the promise library supported anymore - or an equivalent? I want to use the dataloader pattern in a sync application. It's not even multi-threaded. All I want is to return an object from my resolver which will be resolved in a second tick, after the initial pass through execute. The promise library seemed to handle this use case nicely.

from graphql-core.

Cito avatar Cito commented on May 18, 2024

@miracle2k The async feature is a modern equivalent to promises. It's the modern and official solution to the problem and supported by both Python syntax and standard library. Promises are not contained in the standard library.

from graphql-core.

miracle2k avatar miracle2k commented on May 18, 2024

@Cito Ok, but I don't want to use async in this code base, but I do want to use the data loader. I think this was a reasonable thing to have been supported previously.

So I am aware that I can do the following, and indeed this is what I am doing now and it works:

  • Use the aiodataloader library. Implement the async def batch_load_fn function, but ignore the fact that it is async, and simply do a blocking call.
  • Do blocking calls to database and network in all other resolvers of the schema as well.
  • When calling into graphql to resolve a query, wrap the whole thing in an asyncio loop:
    import asyncio
    loop = asyncio.new_event_loop()
    async def run_async():
        return await execute(
            schema,
            document,
            variable_values=params.variables,
            operation_name=params.operation_name,
            **kwargs,
        )

    return loop.run_until_complete(run_async())

Now we are using asyncio as the "promise execution engine" for dataloader only, and otherwise the code is synchronous. I don't now how high the performance penalty is, but would guess it's minimal / similar to what it was with the promise library. It feels pretty gross though? If this is the recommend way to do the dataloader pattern in sync mode, ok.

Note for anyone attempting this: In my particular case this required changes to aiodataloader so it doesn't require an event loop to exist at __init__ time. but this can be avoided by simply moving the asyncio event loop higher up in the call stack.

from graphql-core.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.