
Comments (27)

Cito avatar Cito commented on May 20, 2024 6

@qeternity This issue tracker only covers GraphQL-core v3. And please understand that these are all projects that various people work on for free in their spare time. You can't expect people to work on issues; they may currently have other priorities in life. The only way to move forward is to contribute: send PRs, become a maintainer of stalled projects, and motivate other people.


ktosiek avatar ktosiek commented on May 20, 2024 3

@Hellzed wow, I've confirmed your results. There's a big difference between 3.6 and 3.7, but 3.8 is even faster!

I've looked at the test_execute_introspection_query benchmark, and it went from ~756ms on 3.6 to ~265ms on 3.8. This is such a huge difference that isawaitable() might not be the bottleneck anymore; I'll need new flamegraphs now :-)

Benchmark results:

version                                            time per loop
Python 3.8.0    (default, Nov 21 2019, 20:31:38)   370 nsec
Python 3.7.5rc1 (default, Oct 8 2019, 16:47:45)    444 nsec
Python 3.6.9    (default, Nov 21 2019, 21:00:04)   1.07 usec

The benchmark:

$ python3 -VV
Python 3.6.9 (default, Nov 21 2019, 21:00:04) 
[GCC 9.2.1 20191008]
$ python3 -mtimeit -s $'from inspect import isawaitable \nclass A: pass \na = A()' 'isawaitable(a)'
1000000 loops, best of 3: 1.07 usec per loop


Hellzed avatar Hellzed commented on May 20, 2024 2

@ktosiek I used stock Ubuntu 19.10 python packages, so I'm not sure which compile flags were used.

If you intend to make a GraphQL service built on graphql-core-3 faster, I would recommend targeting the parsing and validation steps, and making sure your clients always use variables for any changing value in the query.

This means that instead of using graphql.graphql, you should call graphql.parse, graphql.validate and graphql.execute separately, and introduce a caching step (with Redis or memcached, for example) in between. This would also allow you to leverage libgraphqlparser if you're comfortable with CFFI or ctypes.
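To make that concrete, here is a minimal sketch of the split parse/validate/execute flow with a cache in front of parsing and validation. The helper name and the in-process dict cache are mine, just for illustration; a Redis or memcached layer would sit where the dict is.

```python
from graphql import GraphQLSchema, execute, parse, validate

# Hypothetical in-process cache keyed by the raw query string;
# a shared cache (Redis, memcached) would replace this dict.
_document_cache = {}

def execute_cached(schema: GraphQLSchema, query: str, variables=None):
    document = _document_cache.get(query)
    if document is None:
        # Parse and validate only on a cache miss.
        document = parse(query)
        errors = validate(schema, document)
        if errors:
            raise errors[0]
        _document_cache[query] = document
    # Execution itself still runs every time; only parsing and
    # validation are skipped for repeated queries.
    return execute(schema, document, variable_values=variables)
```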


Cito avatar Cito commented on May 20, 2024 2

I have now implemented the ideas mentioned above - this improves the performance considerably. The benchmark with v3.1.0b2 is now running faster than with v2. Please let me know how it works for you and whether we can close this issue.


ktosiek avatar ktosiek commented on May 20, 2024 2

I can confirm it runs faster than v2, tested on Python 3.7 and 3.8. I think this issue is solved. Thank you, this will help a lot with migrating to v3!


Cito avatar Cito commented on May 20, 2024 1

So far the focus has been on compatibility, correctness, and being up to date with GraphQL.js.

The next step will be to improve performance. There is probably a lot of room for improvement. I'd like to keep the code simple and very close to the GraphQL.js implementation though, in order to make it easier to stay in sync with changes over there, and not move us into a very performant, but inscrutable and unmaintainable dead end.

Maybe we can inline some of the isawaitable() calls, or split some of the code paths for sync/async to avoid such checks.
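As a rough illustration of the "split the code paths" idea (the helper names are hypothetical, not code from the library): check for an awaitable once at the boundary and keep a purely synchronous fast path otherwise.

```python
from inspect import isawaitable

def complete(value, on_done):
    # Check once at the boundary instead of sprinkling checks everywhere.
    if isawaitable(value):
        # Only awaitable results pay for an async wrapper.
        async def await_and_complete():
            return on_done(await value)
        return await_and_complete()
    # Synchronous fast path: no coroutine objects, no event loop involved.
    return on_done(value)
```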

But before we start the optimization, we need good benchmark tools. Your script would be a starting point. My idea is to port the existing benchmarks from GraphQL.js and maybe add some more Python-specific ones if needed (see issue #3). I think it makes sense to work on that first, and at least have some basic performance checks in place before changing the code.

It's on my list, just need some free time to work on this. Feel free to send contributions as PRs as well.


Cito avatar Cito commented on May 20, 2024 1

Update: We now have nice benchmarks and can continue working on performance.


Cito avatar Cito commented on May 20, 2024 1

While I said JS and Python are not that different, there is one difference that plays a role here: in JS you can await any expression, but in Python you can only await awaitables. That's why we need so many isawaitable calls and nested async functions, which is not very efficient. Maybe we can find a better solution.
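A tiny runnable illustration of that difference:

```python
import asyncio

async def main():
    # In JavaScript, `await 42` simply evaluates to 42. In Python, awaiting
    # a non-awaitable raises a TypeError at runtime:
    try:
        await 42
    except TypeError as exc:
        print(exc)  # object int can't be used in 'await' expression

asyncio.run(main())
```

So the executor has to ask isawaitable() about every resolver result before deciding whether to await it, instead of awaiting unconditionally as the JS implementation can.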


ewhauser avatar ewhauser commented on May 20, 2024 1

Caching the parsing and validation definitely helps, but the bulk of the time (especially for larger objects) is spent in execution. The best approach would be to introduce either an AOT or JIT compiler like https://github.com/zalando-incubator/graphql-jit that generates runtime code for optimal execution.


Cito avatar Cito commented on May 20, 2024 1

Another idea: Maybe we should add an isawaitable parameter to the execute function. By default we could use the simplified, faster isawaitable check mentioned above, and maybe it could even be set to a function that always returns False when execute is called by the graphql_sync function to make that path even faster.
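A sketch of what that could look like (the function names and the complete_field helper are mine, not the shipped API):

```python
def fast_is_awaitable(value):
    # Simplified default check: cheaper than the abc-based isinstance
    # that inspect.isawaitable performs.
    return hasattr(value, "__await__")

def never_awaitable(_value):
    # What a purely synchronous entry point like graphql_sync could inject,
    # reducing the per-field check to a constant-returning call.
    return False

def complete_field(result, is_awaitable=fast_is_awaitable):
    # Hypothetical executor internals: the check function is a parameter
    # instead of a hard-coded call to inspect.isawaitable.
    if is_awaitable(result):
        async def awaited():
            return await result
        return awaited()
    return result

# Synchronous path: no awaitable detection beyond a trivial function call.
print(complete_field("plain value", is_awaitable=never_awaitable))
```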


qeternity avatar qeternity commented on May 20, 2024 1

@Cito Just had a look over the commit... nice, looks great! I haven't had a chance to run it yet, but will sit down with it tonight. I had already put aside some weekend quarantine time to start implementing some more aggressive hot paths.


qeternity avatar qeternity commented on May 20, 2024

> I'd like to keep the code simple and very close to the GraphQL.js implementation though, in order to make it easier to stay in sync with changes over there, and not move us into a very performant, but inscrutable and unmaintainable dead end.

Nice job so far on this project, thanks for all your efforts. I worry that keeping the implementation similar between two very different languages and interpreters is going to keep this project from being scalable. We're using this for a new project, and are considering returning to DRF because, for relatively small calls, we're struggling to get latencies below 300ms (profiling shows nearly all of the time is spent inside graphql-core). PyPy reduces this by about half, but that means backporting a few repos to 3.6, which I did to test it out but am not keen to maintain, especially with 3.8 just around the corner.


Cito avatar Cito commented on May 20, 2024

If it really makes a big difference, we can certainly implement some performance-critical parts differently. But actually JS and Python are not that different, and as I wrote above, I believe there is still a lot of room for improvement while keeping the implementation similar. Anybody who wants to work on this: feel free to send PRs and suggestions.


qeternity avatar qeternity commented on May 20, 2024

Thanks @Cito. Indeed, my point about Python vs. JS was that Python has been sync from day one with async bolted on, whereas JS has been the other way around. As @ktosiek has done a very good job of demonstrating, the isawaitable calls are very expensive and are only necessary because of the differences you mention.


Hellzed avatar Hellzed commented on May 20, 2024

@ktosiek Which versions of Python have you tried this on? I found huge improvements from Python 3.6 to Python 3.8 (isawaitable() on 3.8 seems to run almost three times as fast as on 3.6).



ktosiek avatar ktosiek commented on May 20, 2024

@Hellzed this issue started as a result of digging into cases like graphql-python/graphene#268 - with many objects returned by one API call. Caching the parsed query would help, but the bulk of the time is spent in execution.


Hellzed avatar Hellzed commented on May 20, 2024

@ktosiek I actually wondered about the use of isawaitable() when I saw it comes from the inspect module, and remembered some old advice: "don't use inspect, it's slow". Never needed to verify that before.

How does Python 3.8 fare when running both Graphene v2 and Graphene v3 benchmarks? Does it just make both faster, or does it bring v3 closer to v2?


qeternity avatar qeternity commented on May 20, 2024

I've just run this simple benchmark against the latest v2 and v3 branches, and we still see a performance regression of nearly 30% (18.5 s vs 24.3 s).


Cito avatar Cito commented on May 20, 2024

@qeternity Can confirm it's currently still about 30% slower, even with Python 3.8.

I can also confirm that the main problem is still the isawaitable calls, particularly in the complete_value_catching_error method. And within the isawaitable calls, the isinstance check against collections.abc.Awaitable is the biggest time waster.

When I replaced that isinstance check in the inspect module with a simple hasattr(object, '__await__'), the benchmark ran faster than with the v2 branch.

Maybe we can live with such a "simplified" isawaitable method? It's currently unclear to me why and when simply checking for the __await__ attribute does not suffice, such that an isinstance check against collections.abc.Awaitable would be necessary. Any Python gurus here who can shed light on this, or who have some time to discuss this on the Python mailing list and report back?
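For reference, a self-contained comparison of the two checks (the helper name is mine). One case the simplified check misses, as far as I can tell, is legacy generator-based coroutines created with @types.coroutine, which have no __await__.

```python
from inspect import isawaitable
from timeit import timeit

def simple_isawaitable(value):
    # Simplified check: look for __await__ directly instead of going through
    # the collections.abc.Awaitable isinstance machinery.
    return hasattr(value, "__await__")

class Plain:
    pass

obj = Plain()
# timeit runs one million calls by default, so the totals are seconds per 1M calls.
print("inspect.isawaitable:", timeit(lambda: isawaitable(obj)))
print("hasattr __await__:  ", timeit(lambda: simple_isawaitable(obj)))
```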

Note: We must keep in mind that the above benchmark only tests the path where all resolvers are synchronous. If we start to optimize, we must also measure with a benchmark that exercises asynchronous code, in order not to degrade performance for that case.
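A minimal async-path benchmark along those lines might look like this; the trivial schema, resolver, and iteration count are just for illustration.

```python
import asyncio
import time

from graphql import (GraphQLField, GraphQLObjectType, GraphQLSchema,
                     GraphQLString, graphql)

async def resolve_hello(_obj, _info):
    return "world"  # asynchronous resolver, no real I/O

schema = GraphQLSchema(
    query=GraphQLObjectType(
        "Query", {"hello": GraphQLField(GraphQLString, resolve=resolve_hello)}
    )
)

async def bench(n=1000):
    # Time n full graphql() calls on the async execution path.
    start = time.perf_counter()
    for _ in range(n):
        await graphql(schema, "{ hello }")
    return (time.perf_counter() - start) / n

print(f"{asyncio.run(bench()) * 1e6:.1f} µs per query")
```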


qeternity avatar qeternity commented on May 20, 2024

There's also this issue, which I opened some time ago only for it to fall on deaf ears: syrusakbary/promise#82


qeternity avatar qeternity commented on May 20, 2024

@Cito thanks, I absolutely don't expect anything. I have spent an incredible amount of time profiling and benchmarking this. With respect to the scope of this issue tracker, the OP's benchmark is against Graphene, not graphql-core. I wanted to make sure we were all aware of performance regressions in libraries that Graphene depends on, so as not to chase ghosts in this issue tracker.


Cito avatar Cito commented on May 20, 2024

@qeternity Graphene is only a wrapper around the GraphQL-core types. The query execution and the heavy lifting happen in GraphQL-core, so that's in fact the right place to optimize execution performance. The promise library mentioned above is only used in GraphQL-core v2 (Graphene v2), which is considered deprecated; that's why you don't see much activity there any more.


qeternity avatar qeternity commented on May 20, 2024

@Cito yep, I'm aware. But if this delta between graphene v2 and v3 is closing (which is ultimately what we're benchmarking here) due to regressions in graphene v2 (vis-a-vis Promises regressions) instead of improvements in v3, I presumed that was relevant to our discussion. But your point is well taken.


Cito avatar Cito commented on May 20, 2024

@qeternity Ah ok, now I understand your point. You were actually saying that you observed a v2 regression in the same benchmark we are discussing here, and since we are comparing against v2 as our baseline, we should be aware of that. That's in fact relevant to our discussion.

It's interesting that you observed an update of the promise lib affecting this (synchronous) benchmark. I had a quick look - the main change in that update is something related to caching, fixing an out-of-memory error. Maybe it's just slower because of the additional garbage collection. Benchmarking is not easy. This benchmark only measures the time in this one special case, not "the" performance in general.


ewhauser avatar ewhauser commented on May 20, 2024

Just to add on here - when using async, the performance gains appear to be quite large due to the elimination of promises. This example highlights the performance gains pretty well, as it returns a large number of results (5x iterations).

Here is graphql-core/Graphene v2:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      210    0.407    0.002   49.608    0.236 /usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py:1675(_run_once)
   180304    0.208    0.000   46.979    0.000 /usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/events.py:86(_run)
   180304    0.329    0.000   46.771    0.000 {method 'run' of 'Context' objects}
   180060    0.097    0.000   25.413    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/promise/promise.py:478(resolve)
315125/180065    0.564    0.000   25.316    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/promise/promise.py:189(_resolve_callback)
    90035    0.084    0.000   25.052    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/promise/promise.py:841(handle_future_result)
405130/180050    0.625    0.000   24.412    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/promise/promise.py:229(_fulfill)
180045/90030    0.107    0.000   24.067    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/promise/async_.py:54(settle_promises)
180045/90030    0.219    0.000   24.012    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/promise/async_.py:36(_async_settle_promise)
180050/90035    0.115    0.000   23.858    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/promise/async_.py:127(queue_tick)
    90035    0.045    0.000   23.766    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/promise/schedulers/immediate.py:9(call)
    90035    0.175    0.000   23.722    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/promise/async_.py:119(drain_queues)
   180070    0.661    0.000   23.516    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/promise/async_.py:71(drain_queue)
315105/180045    0.403    0.000   22.259    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/promise/promise.py:457(_settle_promises)
315105/180045    0.177    0.000   21.887    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/promise/promise.py:334(_settle_promise0)
315110/180050    0.473    0.000   21.781    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/promise/promise.py:306(_settle_promise)
135065/105040    0.162    0.000   21.180    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/promise/promise.py:345(_settle_promise_from_handler)
   135065    0.111    0.000   19.625    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/promise/promise.py:84(try_catch)
    90025    0.085    0.000   19.430    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/graphql/execution/executor.py:531(<lambda>)
225065/90030    0.863    0.000   19.345    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/graphql/execution/executor.py:500(complete_value)
    45010    0.061    0.000   17.934    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/graphql/execution/executor.py:700(complete_object_value)
    45015    0.265    0.000   17.727    0.000 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/graphql/execution/executor.py:230(execute_fields)
       10    0.000    0.000   17.025    1.703 
... user code ...

vs 3:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      175    0.049    0.000   15.466    0.088 /usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py:1675(_run_once)
      270    0.000    0.000   14.621    0.054 /usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/events.py:86(_run)
      270    0.001    0.000   14.621    0.054 {method 'run' of 'Context' objects}
... user code ...
 45015/10    0.141    0.000    2.268    0.227 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/graphql/execution/execute.py:361(execute_fields)
 90025/15    0.223    0.000    2.268    0.151 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/graphql/execution/execute.py:522(resolve_field)
       10    0.000    0.000    2.268    0.227 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/graphql/execution/execute.py:630(await_completed)
       10    0.000    0.000    2.268    0.227 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/graphql/execution/execute.py:615(await_result)
 135040/5    0.260    0.000    2.267    0.453 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/graphql/execution/execute.py:660(complete_value)
  45010/5    0.024    0.000    2.267    0.453 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/graphql/execution/execute.py:899(complete_object_value)
  45010/5    0.045    0.000    2.267    0.453 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/graphql/execution/execute.py:934(collect_and_execute_subfields)
105025/15    0.094    0.000    2.267    0.151 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/graphql/execution/execute.py:599(complete_value_catching_error)
        5    0.024    0.005    2.266    0.453 /Users/ewhauser/.venvs/drone_web/lib/python3.7/site-packages/graphql/execution/execute.py:744(complete_list_value)

It does appear that document validation is 2x slower in the newer version of core, but I'll open a separate issue. Nice work!


Cito avatar Cito commented on May 20, 2024

@ewhauser Thanks for the feedback. Yes, it would be good to open a new issue for performance improvements in validation. Maybe you can run a profiler and identify some bottlenecks already.

