
Comments (14)

Dreamsorcerer avatar Dreamsorcerer commented on May 25, 2024

See https://docs.aiohttp.org/en/stable/web_advanced.html#graceful-shutdown

My suggestion is that long-lived tasks should be created independently, as per: https://docs.aiohttp.org/en/stable/web_advanced.html#complex-applications
Any task created before the application starts serving requests will be excluded from the wait step and will not hold the application open.

I'd use aiojobs for tasks that are expected to run for a short period of time in the background.
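The cleanup_ctx pattern from those docs can be sketched without a real server. Below, the cleanup-context async generator is driven by hand the way aiohttp drives it (once at startup, once at cleanup); a plain dict stands in for web.Application, and the results list only exists to record the shutdown order:

```python
import asyncio

results = []

async def background_task() -> None:
    try:
        while True:
            await asyncio.sleep(0.01)
    except asyncio.CancelledError:
        results.append("task cancelled")
        raise

async def run_background(app):
    # Startup half: create the task *before* the app starts serving,
    # so aiohttp's shutdown wait will skip it.
    task = asyncio.create_task(background_task())
    yield
    # Cleanup half: cancel the task and wait for it to finish.
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    results.append("context cleaned up")

async def main() -> None:
    app = {}  # stand-in for web.Application in this sketch
    ctx = run_background(app)
    await ctx.__anext__()      # what aiohttp does on startup
    await asyncio.sleep(0.05)  # the app "serves" here
    try:
        await ctx.__anext__()  # what aiohttp does on cleanup
    except StopAsyncIteration:
        pass

asyncio.run(main())
```

With a real app you would register the generator via app.cleanup_ctx.append(run_background); the manual __anext__ calls above mimic what aiohttp does internally.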

from aiohttp.

novoxd avatar novoxd commented on May 25, 2024

@Dreamsorcerer thank you for the reply; that totally makes sense.
But it seems like there is a bug, or I am doing something wrong:

import asyncio
import time

from typing import AsyncIterator

from aiohttp import web
from aiojobs.aiohttp import spawn, AIOJOBS_SCHEDULER
from aiojobs import Scheduler

import logging
logging.basicConfig(level=logging.DEBUG)

async def back_task():
    st = time.time()
    while True:
        print(f'I am running for {int(time.time() - st)} seconds')
        try:
            await asyncio.sleep(1)
        except asyncio.CancelledError:
            print(f'After {int(time.time() - st)} seconds finished by asyncio cancel')
            return

async def handler(request):
    await spawn(request, back_task())
    return web.Response(text="ok")


scheduler: Scheduler = None
async def create_scheduler():
    global scheduler
    scheduler = Scheduler()
asyncio.run(create_scheduler())

aiohttp_app = web.Application()

async def cleanup_context(app: web.Application) -> AsyncIterator[None]:
    app[AIOJOBS_SCHEDULER] = scheduler
    yield
    await scheduler.close()
aiohttp_app.cleanup_ctx.append(cleanup_context)

aiohttp_app.router.add_get('/', handler)
web.run_app(aiohttp_app)

The output is the same: it only exits after the 60 s timeout. Moreover, I don't like using a global variable for this.
I am wondering: what if aiohttp had a three-step cleanup-context logic? The second step would be triggered on the stop signal, so all background tasks could terminate successfully, something like:

async def run_other_task(_app):
    task = asyncio.create_task(other_long_task())
    yield
    task.cancel() # or more complicated logic
    yield
    with suppress(asyncio.CancelledError):
        await task  # Ensure any exceptions etc. are raised.

app.cleanup_3stage_ctx.append(run_other_task)

I'd use aiojobs for tasks that are expected to run for a short period of time in the background.

Makes sense. But what counts as a short period of time, and what as long? It depends. In my case I have a background task that can run for 5-10 minutes and is triggered by an incoming request. It has cancel logic so it can exit gracefully, so I don't see any obstacle to using aiojobs for it.

--
In addition, for anyone else reading: I want to point out that this issue is not caused by an asyncio.Task from back_task running at the moment of termination.


novoxd avatar novoxd commented on May 25, 2024

I was wrong at the end of my last comment. It is happening because there is an asyncio.Task(back_task) running in the background that was launched after the aiohttp server started. On shutdown, aiohttp picks it up and waits up to the 60 s timeout for it to end, which will never happen. I thought the blocker was aiojobs._wait_failed, but because that task is created in cleanup_ctx, aiohttp does not wait on it.

Any task created before the application starts serving requests will be ignored from the wait step and not hold the application open.

@Dreamsorcerer it will not help because of ^. That's why my second example failed.

I can accept that this is an aiojobs issue, but an aiohttp modification could help with this case and others: third-party packages need a way to receive an additional signal before aiohttp starts waiting for background tasks to finish.


Dreamsorcerer avatar Dreamsorcerer commented on May 25, 2024

In my case I have a background task that can run for 5-10 minutes and will be triggered by incoming request.

Yes, if this is started after the application is running, that won't work. I'm not clear what the task is doing, so I'm wondering if it doesn't make more sense to have a task started when the application starts up and then feed updates to it via a Queue or similar.
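That Queue-based shape might look roughly like this (pure asyncio; script_worker and the payload names are invented for illustration):

```python
import asyncio

processed = []

async def script_worker(queue: asyncio.Queue) -> None:
    # Long-lived worker: created once at startup, so aiohttp's
    # shutdown wait would skip it; handlers only enqueue work.
    while True:
        job = await queue.get()
        processed.append(job)  # run the actual script here
        queue.task_done()

async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(script_worker(queue))

    # A request handler would do: await queue.put(payload)
    await queue.put("script-1")
    await queue.put("script-2")

    await queue.join()  # wait until both jobs are handled
    worker.cancel()     # on shutdown: stop the worker
    try:
        await worker
    except asyncio.CancelledError:
        pass

asyncio.run(main())
```

In an aiohttp app, the queue and the worker task would be created in a cleanup_ctx generator, with the cancel-and-await half after its yield.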

If not, maybe we can create a method to add tasks to the set of skipped tasks manually?


novoxd avatar novoxd commented on May 25, 2024

I have a long-running sh script launched by a request. Whether it will be launched at all, and how many scripts run simultaneously, is unknown at startup. Creating a background task fed by a queue? Maybe, but that would duplicate logic that aiojobs already contains.

In my case, from what I understand, the simplest solution is to set the timeout to zero. But that is a so-so solution in the wider case.

@Dreamsorcerer what do you think about using a hypothetical three-stage context in aiojobs and other third-party packages? Like this. I think it would solve all problems of this type.

def setup(app: web.Application, **kwargs: Any) -> None:
    async def cleanup_context(app: web.Application) -> AsyncIterator[None]:
        app[AIOJOBS_SCHEDULER] = scheduler = Scheduler(**kwargs)
        yield
        closing = asyncio.create_task(scheduler.close()) # runs after ^C
        yield
        await closing

    app.cleanup_3stage_ctx.append(cleanup_context)


Dreamsorcerer avatar Dreamsorcerer commented on May 25, 2024

Hmm, this seems tricky, as it wouldn't be backwards compatible (existing jobs would now be cleaned up before waiting on running handlers) and it would add complexity that most cases don't need.

I'm still thinking we can add a method to put a task into the ignored list.


Dreamsorcerer avatar Dreamsorcerer commented on May 25, 2024

Simplest solution from me is set timeout to zero.

Yes, in the short term, setting the timeout to something smaller would be the best approach (though I'd still suggest giving it at least 1 second so handlers have a chance to send their responses).
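For reference, the timeout is configurable via run_app's shutdown_timeout parameter (the default is 60 seconds), so for the example above the wait could be cut down with something like:

```python
# Give in-flight handlers 1 second to finish before
# cancelling everything that is still running.
web.run_app(aiohttp_app, shutdown_timeout=1.0)
```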


ralphm avatar ralphm commented on May 25, 2024

This PR reworked the shutdown procedure and I believe it is not entirely correct either, particularly the definition of runner.shutdown_callback itself inside _run_app. It currently reads:

    starting_tasks: "WeakSet[asyncio.Task[object]]" = WeakSet(asyncio.all_tasks())
    runner.shutdown_callback = partial(wait, starting_tasks, shutdown_timeout)

I have two questions here for @Dreamsorcerer:

  1. At this point asyncio.all_tasks() will always include _run_app itself. It doesn't seem reasonable to wait on itself here. Should it not be excluded (by checking asyncio.current_task(loop))?
  2. Why does this not attempt to call .cancel() on the task, instead of waiting for the given timeout? Long-running tasks requiring a wait can still do this by (temporarily) shielding CancelledError.


Dreamsorcerer avatar Dreamsorcerer commented on May 25, 2024
1. At this point `asyncio.all_tasks()` will always include `_run_app` itself.

No it doesn't (it'd be pretty obvious if it was wrong...):
107dd24#diff-dee5c5266d56f6e644698f884e7e75550501b4302d6b87229695ea01bd031637R307-R309
107dd24#diff-dee5c5266d56f6e644698f884e7e75550501b4302d6b87229695ea01bd031637R314-R316

2. Why does this not attempt to call `.cancel()` on the task, instead of waiting for the given timeout? Long-running tasks requiring a wait can still do this by (temporarily) shielding `CancelledError`.

Again, refer to the documentation above. It gives time to allow currently running handlers to complete cleanly before cancelling them, thus avoiding users getting a dropped connection.


Dreamsorcerer avatar Dreamsorcerer commented on May 25, 2024
  1. At this point asyncio.all_tasks() will always include _run_app itself. It doesn't seem reasonable to wait on itself here. Should it not be excluded (by checking asyncio.current_task(loop))?

Wait, you've completely misread that code. That's a list of tasks to not wait on...


ralphm avatar ralphm commented on May 25, 2024

Wait, you've completely misread that code. That's a list of tasks to not wait on...

Gah, I see now that I got it wrong, and after a bit of digging I now understand how it works. I have the same issue as @novoxd: my application can spawn long-running tasks after the initial startup that I would like to have cancelled immediately upon graceful shutdown, so they can clean themselves up. Tasks created by aiojobs, e.g. via aiojobs.aiohttp.get_scheduler_from_app, will unfortunately be waited on just the same.


Dreamsorcerer avatar Dreamsorcerer commented on May 25, 2024

Yes, they are still just tasks. Again, I'm thinking that a reasonable solution would be to create a method to add tasks to the exclude set, though happy to hear other ideas.


Dreamsorcerer avatar Dreamsorcerer commented on May 25, 2024

Just thinking this over again: this could be achieved without any changes to aiohttp, right? Because on_shutdown is triggered before waiting on handlers, we can just do something like:

async def on_startup(app):
    app["scheduler"] = Scheduler()

async def on_shutdown(app):
    await app["scheduler"].close()

app.on_startup.append(on_startup)
app.on_shutdown.append(on_shutdown)

Using a second scheduler (with cleanup_ctx) if you want some tasks to cancel early and some not to.
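That two-group split can be illustrated without aiojobs at all. In the sketch below, plain tasks stand in for the jobs in each scheduler, and the log list only records the order in which they are cancelled:

```python
import asyncio

log = []

async def job(name: str) -> None:
    try:
        await asyncio.sleep(3600)  # pretend long-running work
    except asyncio.CancelledError:
        log.append(f"{name} cancelled")
        raise

async def main() -> None:
    # "early" jobs: closed in on_shutdown, *before* handlers are waited on
    early = asyncio.create_task(job("early"))
    # "late" jobs: closed in a cleanup_ctx, *after* handlers have finished
    late = asyncio.create_task(job("late"))
    await asyncio.sleep(0)  # let both tasks start running

    # what the on_shutdown handler would do
    early.cancel()
    await asyncio.gather(early, return_exceptions=True)

    # ... aiohttp would wait on in-flight handlers here ...

    # what the cleanup_ctx teardown would do
    late.cancel()
    await asyncio.gather(late, return_exceptions=True)

asyncio.run(main())
```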


Dreamsorcerer avatar Dreamsorcerer commented on May 25, 2024

We could add something to the documentation for this.

