
trio-parallel's People

Contributors

dependabot[bot], github-actions[bot], richardsheridan


trio-parallel's Issues

Test for zombie, memory, and pipe leaks

I had originally assumed that multiprocessing was going to take care of all this for us, but in #34 @zgoda claims that hypercorn --debug -k trio somehow leads to a semaphore leak on each run_sync invocation. Since we are doing Readme Driven Development and claim "no leaks", we should develop automated tests showing that there are no leaks across a suite of example usages and regression tests.
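
One possible shape for such a regression test, as a sketch assuming psutil as a test dependency (POSIX-oriented, since Windows has no zombies):

import psutil
import trio
import trio_parallel

async def check_no_zombie_children():
    await trio_parallel.run_sync(bool, None)
    zombies = [
        p for p in psutil.Process().children(recursive=True)
        if p.status() == psutil.STATUS_ZOMBIE
    ]
    # cached workers may legitimately linger; zombies may not
    assert not zombies, zombies

trio.run(check_no_zombie_children)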

new release

Please release the pending changes.

I'm mostly interested in the warning fixes for Python 3.12.

Blockers for 1.0.0 release

  • #89
  • API to control graceful shutdown, esp. atexit handler
  • Maybe split retire API into init/retire
  • Review of docs, especially intro, examples, api details
  • #170 or #171

Unclean shutdown on Ctrl-C

If my code calls any sync function in parallel, I get a stack trace printed in the terminal when exiting the program with Ctrl-C:

^CProcess trio-parallel worker process 0:
Traceback (most recent call last):
  File "/usr/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/jazg/work/chitty/venv/lib/python3.9/site-packages/trio_parallel/_proc.py", line 61, in _work
    barrier.wait(timeout=IDLE_TIMEOUT)
  File "/usr/lib/python3.9/threading.py", line 635, in wait
    self._wait(timeout)
  File "/usr/lib/python3.9/threading.py", line 670, in _wait
    if not self._cond.wait_for(lambda : self._state != 0, timeout):
  File "/usr/lib/python3.9/multiprocessing/synchronize.py", line 313, in wait_for
    self.wait(waittime)
  File "/usr/lib/python3.9/multiprocessing/synchronize.py", line 261, in wait
    return self._wait_semaphore.acquire(True, timeout)
KeyboardInterrupt

Python 3.9.2 on Ubuntu 20.04
Trio 0.18, trio-parallel 0.3
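
A minimal reproducer consistent with that traceback (my sketch, not the reporter's code): an idle cached worker parked at the barrier shares the terminal's process group, so Ctrl-C delivers SIGINT to it too.

import trio
import trio_parallel

async def main():
    await trio_parallel.run_sync(print, "worker is now cached and idling")
    await trio.sleep_forever()  # press Ctrl-C here

trio.run(main)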

RFC: Allow cache customization

The current cache is one-size-fits-all, but surely someone will want to experiment with fork or forkserver, or kill WorkerProcs after a certain number of jobs, or always obtain fresh workers. If subinterpreters become a thing, they should also be a choice for users to select.

I think a nice way to do this would be with contextvars and a sync context manager:

async def f():
    with trio_parallel.cache_scope(**options):
        await trio_parallel.run_sync(print, "using a fresh cache")

But certain options like telling the cache not to reuse a certain worker would be amenable to a simple keyword argument:

await trio_parallel.run_sync(fib, 100, fresh_worker=True, destroy_after=True)

I guess these are not exclusive, but I don't want to clutter the API... so I'm leaning towards only making keyword arguments in response to specific user demands.
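
A rough sketch of the contextvar plumbing behind such a cache_scope; _WorkerCache and its options are hypothetical stand-ins, not the real implementation:

import contextlib
import contextvars

class _WorkerCache:  # hypothetical stand-in for the real worker cache
    def __init__(self, **options):
        self.options = options

_cache = contextvars.ContextVar("trio_parallel_cache", default=_WorkerCache())

@contextlib.contextmanager
def cache_scope(**options):
    # swap in a fresh cache for the duration of the with-block;
    # run_sync would look up the active cache via _cache.get()
    token = _cache.set(_WorkerCache(**options))
    try:
        yield
    finally:
        _cache.reset(token)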

Reduce reliance on trio internals

The Windows pipes in particular pull a lot of bits and bobs out of private namespaces. These should be vendored, or better yet, made into an external trio IPC channels package.

import trio eagerly while keeping trio out of workers

In order to speed up worker process startup, we lazy-import trio everywhere. This stinks from a readability standpoint and makes #199 impossible. Some other method of keeping trio out of workers would be better. Ideas:

  • Make a separate _trio_parallel_workers package to hold worker behavior functions (see the sketch after this list)
    • This would definitely work but feels tacky
  • Run worker behavior off exec and stringified source
    • Not sure if it would even work and may ruin test coverage measurement
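
A sketch of the first idea, with a hypothetical _trio_parallel_workers module that never imports trio, so worker startup stays fast while trio_parallel itself imports trio eagerly:

# _trio_parallel_workers/__init__.py (hypothetical layout)
import pickle

def worker_loop(recv_conn, send_conn):
    # stdlib-only job loop: unpickle (fn, args), run it, pickle the result back
    while True:
        try:
            fn, args = pickle.loads(recv_conn.recv_bytes())
        except EOFError:
            return
        send_conn.send_bytes(pickle.dumps(fn(*args)))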

Consider ramifications of 'fork' multiprocessing context on linux

Since the forks happen in a thread, the surrounding trio run thread effectively "evaporates" in the forked child. Does that leave anything in a terrible state?

https://stackoverflow.com/questions/39890363/what-happens-when-a-thread-forks
https://stackoverflow.com/questions/6078712/is-it-safe-to-fork-from-within-a-thread
https://pythonspeed.com/articles/python-multiprocessing/

It seems as though any random library could induce deadlocks, which would need to be broken with SIGKILL. Technically, we have the machinery to cope with that, but spawning by default would be a cleaner story.
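
To make the hazard concrete, here is a stand-alone Linux demonstration (not trio-parallel code) of a fork child deadlocking on a lock whose owning thread "evaporated":

import multiprocessing
import threading
import time

lock = threading.Lock()

def hold_lock():
    with lock:
        time.sleep(10)

def child():
    with lock:  # the owning thread no longer exists in the child
        print("never reached")

if __name__ == "__main__":
    threading.Thread(target=hold_lock, daemon=True).start()
    time.sleep(0.1)  # make sure the lock is held when we fork
    p = multiprocessing.get_context("fork").Process(target=child)
    p.start()
    p.join(timeout=2)
    print("child deadlocked:", p.is_alive())
    p.kill()  # the SIGKILL mentioned above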

Make a pytest plugin

Would be nice to have a public fixture to shut down and clear the default cache. A fixture would make it clear that it shouldn't be used to reset the cache in normal code, whereas a plain context manager might make things confusing.

On the other hand, not everybody uses pytest. How can we accommodate them?
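
A sketch of the fixture shape; shutdown_default_cache is a hypothetical name, not an API trio_parallel exposes today:

import pytest
import trio_parallel

@pytest.fixture
def fresh_default_cache():
    yield
    # hypothetical public API; today this would have to poke private machinery
    trio_parallel.shutdown_default_cache()

(Exposing a plain function like that publicly would serve the non-pytest crowd as well.)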

hint about systemd subprocess termination

A colleague discovered that the default systemd KillMode (control-group) kills subprocesses indiscriminately. Changing it to mixed instead sends SIGTERM to the main process, followed by SIGKILL of the remaining subprocesses after a timeout.

https://www.freedesktop.org/software/systemd/man/systemd.kill.html

This was causing `BrokenWorkerProcessError: ('Graceful shutdown failed: ...')` errors from trio-parallel during shutdown.

Please consider adding a hint to the documentation, for systemd users.
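
For example, the hint could include a unit-file override along these lines (the 30-second timeout is an arbitrary illustration):

[Service]
KillMode=mixed
# SIGKILL stragglers this many seconds after the SIGTERM:
TimeoutStopSec=30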

changing default WorkerContext

It would be nice to have a way to change the default WorkerContext, for example:

trio_parallel.set_default_context(WorkerContext(idle_timeout=math.inf))

I'd like to change the idle timeout globally. Currently, I'd have to open a worker context at the root nursery, store it in a contextvar or something, and hope that everyone remembers to use that context.

The default parameters of WorkerContext are quite arbitrary, so I think it would make sense to have a way to change the default context.
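
For contrast, a sketch of the current workaround described above, assuming trio_parallel.open_worker_context and a module-level contextvar:

import contextvars
import math

import trio
import trio_parallel

WORKER_CTX = contextvars.ContextVar("worker_ctx")

async def main():
    async with trio_parallel.open_worker_context(idle_timeout=math.inf) as ctx:
        WORKER_CTX.set(ctx)
        # every call site must remember this instead of trio_parallel.run_sync
        await WORKER_CTX.get().run_sync(print, "patient worker says hi")

trio.run(main)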

Tracking issue: Multiprocessing pipe cleanup error

A pipe file descriptor can become invalid before it reaches the multiprocessing __del__ code on Python 3.11 on macOS.

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/connection.py", line 133, in __del__
    self._close()
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/connection.py", line 377, in _close
    _close(self._handle)
OSError: [Errno 9] Bad file descriptor

So far the context is not very informative.

Examples:

Fix release artifact uploads back to GitHub

Even though releasing 1.2.1 succeeded, there was a big red X at the end of the workflow:

https://github.com/richardsheridan/trio-parallel/actions/runs/6758319696/job/18369883458#step:5:7

I put in a commented-out fix here:

# contents: write # TODO: Is this the right permission to fix gh release upload?
steps:
  - name: Download build artifact
    uses: actions/download-artifact@v3
    with:
      name: Build
      path: dist
  - name: Publish package distributions to PyPI
    uses: pypa/gh-action-pypi-publish@release/v1
  # TODO: what permission do I need to get this to work?
  # - name: Upload to GitHub
  #   env:
  #     GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  #   run: |
  #     gh release upload ${{ github.ref_name }} dist/* --repo ${{ github.repository }}

based on some wild inferences from a Google search, but also on these refs in the GitHub Actions docs:

https://docs.github.com/en/rest/overview/permissions-required-for-github-apps?apiVersion=2022-11-28#repository-permissions-for-contents
https://docs.github.com/en/rest/releases/assets?apiVersion=2022-11-28#update-a-release-asset

although those would only apply if gh release upload actually uses that API endpoint and not who knows what else.

  • investigate effectiveness of fix
  • investigate implications of setting contents: write permission
