
Comments (20)

mreineck commented on July 3, 2024

OK, it might be good to add a function like actual_backend() so that the current backend can be identified.
If you switch to the ducc backend, please make sure to select an appropriate value for nthreads; otherwise you'll be overcommitting your hardware, since you are already running in parallel on the caller side.
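
A minimal sketch of what this looks like in practice (assuming the select_preferred_backend signature of recent pyshtools releases; actual_backend() does not exist yet, it is only proposed above):

import pyshtools as pysh

# Prefer the ducc backend and pin it to one thread per transform,
# since the caller is already running one transform per thread.
pysh.backends.select_preferred_backend(backend='ducc', nthreads=1)

# Note: this reports the *preference* only; if the ducc0 package is
# missing, pyshtools may silently fall back to the shtools backend.
print(pysh.backends.preferred_backend())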

MarkWieczorek commented on July 3, 2024

I have used the Fortran routines with OpenMP in the past without any problems. Could you give me more info about when this fails and when it doesn't? Or does it fail even for the simplest tasks?

mreineck commented on July 3, 2024

If the problem is indeed FFTW-related, it's most likely not connected with ducc, since ducc doesn't use FFTW internally.

oanegros commented on July 3, 2024

Oh interesting! I had it fail when multiple threads were executing SHExpandGLQ calls on different datasets (with the same dimensions) at the same time. The failure was a segmentation fault with no clear error message. I found the FFTW thread-safety documentation and saw that the variables it discusses are modified in the SHExpandGLQ code.
I tried changing the backend to ducc as @mreineck suggested, but my segmentation faults remain.

The threading implementation I'm using is a Python rewrite of Java threading implemented in the larger software I'm writing within, so it may also have its own vulnerabilities?

I managed to work around it by wrapping the SHExpandGLQ call in an RLock (see the sketch below), so it's not very pressing for me, but it's odd that this happens at all, especially if the code is stable with OpenMP.
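
A minimal sketch of that workaround (the lock and wrapper names are illustrative, not from my actual code):

import threading

import pyshtools as pysh

_expand_lock = threading.RLock()

def sh_expand_glq_locked(grid, w, zero):
    # Serialize all SHExpandGLQ calls so that only one thread at a
    # time is inside the (possibly non-thread-safe) backend routine.
    with _expand_lock:
        return pysh.expand.SHExpandGLQ(grid, w, zero)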

MarkWieczorek commented on July 3, 2024

I don't use OpenMP very often, so I'll need to investigate...

mreineck commented on July 3, 2024

OK, I have an idea why the Fortran code cannot be called concurrently: in the function you linked there are arrays with the "save" attribute, which means that they are equivalent to static variables in C/C++, i.e. global state. Calling into such a function concurrently will almost certainly lead to undesired effects.

Not sure why you encounter this with the ducc backend as well, though ... it shouldn't have any global state.

Semi-related question: how do you manage to execute concurrent calls from Python? Doesn't the GIL prevent this?

mreineck commented on July 3, 2024

Sorry, I take that back: all variables marked save are also threadprivate, so this could in principle work.

On the other hand, threadprivate is an OpenMP-specific keyword, so I have no idea what happens if this is called from multiple non-OpenMP threads...

oanegros commented on July 3, 2024

I had to look it up to figure out how the software I'm using does this: I never release the GIL myself. I call SHExpandGLQ from multiple concurrent threads (which exist for other, non-GIL-related reasons outside of my code), and it seems that entering SHExpandGLQ releases the GIL somewhere, allowing another of my threads to progress until it collides inside pyshtools.
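
A rough illustration of the mechanism (time.sleep stands in here for a wrapped native call that releases the GIL):

import threading
import time

def task(name):
    print(name, 'entering native call')
    # time.sleep releases the GIL, so the other thread can proceed
    # and both end up "inside" the native call at the same time.
    time.sleep(0.1)
    print(name, 'back in Python')

threads = [threading.Thread(target=task, args=(n,)) for n in ('t1', 't2')]
for t in threads:
    t.start()
for t in threads:
    t.join()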

oanegros commented on July 3, 2024

The GIL release was proposed in #304 and is implemented here, it seems.

mreineck commented on July 3, 2024

Sure! I was just wondering how you can call the SHTOOLS Python interface from two concurrently running Python threads. I most likely have an oversimplified picture in my mind of what you are actually doing.

mreineck commented on July 3, 2024

To get a clearer picture: which function from the SHTOOLS Python interface are you calling exactly?

oanegros commented on July 3, 2024

The only function I have issues with is SHExpandGLQ; if I lock this function so that threads wait for each other, my code works fine.

oanegros commented on July 3, 2024

And I have to say I don't fully understand your question about having multiple concurrent Python threads; creating them does not require releasing the GIL (see, e.g., the multiprocessing.pool.ThreadPool implementation, illustrated below).
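
For example (a toy payload, just to show that plain Python threads run concurrently without my code ever releasing the GIL explicitly):

from multiprocessing.pool import ThreadPool

def work(i):
    # Pure-Python work; the GIL interleaves the threads automatically.
    return i * i

with ThreadPool(4) as pool:
    print(pool.map(work, range(10)))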

oanegros commented on July 3, 2024

I'll try to make a minimal working example soon 😄

mreineck commented on July 3, 2024

Thanks, that will certainly be very helpful!

oanegros commented on July 3, 2024

from concurrent import futures

import numpy as np
import pyshtools as pysh

lmax = 251
# A GLQ grid for degree lmax has dimensions (lmax + 1, 2*lmax + 1).
your_patches = np.random.randint(
    1, 1_000_000, size=(100, lmax + 1, 2 * lmax + 1)).astype(np.float64)

zero, w = pysh.expand.SHGLQ(lmax)
with futures.ThreadPoolExecutor(max_workers=10) as executor:
    jobs = [executor.submit(pysh.expand.SHExpandGLQ, patch, w, zero)
            for patch in your_patches]
    [fut.result() for fut in futures.as_completed(jobs)]

This breaks with a segmentation fault for me, but works with max_workers=1. The lmax of 251 is just my default grid size (~80π); it also breaks with the other lmax values I tested.

mreineck commented on July 3, 2024

OK, I can reproduce the segmentation fault when the standard shtools backend is used, but not with ducc. Do you have the ducc0 Python package installed? If not, I think the backend will be silently reverted to shtools.
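
A quick way to check this (a sketch; ducc0 is the package name on PyPI, and the select_preferred_backend call assumes the backend-selection API of recent pyshtools versions):

import pyshtools as pysh

try:
    import ducc0  # noqa: F401
    print('ducc0 is importable, so the ducc backend can really be used')
except ImportError:
    print('ducc0 missing: pyshtools will silently fall back to shtools')

pysh.backends.select_preferred_backend('ducc')
print(pysh.backends.preferred_backend())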

oanegros commented on July 3, 2024

Ah okay, I did not notice because print(pysh.backends.preferred_backend()) seemed to return ducc even when it was not available. With the ducc backend actually in use it does work 😄. I might switch to this backend for my project.
But I would still say that either the calls should be thread-safe or this behavior needs to be documented for the shtools backend.

oanegros commented on July 3, 2024

I now tried to implement this in my main project, but there I (currently) seem to be limited to pyshtools v4.9.1 and ducc0 v0.26.0 due to other dependencies. There the segmentation faults still happen with the preferred backend set to ducc. Is the backend not being set correctly, or is the thread safety of ducc a newer thing?

mreineck commented on July 3, 2024

As far as I know, there shouldn't have been any relevant ducc changes since 0.26, but we did a lot of tweaking of the backend-selection code inside pyshtools after v4.9.1. I'm not sure that this causes the difference, but I'd suspect it.
