Comments (12)
This seems like a better fit for https://discourse.julialang.org/.
from julia.
Sorry, why? Doesn't this point at a problem with the task library? I can think of at least one reason: scheduling of tasks may be at fault (?)...
from julia.
Irrespective of whether this belongs here or not, can you try to provide a more minimal example?
from julia.
@carstenbauer That is hard. I tried, but for some reason much simpler setups did not reproduce this behaviour reliably. Even this example shows random variation: sometimes all the tasks run equally well. But start the sim one more time and some tasks will again lag behind the others (there are always two groups: fast tasks and slow tasks). It is almost as if some tasks had to wait for a thread to run on, even though there should be enough threads that none has to wait.
from julia.
It already is on discourse: https://discourse.julialang.org/t/parallel-assembly-of-a-finite-element-sparse-matrix/95947/
from julia.
> But start the sim one more time and some tasks will again lag behind the others
> (there are always two groups: fast tasks and slow tasks). It is almost as if some tasks
> had to wait for a thread to run on, even though there should be enough threads that none has to wait.
It seems I answered my own question:
Strategy A. When the computation is started with N threads and N-1 tasks, it seems that at some point one or more tasks cannot find a thread to run on, wait for one, and consequently finish late.
Strategy B. The following approach works much better: start Julia with as many threads as the machine has, and use however many tasks you wish (leaving a substantial number of threads unused). For instance, start Julia with 24 threads and use up to 16 tasks. Then all tasks finish at the same time.
I wonder why some of the tasks cannot find a thread to run on in Strategy A. What else does Julia run on those threads?
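To compare the two strategies it helps to confirm how many threads each pool really has. A minimal sketch, assuming Julia 1.9 or later (where `Threads.nthreads` accepts a pool name); `pool_sizes` is a hypothetical helper and not part of the MWE:

```julia
using Base.Threads

# Hypothetical helper: report the size of both thread pools.
# Assumes Julia >= 1.9, where nthreads accepts :default / :interactive.
function pool_sizes()
    return (default = Threads.nthreads(:default),
            interactive = Threads.nthreads(:interactive))
end

sizes = pool_sizes()
println("default pool:     ", sizes.default)
println("interactive pool: ", sizes.interactive)
```

Tasks created with a plain `Threads.@spawn` run on the default pool, so that is the number to compare against the task count.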
from julia.
Sounds like it would be better to continue this discussion on the Discourse thread, then. GitHub does not handle threads well.
from julia.
The problem appears to be real, wouldn't you say?
from julia.
Edit: Updated MWE:
module mwe_tasks_2
using Base.Threads
function work(r)
    s = 0.0
    for j in r
        s = s + exp(-(j - minimum(r))^2/(maximum(r)-minimum(r))^2)
    end
    s
end
function test(nchunks = 2)
    # @info "nchunks = $(nchunks)"
    # @info "nthreads.((:default, :interactive)) = $(Threads.nthreads.((:default, :interactive)))"
    # @info "maxthreadid = $(Threads.maxthreadid())"
    # @info "threadpool.(1:Threads.maxthreadid()) = $(threadpool.(1:Threads.maxthreadid()))"
    # @info "Threads.threadpoolsize.((:default, :interactive)) = $(Threads.threadpoolsize.((:default, :interactive)))"
    N = 200000000
    chincr = N / nchunks
    @assert nchunks * chincr == N
    chunks = [(((i-1)*chincr+1:i*chincr), i) for i = 1:nchunks]
    s = Float64[]
    start = time()
    Threads.@sync begin
        for ch in chunks
            Threads.@spawn let r = $ch[1], i = $ch[2]
                # @info "Chunk $(i), thread $(threadid()), $(Threads.threadpool(threadid())): Spawned $(time() - start)"
                push!(s, work(r))
                # @info "$(i): Finished $(time() - start)"
            end
        end
    end
    # @info "Finished $(time() - start)"
    # @show s
end
end
using Main.mwe_tasks_2;
mwe_tasks_2.test()
ts = []
for n = 1:10
    push!(ts, @elapsed mwe_tasks_2.test())
end
@show extrema(ts)
Over many repeated runs, the maximum run time is often twice the minimum, which suggests a scheduling problem.
from julia.
It does not sound real based on the current description. I think the usual approach is to divide the problem into about 4*nthreads() chunks, so that adequate load balancing is possible.
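A minimal sketch of that chunking idea, assuming the same kind of kernel as in the MWE. `chunked_sum` and `chunks_per_thread` are hypothetical names introduced here for illustration. Each task returns its partial result and the caller collects it with `fetch`, so no task writes to a shared vector:

```julia
using Base.Threads

# Kernel modelled on the MWE, with the range bounds hoisted out of the loop.
function work(r)
    lo, hi = first(r), last(r)
    s = 0.0
    for j in r
        s += exp(-(j - lo)^2 / (hi - lo)^2)
    end
    return s
end

# Hypothetical helper: split 1:N into about chunks_per_thread * nthreads() pieces,
# spawn one task per piece, and sum the per-task results.
function chunked_sum(N; chunks_per_thread = 4)
    nchunks = chunks_per_thread * nthreads()
    len = cld(N, nchunks)  # chunk length, rounded up
    tasks = [Threads.@spawn work(r) for r in Iterators.partition(1:N, len)]
    return sum(fetch, tasks)
end

chunked_sum(1_000_000)
```

With more chunks than threads, a thread that finishes early can pick up another chunk, so one delayed task costs a fraction of the total run time rather than doubling it. Note that each chunk is normalised by its own bounds, as in the MWE, so the total depends on the number of chunks.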
from julia.
@vtjnash I'm afraid I'm not following you. The workload in the MWE is perfectly distributed. I start enough threads so that each task can run on a thread, on a machine that has enough threads not to be oversubscribed. Yet apparently some tasks have to wait on a thread to run.
Would one consider this normal?
Creating more chunks means creating more tasks, which means spending more time setting up the parallel loop. Is that a good strategy?
from julia.
I think the @info prints may have messed with the timing. I will close this issue now.
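One way to take logging out of the picture is to have each task store its finish time in its own preallocated slot and inspect the numbers afterwards. A rough sketch under that assumption; `timed_tasks` and its simplified kernel are hypothetical stand-ins for the MWE, not the original code:

```julia
using Base.Threads

# Hypothetical helper: run nchunks equal-sized tasks and record, per task,
# the elapsed time at completion and the thread it finished on.
function timed_tasks(nchunks, n)
    finish = zeros(nchunks)     # one slot per task, so no shared push!
    tids = zeros(Int, nchunks)
    start = time()
    @sync for i in 1:nchunks
        Threads.@spawn begin
            s = 0.0
            for j in 1:n
                s += exp(-(j / n)^2)
            end
            finish[i] = time() - start
            tids[i] = threadid()  # snapshot only: tasks may migrate between threads
            s
        end
    end
    return finish, tids
end

finish, tids = timed_tasks(2, 10_000_000)
@show finish tids
```

If the finish times still fall into a fast group and a slow group with no `@info` calls inside the tasks, the logging was not the cause.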
from julia.