Comments (6)
Ah yes, it's the same behaviour as Will sketched out using a worker thread. I'll ask for some advice on this from our local friendly neighbourhood OpenMP runtime expert... Stand by!
from adaptivecpp.
The OpenMP specification (https://www.openmp.org/spec-html/5.0/openmpsu112.html) says:
"The omp_get_max_threads routine returns an upper bound on the number of threads that could be used to form a new team if a parallel construct without a num_threads clause were encountered after execution returns from this routine."
It only talks about future parallel constructs, not about current parallel constructs, so to me it does not sound like it is illegal to use outside of a parallel region.
Perhaps we can ask someone who knows more about OpenMP, such as @tomdeakin?
I agree that omp_get_max_threads should be callable outside a parallel region. I did some more digging with CCE - it strongly objects to OpenMP calls in additional threads:
#include <omp.h>
#include <iostream>
#include <thread>

int main(int argc, char ** argv){
    auto lambda_max = []() -> void{
        std::cout << omp_get_max_threads() << std::endl;
    };

    // Works
    //lambda_max();
    //lambda_max();

    // fails
    std::thread t1(lambda_max);
    t1.join();
    std::thread t2(lambda_max);
    t2.join();

    return 0;
}
Fails with:
$ OMP_NUM_THREADS=1 ./a.out
1
CCE OpenMP fatal error: omp_get_max_threads attempted from non-OpenMP thread
A parallel region is also rejected:
#include <omp.h>
#include <iostream>
#include <thread>

int main(int argc, char ** argv){
    auto lambda_omp = []() -> void{
        #pragma omp parallel
        {
        }
    };

    std::thread t1(lambda_omp);
    t1.join();
    std::thread t2(lambda_omp);
    t2.join();

    return 0;
}
Gives:
CCE OpenMP fatal error: OpenMP parallel attempted from non-OpenMP thread
There is one more relevant bit of the OpenMP spec, which I think is the bit that's causing the problem. The description of the API call says:
"The value returned by omp_get_max_threads is the value of the first element of the nthreads-var ICV of the current task."
This implicitly says that an OpenMP thread must call it because there must be a current task (everything in OpenMP is a task). OpenMP bootstraps this by making an initial thread when the program starts.
So in that std::thread example, the thread calling the API routine doesn't have all the OpenMP state.
Calling the lambda works in your example above because it's the initial thread that's calling the API routine.
What thread in AdaptiveCpp is trying to call the OpenMP API functions?
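To illustrate the distinction between the initial thread and a plain std::thread, here is a minimal sketch that queries omp_get_max_threads once on the initial thread and hands the result to a worker, so the worker never touches the OpenMP API. The helper names query_max_threads_on_initial_thread and run_worker_with_cached_value are hypothetical and not part of AdaptiveCpp or OpenMP; the non-OpenMP fallback value of 1 is also an assumption made only so the sketch builds without OpenMP enabled.

```cpp
#include <cassert>
#include <thread>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: intended to be called from the initial thread only,
// i.e. the thread that OpenMP itself bootstraps at program start.
int query_max_threads_on_initial_thread() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;  // assumption: built without OpenMP, so report a single thread
#endif
}

// Hypothetical worker: receives the cached value instead of calling the
// OpenMP API from a thread the OpenMP runtime did not create.
int run_worker_with_cached_value(int cached_max_threads) {
    assert(cached_max_threads >= 1 && "expected a value queried on the initial thread");
    int seen_by_worker = 0;
    std::thread worker([&seen_by_worker, cached_max_threads] {
        // No OpenMP call here: the worker only reads the value passed in.
        seen_by_worker = cached_max_threads;
    });
    worker.join();
    return seen_by_worker;
}
```

This only avoids the query itself. As the second reproducer shows, CCE also rejects a parallel region started from such a thread, so caching the value does not make the worker able to run OpenMP code.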
Thanks for the insights @tomdeakin :)
The only use of this function I've found is here:
which means it's invoked right before the parallel for which executes the kernel. All of this is executed in a worker thread of the runtime, which is then probably where the problem originates: The worker thread is not an OpenMP thread.
This is very unfortunate, because it basically means that OpenMP is unusable in multi-threaded applications. Or have I misunderstood something?
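One conceivable way around the restriction, sketched here purely as an assumption and not as something AdaptiveCpp does, is to let worker threads hand their OpenMP work back to the initial thread through a queue. The class InitialThreadQueue and the function job_runs_on_calling_thread are hypothetical names; the job body is left as plain C++ so the sketch builds without OpenMP, but in a real use it is where the parallel for would go.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical queue: any thread may submit jobs, but only the thread that
// calls run() (intended to be the initial thread) executes them.
class InitialThreadQueue {
public:
    // Enqueue a job and return a future that becomes ready once it has run.
    std::future<void> submit(std::function<void()> job) {
        assert(job && "job must be callable");
        std::packaged_task<void()> task(std::move(job));
        std::future<void> done = task.get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(task));
        }
        ready_.notify_one();
        return done;
    }

    // Ask run() to return once all queued jobs have been executed.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_one();
    }

    // Execute jobs on the calling thread until stop() has been requested
    // and the queue is empty.
    void run() {
        for (;;) {
            std::packaged_task<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stop requested, nothing left
                task = std::move(jobs_.front());
                jobs_.pop();
            }
            task();  // a real job would contain the OpenMP parallel region
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::packaged_task<void()>> jobs_;
    bool stopping_ = false;
};

// Demo: a worker thread submits a job and the job runs on the caller's thread.
bool job_runs_on_calling_thread() {
    InitialThreadQueue queue;
    const std::thread::id caller = std::this_thread::get_id();
    std::thread::id executed_on;
    std::thread worker([&] {
        queue.submit([&] { executed_on = std::this_thread::get_id(); }).wait();
        queue.stop();
    });
    queue.run();
    worker.join();
    return executed_on == caller;
}
```

The obvious cost is that the initial thread has to be dedicated to draining the queue, which a library generally cannot demand of the application, so this is an illustration of the constraint rather than a proposed fix.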
@tomdeakin Don't suppose there was any news on this?