Comments (9)
Hello. We already have a debug message:
hwloc_debug("MPOL_PREFERRED_MANY not supported, reverting to MPOL_PREFERRED (with a single node)\n");
I guess I can make it a normal warning at least in hwloc-bind. And make it easier to understand.
However, for those using the C API, many of our users don't want underlying libraries to print warnings/errors.
from hwloc.
Here's a wording proposal:
[hwloc/membind] MPOL_PREFERRED_MANY not supported by the kernel.
If *all* given nodes must be used, use strict binding or the interleave policy.
Otherwise the old MPOL_PREFERRED will only use the first given node.
This non-critical error message is not shown by default in the library, but lstopo and now hwloc-bind increase verbosity to show it by default.
The message is only shown once per process, and only when the preferred policy is requested with multiple nodes.
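A minimal sketch of the "once per process, only with multiple nodes, only when verbose" behaviour described above. This is not the actual hwloc code: the function name `warn_preferred_many_once` and its `verbose` parameter are hypothetical, and the flag is a plain static variable, so the sketch is not thread-safe.

```c
#include <stdio.h>

/* Hypothetical helper (not part of hwloc).
 * Prints the proposed message at most once per process, and only when
 * the preferred policy was requested with more than one node and the
 * caller asked for verbose output. Returns 1 if the message was printed. */
int warn_preferred_many_once(unsigned nr_nodes, int verbose)
{
    static int already_warned = 0; /* per-process flag, not thread-safe */

    if (nr_nodes < 2 || !verbose || already_warned)
        return 0;

    already_warned = 1;
    fprintf(stderr,
            "[hwloc/membind] MPOL_PREFERRED_MANY not supported by the kernel.\n"
            "If *all* given nodes must be used, use strict binding or the interleave policy.\n"
            "Otherwise the old MPOL_PREFERRED will only use the first given node.\n");
    return 1;
}
```

In this sketch a tool such as hwloc-bind would pass a non-zero `verbose`, while a library caller would pass 0 unless the user asked for more output.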
from hwloc.
I just pushed the changes. They will be in 2.11 and maybe in a 2.10.1 (but not sure yet if I'll release that one). The plan is to release rc1 next week.
If you want to test it, there's a tarball at https://ci.inria.fr/hwloc/job/basic/job/v2.10/
from hwloc.
100% agree :)
from hwloc.
Quick question while working on the writing of the warning: if these people are coming back to you with performance degradation, it means that they are filling multiple entire NUMA nodes with huge allocations? If so, PREFERRED_MANY fills all the given nodes before moving to anything else, while PREFERRED uses anything else as soon as the first node is full. Correct?
from hwloc.
The application is not filling the NUMA node. The performance degradation comes from redirecting all allocations, and hence all transfers, onto a single NUMA node, and hence a single channel, instead of several. This has 3 effects:
- divide available bandwidth by the number of channels that are not used (until numa:0 is filled, if it ever is, depending on memory usage)
- increase latency for compute resources that are far from numa:0, and
- increase contention on that single channel
This can easily be observed with STREAM. I ran a 20GB STREAM on this machine (showing only 1 of the 2 sockets), which easily fits in any single NUMA node:
## hwloc-bind numa:0-3 --membind numa:0-3 --strict
Copy: 237975.1 0.060665 0.060159 0.060756
Scale: 238078.0 0.060447 0.060133 0.060513
Add: 245212.9 0.087834 0.087575 0.087992
Triad: 244882.9 0.088164 0.087693 0.088377
## hwloc-bind numa:0-3 --membind numa:0 --strict
Copy: 58934.9 0.243093 0.242918 0.247378
Scale: 58951.4 0.243054 0.242850 0.247871
Add: 60726.9 0.353891 0.353624 0.359667
Triad: 60697.1 0.354062 0.353798 0.355221
## hwloc-bind numa:0-3 --membind numa:0-3
## Preferred many is not supported; this is equivalent to hwloc-bind numa:0-3 --membind numa:0 as 1-3 are silently ignored
Copy: 58937.5 0.243038 0.242907 0.244884
Scale: 58956.4 0.242922 0.242829 0.243485
Add: 60725.3 0.353926 0.353634 0.360854
Triad: 60690.7 0.354357 0.353835 0.361484
Binding to 4 NUMA nodes gives around 240GB/s, whereas binding to numa:0 only gives around 60GB/s (a little less than 1/4, as expected given contention and the increased distance from NUMA nodes 1-3). Since 20GB fits in one NUMA node, using preferred many leads to the same result as numa:0 --strict.
Stream is pathological. Depending on the application (and its tendency to use memory) the impact can be different.
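The "a little less than 1/4" figure can be checked directly from the results above. The sketch below assumes the first numeric column is STREAM's usual best rate in MB/s, and averages the four kernels; the function and array names are illustrative only.

```c
#include <stddef.h>

/* Arithmetic mean of n bandwidth samples (MB/s); 0.0 for an empty set. */
double mean_rate(const double *rates, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += rates[i];
    return n ? sum / (double)n : 0.0;
}

/* First-column rates copied from the runs above: Copy, Scale, Add, Triad. */
const double four_nodes[4] = { 237975.1, 238078.0, 245212.9, 244882.9 };
const double one_node[4]   = { 58934.9, 58951.4, 60726.9, 60697.1 };

/* Fraction of the 4-node bandwidth that remains when all
 * allocations land on numa:0. */
double single_node_fraction(void)
{
    return mean_rate(one_node, 4) / mean_rate(four_nodes, 4);
}
```

With these numbers the fraction comes out just under 0.25, which matches the roughly 60GB/s versus 240GB/s reading.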
from hwloc.
Hmmm, so preferred_many does some sort of interleaving? I thought it would fill the first node only, then the second one, then 3rd, etc.
from hwloc.
I am on a pre-5.15 kernel, so preferred many is silently ignored in the last example.
A typical example of why people are coming back to me with perf degradation, and where I want the warning message to pop :)
from hwloc.
I am posting 2.11rc1 right now with this fix.
from hwloc.