
Comments (9)

bgoglin commented on August 23, 2024

Hello. We already have a debug message:

hwloc_debug("MPOL_PREFERRED_MANY not supported, reverting to MPOL_PREFERRED (with a single node)\n");

I guess I can make it a normal warning, at least in hwloc-bind, and make it easier to understand.

However, many of our users of the C API don't want underlying libraries to print warnings/errors.
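
For illustration, here is a minimal sketch (not hwloc's actual code) of the fallback that debug message describes, using set_mempolicy(2) from libnuma's numaif.h; the node mask and program structure are hypothetical:

/* Try MPOL_PREFERRED_MANY first; pre-5.15 kernels reject the unknown
 * mode with EINVAL, so revert to MPOL_PREFERRED, which only honors
 * the first node of the mask. Build with: cc sketch.c -lnuma */
#include <errno.h>
#include <stdio.h>
#include <numaif.h>

#ifndef MPOL_PREFERRED_MANY
#define MPOL_PREFERRED_MANY 5  /* added in the Linux 5.15 uapi headers */
#endif

static int prefer_nodes(const unsigned long *nodemask, unsigned long maxnode)
{
    if (set_mempolicy(MPOL_PREFERRED_MANY, nodemask, maxnode) == 0)
        return 0;
    if (errno != EINVAL)
        return -1;
    /* Old kernel: only the first node of the mask will actually be used. */
    fprintf(stderr, "MPOL_PREFERRED_MANY not supported, reverting to "
                    "MPOL_PREFERRED (with a single node)\n");
    return set_mempolicy(MPOL_PREFERRED, nodemask, maxnode) ? -1 : 0;
}

int main(void)
{
    unsigned long mask = 0xful;  /* NUMA nodes 0-3, hypothetical */
    return prefer_nodes(&mask, 8 * sizeof(mask)) ? 1 : 0;
}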


bgoglin commented on August 23, 2024

Here's a wording proposal:

[hwloc/membind] MPOL_PREFERRED_MANY not supported by the kernel.
If *all* given nodes must be used, use strict binding or the interleave policy.
Otherwise the old MPOL_PREFERRED will only use the first given node.

This non-critical error message is not shown by default in the library, but lstopo and now hwloc-bind increase the verbosity so that it is shown by default.
The message is shown only once per process, and only when the preferred policy is used with multiple nodes.
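
To make the three alternatives in this wording concrete, here is a hedged sketch of how they map onto the hwloc C API (assuming hwloc 2.x, linked with -lhwloc; the nodeset 0-3 and the back-to-back calls are purely illustrative, and error checks are omitted):

#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topology;
    hwloc_bitmap_t nodeset = hwloc_bitmap_alloc();

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);
    hwloc_bitmap_set_range(nodeset, 0, 3);  /* numa:0-3, hypothetical */

    /* "strict binding": MPOL_BIND on Linux, all given nodes are really used */
    hwloc_set_membind(topology, nodeset, HWLOC_MEMBIND_BIND,
                      HWLOC_MEMBIND_STRICT | HWLOC_MEMBIND_BYNODESET);

    /* "the interleave policy": pages spread round-robin across the nodes */
    hwloc_set_membind(topology, nodeset, HWLOC_MEMBIND_INTERLEAVE,
                      HWLOC_MEMBIND_BYNODESET);

    /* non-strict bind: MPOL_PREFERRED_MANY on Linux >= 5.15; on older
     * kernels only the first given node is used */
    hwloc_set_membind(topology, nodeset, HWLOC_MEMBIND_BIND,
                      HWLOC_MEMBIND_BYNODESET);

    hwloc_bitmap_free(nodeset);
    hwloc_topology_destroy(topology);
    return 0;
}

The last call is the one that would trigger the new message on pre-5.15 kernels.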


bgoglin commented on August 23, 2024

I just pushed the changes. They will be in 2.11 and maybe in a 2.10.1 (but not sure yet if I'll release that one). The plan is to release rc1 next week.
If you want to test it, there's a tarball at https://ci.inria.fr/hwloc/job/basic/job/v2.10/


antoine-morvan commented on August 23, 2024

100% agree :)


bgoglin commented on August 23, 2024

Quick question while working on the wording of the warning: if these people are coming back to you with performance degradation, does that mean they are filling multiple entire NUMA nodes with huge allocations? If so, PREFERRED_MANY fills all the given nodes before moving on to anything else, while PREFERRED uses anything else as soon as the first node is full. Correct?


antoine-morvan commented on August 23, 2024

The application is not filling the NUMA node. The performance degradation comes from redirecting all allocations, hence all transfers, onto a single NUMA node, and hence a single memory channel, instead of multiple ones. This has 3 effects:

  1. it divides the available bandwidth by the number of channels left unused (until numa:0 is filled, if it ever is, depending on memory usage),
  2. it increases latency for compute resources that are far from numa:0, and
  3. it increases contention on that single channel.

This can easily be observed with STREAM. I ran a 20 GB STREAM benchmark on this machine (showing only 1 out of 2 sockets); the working set easily fits in any of the NUMA nodes:


⚠️ this is on a pre-5.15 kernel. In the STREAM output below, the columns are the best rate (MB/s) followed by the avg/min/max times (seconds).

## hwloc-bind numa:0-3 --membind numa:0-3 --strict
Copy:          237975.1     0.060665     0.060159     0.060756
Scale:         238078.0     0.060447     0.060133     0.060513
Add:           245212.9     0.087834     0.087575     0.087992
Triad:         244882.9     0.088164     0.087693     0.088377

## hwloc-bind numa:0-3 --membind numa:0 --strict
Copy:           58934.9     0.243093     0.242918     0.247378
Scale:          58951.4     0.243054     0.242850     0.247871
Add:            60726.9     0.353891     0.353624     0.359667
Triad:          60697.1     0.354062     0.353798     0.355221

## hwloc-bind numa:0-3 --membind numa:0-3
## MPOL_PREFERRED_MANY is not supported on this kernel; this is equivalent to hwloc-bind numa:0-3 --membind numa:0, as nodes 1-3 are silently ignored
Copy:           58937.5     0.243038     0.242907     0.244884
Scale:          58956.4     0.242922     0.242829     0.243485
Add:            60725.3     0.353926     0.353634     0.360854
Triad:          60690.7     0.354357     0.353835     0.361484

Binding to all 4 NUMA nodes gives around 240 GB/s, whereas binding to numa:0 only gives around 60 GB/s (237975 / 58935 ≈ 4.04, i.e. a little less than 1/4, as expected, due to contention and the increased distance from NUMA nodes 1-3). Since 20 GB fits in a single node, using preferred-many here leads to the same result as numa:0 --strict.

STREAM is a pathological case. Depending on the application (and how heavily it uses memory), the impact can differ.


bgoglin commented on August 23, 2024

Hmmm, so preferred_many does some sort of interleaving? I thought it would fill the first node only, then the second one, then the third, etc.


antoine-morvan commented on August 23, 2024

I am on a pre-5.15 kernel, so preferred-many is silently ignored in the last example.

This is a typical example of why people come back to me with performance degradation, and where I want the warning message to pop up :)
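
For anyone who wants to verify the silent fallback at runtime, a small hypothetical check program can ask the kernel which policy was actually installed, via get_mempolicy(2) (link with -lnuma; assumes at most 64 NUMA nodes):

#include <stdio.h>
#include <numaif.h>

#ifndef MPOL_PREFERRED_MANY
#define MPOL_PREFERRED_MANY 5
#endif

int main(void)
{
    int mode = -1;
    unsigned long nodemask = 0;  /* assumes <= 64 NUMA nodes */

    /* With addr == NULL and flags == 0, this returns the calling
     * thread's default memory policy and its node mask. */
    if (get_mempolicy(&mode, &nodemask, 8 * sizeof(nodemask), NULL, 0))
        return 1;

    printf("mode = %d (%s), nodemask = 0x%lx\n",
           mode,
           mode == MPOL_PREFERRED_MANY ? "preferred-many" :
           mode == MPOL_PREFERRED      ? "preferred, single node" : "other",
           nodemask);
    return 0;
}

Running it under the commands above, e.g. hwloc-bind numa:0-3 --membind numa:0-3 ./checkpolicy (checkpolicy being a hypothetical build of this program), would report the single-node preferred policy on a pre-5.15 kernel.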


bgoglin commented on August 23, 2024

I am posting 2.11rc1 right now with this fix.

