Comments (7)

ctk21 commented on July 24, 2024

Right now, providing taskset for parallel benchmarks is manual and error-prone: you have to edit the JSON file for each benchmark to put the correct cpu-list in.

In an ideal world we would have a function get_taskset_command <num_domains> that returns the taskset command by probing the machine to figure out which CPUs should be used (and it could also handle the cross-platform problems where taskset may not be available).

There are a couple of things the user wants control over:

  • should 'hyper-thread' cores be used?
  • which NUMA zones should be used on machines with multiple NUMA zones?
  • in situations with isolated cores, how do we specify the cores?

Given these features, I think the following might work as a proposal:
get_taskset_command <available_cpus> <num_domains>
where

  • <available_cpus> is a comma separated ordered list of available CPUs
  • <num_domains> is the number of CPUs to take from the head of <available_cpus>

For example:
get_taskset_command 0,2,4,6 3
would return
taskset --cpu-list 0,2,4

This would allow us to specify the CPUs and the order to take them in a single place and so make the config file easier to use.
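As a rough illustration, a minimal sketch of what get_taskset_command could look like as a shell script, assuming the interface proposed above (this only builds the taskset prefix from the given list; it does not yet probe the machine or handle platforms without taskset):

```shell
#!/bin/sh
# Sketch of get_taskset_command: print a taskset invocation that pins to the
# first <num_domains> CPUs of the ordered <available_cpus> list.
# Usage: get_taskset_command <available_cpus> <num_domains>
set -eu

get_taskset_command() {
  available_cpus=$1    # e.g. "0,2,4,6"
  num_domains=$2       # e.g. 3
  # Take the first num_domains entries from the comma-separated list.
  cpus=$(printf '%s\n' "$available_cpus" | tr ',' '\n' \
           | head -n "$num_domains" | paste -sd, -)
  echo "taskset --cpu-list $cpus"
}

get_taskset_command 0,2,4,6 3   # prints: taskset --cpu-list 0,2,4
```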

On a non-isolated Linux machine where you want to use hyper-threaded cores, you could run lscpu | egrep 'NUMA node[0-9]' to get <available_cpus> and copy it into the config. Users who want more control can construct the list in other ways (e.g. only isolated cores, no hyperthread pairs, etc.).
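One wrinkle: lscpu reports per-node CPU lists as ranges (e.g. 0-3,8), not as an explicit list. A sketch of how to expand that into the comma-separated form the proposed command expects (expand_cpu_list is a hypothetical helper name; assumes POSIX sh plus seq and paste, which Linux systems with lscpu generally have):

```shell
#!/bin/sh
# lscpu reports per-node CPU lists as ranges, e.g. "NUMA node0 CPU(s): 0-3,8".
# expand_cpu_list (hypothetical helper) turns such a range into the explicit
# comma-separated list that get_taskset_command would take.
expand_cpu_list() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS=- read -r lo hi; do
    seq "$lo" "${hi:-$lo}"    # a bare entry like "8" expands to just "8"
  done | paste -sd, -
}

expand_cpu_list 0-3,8   # prints: 0,1,2,3,8
```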

from sandmark.

kayceesrk commented on July 24, 2024

get_taskset_command seems like a good idea. Where do you think this command would get invoked? If you make it part of orun, then the other wrappers will have to somehow replicate it.

One possibility is to generate the run_config.json file with the correct CPU list for taskset using get_taskset_command.

ctk21 commented on July 24, 2024

My thinking was that this would be a standalone UNIX command that could be composed with others. It could even be a shell script.

You could imagine the user having it in the wrappers section of the run_config.json:

"wrappers": [
    {
      "name": "orun",
      "command": "orun -o %{output} -- get_taskset_command %{ordered_cpu_list} %{num_domains} %{command}"
    }
  ],
  ...
  "benchmarks": [
    {
      "executable": "benchmarks/multicore-numerical/binarytrees5_multicore.exe",
      "name": "binarytrees5_multicore",
      "runs": [
        { "params": "1 23", "num_domains": "1" },
        { "params": "2 23", "num_domains": "2" },
        ...

The changes being:

  • utilize the new get_taskset_command in the wrappers section
  • have num_domains in the runs section of the benchmarks
  • find a way to introduce the ordered_cpu_list variable; not sure how to do this, we could just make people put it as a list into the wrapper. Might be nicer to make it a variable somehow
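For the last point, one sketch of how ordered_cpu_list could live as a top-level key in run_config.json (this assumes the %{...} interpolation used in the wrappers section could be extended to resolve top-level keys, which would need runner support):

```json
{
  "ordered_cpu_list": "0,2,4,6,1,3,5,7",
  "wrappers": [
    {
      "name": "orun",
      "command": "orun -o %{output} -- get_taskset_command %{ordered_cpu_list} %{num_domains} %{command}"
    }
  ]
}
```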

I'd rather have get_taskset_command as a first-class command that can be composed (instead of, say, embedded in orun) because this might make it easier to ensure we support more platforms (e.g. OS X, Windows, etc.) and is also a bit more transparent to the user when doing custom experiments.

kayceesrk commented on July 24, 2024

Ok, that makes sense. For ordered_cpu_list, this should be part of the variant JSON file describing the compiler. The ordered_cpu_list will be the same for all of the benchmarks; we would just need different variant files for different machines.

Ideally, I want to get Sandmark to a point where all the variables are either in the variant JSON file or the run_config JSON file. The various environment variables that we pass to the Makefile can be moved to the variant JSON file.

ctk21 commented on July 24, 2024

ordered_cpu_list is a bit new for us, as it is a parameter describing the machine. I would request not putting it in the compiler variant JSON file; that would be a really unexpected place to find it (at least for me), since the description of how to find the source and build parameters for a compiler has nothing to do with the machine specs for the experiment.

It feels like the run_config.json is the right place to put it as that describes the run plan of a given experiment. The wrappers block of the run_config.json file isn't the worst place to put ordered_cpu_list (at least for now).

I agree with the goal of having a 'single source of truth' for what will be run in an experiment. Having things in unexpected places is a pain.

ctk21 commented on July 24, 2024

I guess the direction I'm trying to go is that, as a user, it would be great if I could send a single config file to somebody and they could recreate the run as well as see what got run. Maybe one day the variant files need to end up in the run_config.json?

kayceesrk commented on July 24, 2024

Ok. That works. Having everything in a single location sounds like a fine idea.
