tud-zih-energy / lo2s

Linux OTF2 Sampling - A Lightweight Node-Level Performance Monitoring Tool

Home Page: https://tu-dresden.de/zih/forschung/projekte/lo2s?set_language=en

License: GNU General Public License v3.0

CMake 7.64% C++ 92.28% Makefile 0.08%
linux trace otf2 sampling linux-perf-bindings cpu-profiling profiling kernel monitoring-tool

lo2s's Introduction

lo2s is a lightweight node-level performance monitoring tool used to analyze applications, the operating system and hardware.

Lightweight Node-Level Performance Monitoring

lo2s creates parallel OTF2 traces with a focus on both application and system view. The traces can contain any of the following information:

  • From running threads
    • Calling context samples based on instruction overflows
    • The calling context samples are annotated with the disassembled assembler instruction string
    • The framepointer-based call-path for each calling context sample
    • Per-thread performance counter readings
    • Which thread was scheduled on which CPU at what time
  • From the system
    • Metrics from tracepoints (e.g. the selected C-state or P-state)
    • The node-level system tree (cpus (HW-threads), cores, packages)
    • CPU power measurements (x86_energy)
    • Microarchitecture specific metrics (x86_adapt, per package or per core)
    • Arbitrary metrics through plugins (Score-P compatible)
    • Syscall activity

In general lo2s operates either in process monitoring or system monitoring mode.

In process monitoring mode, all information is grouped by thread of the monitored process group: it shows you on which CPU each monitored thread is running. lo2s either acts as a prefix command that runs the process (and also tracks its children), or it attaches to a running process.

In the system monitoring mode, information is grouped by logical CPU - it shows you which thread was running on a given CPU. Metrics are also shown per CPU.

In both modes, system-level metrics (e.g. tracepoints) are always grouped by their respective system hardware component.

Build Requirements

  • Linux (>= 4.3) [1]
  • OTF2 (>= 3.0)
  • libbfd
  • libiberty
  • CMake (>= 3.11)
  • A C++ Compiler with C++17 support and the std::filesystem library (GCC > 7, Clang > 5)

[1] Older kernels may work, as the required features are often backported. Otherwise, use lo2s 1.7.0, the newest lo2s version that supports kernels as old as 2.6.32.

Optional Build Dependencies

  • x86_adapt for microarchitecture-specific metrics
  • x86_energy for CPU power metrics
  • libradare (>= 5.8.0) for disassembled instruction strings
  • libsensors for sensor readings
  • libaudit to resolve syscall names; without it, only syscall numbers can be used in syscall tracing
  • pod2man to generate the man pages (typically distributed as part of perl)
  • gzip to compress the man pages

Runtime Requirements

  • kernel.perf_event_paranoid should be less than or equal to 1 for process monitoring mode and less than or equal to 0 in system monitoring mode. A value of -1 will give the most features for non-root performance recording, such as tracepoints and block I/O, at the cost of some security. Modify as follows:

    sudo sysctl kernel.perf_event_paranoid=1

  • Tracepoints, block I/O and syscalls require access to debugfs. Grant permissions at your own discretion.

    sudo mount -t debugfs none /sys/kernel/debug
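
The thresholds above can be checked before a recording starts. The following is a minimal sketch, assuming the usual procfs location /proc/sys/kernel/perf_event_paranoid. The names MonitoringMode, paranoid_allows and read_paranoid_level are illustrative and not part of lo2s. The check ignores privileged users, who are not restricted by this setting.

```cpp
#include <fstream>
#include <optional>
#include <string>

// Illustrative names; not taken from the lo2s code base.
enum class MonitoringMode
{
    Process,
    System
};

// Thresholds as stated above: <= 1 for process monitoring, <= 0 for system monitoring.
inline bool paranoid_allows(int level, MonitoringMode mode)
{
    return mode == MonitoringMode::Process ? level <= 1 : level <= 0;
}

// Reads the current setting; returns std::nullopt if the file cannot be read or parsed.
inline std::optional<int>
read_paranoid_level(const std::string& path = "/proc/sys/kernel/perf_event_paranoid")
{
    std::ifstream file(path);
    int level = 0;
    if (file >> level)
    {
        return level;
    }
    return std::nullopt;
}
```

A caller could combine both functions and print a hint about sysctl when the check fails.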

Installation

  • It is recommended to create an empty build directory anywhere.
  • cmake /path/to/lo2s
  • Configure cmake as usual, e.g. with ccmake .
  • make
  • make install

Usage

To monitor a given application in process monitoring mode, execute

  • lo2s -- ./a.out --app-args

To monitor all activity on a system run

  • lo2s -a (stop the recording with ctrl+c)

For full documentation of the options, see the man page.

Usage with MPI

You can record simple traces from MPI programs, but lo2s does not record MPI communication. To create fully-featured MPI-aware traces, use Score-P.

  • lo2s mpirun ./a.out Creates one trace of mpirun, useful if mpirun is used locally on one node.
  • mpirun lo2s ./a.out Creates a separate trace for each process.

See man lo2s or lo2s --help for a full listing of options and usage.

Quirks

The perf_event_open kernel infrastructure has changed significantly over time. It is therefore hard even to keep track of which kernel version introduced which feature. Combine that with the many backports of particular features by different distributors, and you end up with a mess of options.

In the effort to keep compatible with older kernels and some architectures that lack hardware breakpoint support, several quirks have been added to lo2s:

  1. The initial time synchronization between lo2s and the kernel-space perf is done with a hardware breakpoint. If your kernel or processor architecture doesn't support that, you can use a fallback using the CMake option USE_HW_BREAKPOINT_COMPAT.
  2. If you get the error message event 'ref-cycles' is not available as a metric leader!, you can fall back to the bus-cycles metric as leader using the lo2s command-line argument --metric-leader bus-cycles.

Working with traces

Traces can be visualized with Vampir. You can also use OTF2 or any of its tools. Native interfaces are available for C and Python.

Acknowledgements

This work is supported in part by the German Research Foundation (DFG) within the CRC 912 - HAEC and by the German National High Performance Computing (NHR@TUD).

Primary Reference

A description and use cases can be found in the following paper. Please cite this if you use lo2s for scientific work.

Thomas Ilsche, Robert Schöne, Mario Bielert, Andreas Gocht and Daniel Hackenberg. lo2s – Multi-Core System and Application Performance Analysis for Linux 📕 In: Workshop on Monitoring and Analysis for High Performance Computing Systems Plus Applications (HPCMASPA). 2017. DOI: 10.1109/CLUSTER.2017.116

Additional References

Thomas Ilsche, Marcus Hähnel, Robert Schöne, Mario Bielert and Daniel Hackenberg: Powernightmares: The Challenge of Efficiently Using Sleep States on Multi-Core Systems 📕 In: 5th Workshop on Runtime and Operating Systems for the Many-core Era (ROME). 2017, DOI: 10.1007/978-3-319-75178-8_50

Thomas Ilsche, Robert Schöne, Philipp Joram, Mario Bielert and Andreas Gocht: System Monitoring with lo2s: Power and Runtime Impact of C-State Transitions 📕 In: 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), DOI: 10.1109/IPDPSW.2018.00114

Thomas Ilsche, Mario Bielert, Christian von Elm: Bridging the Gap between Application Performance Analysis and System Monitoring 📕 In: 2022 IEEE International Conference on Cluster Computing (CLUSTER), DOI: 10.1109/CLUSTER51413.2022.00080

Name

The name lo2s is an acronym for Linux OTF2 Sampling.

lo2s's People

Contributors

bertwesarg, bmario, cvonelm, phijor, rschoene, s9105947, tilsche

lo2s's Issues

Sort and group options in usage message

This is possible when using more than one po::options_description.

The usage page should look more like this:

Usage:
  ./lo2s [options] ./a.out
  ./lo2s [options] -- ./a.out --option-to-a-out
  ./lo2s [options] --pid $(pidof some-process)

Allowed options:
  --help                                produce help message
  --version                             print version information
  -q [ --quiet ]                        suppress output
  -v [ --verbose ]                      verbose output (specify multiple times 
                                        to get increasingly more verbose 
                                        output)
  -o [ --output-trace ] arg             output trace directory
  --list-clockids                       list all available clockids
  --list-events                         list all available events
  -m [ --mmap-pages ] arg (=16)         number of pages to be used by each 
                                        internal buffer
  -k [ --clockid ] arg (=monotonic-raw) clock used for perf timestamps (see 
                                        --list-clockids for supported 
                                        arguments)

System-wide monitoring:
  -a [ --all-cpus ]                     System-wide monitoring of all CPUs.

Sampling options:
  --command arg
  -c [ --count ] arg (=11010113)        sampling period (# of events specified 
                                        by -e)
  -e [ --event ] arg (=instructions)    interrupt source event for sampling
  -g [ --call-graph ]                   call-graph recording
  -n [ --no-ip ]                        do not record instruction pointers [NOT
                                        CURRENTLY SUPPORTED]
  -p [ --pid ] arg (=-1)                attach to specific pid
  -i [ --readout-interval ] arg (=100)  time interval between metric and 
                                        sampling buffer readouts in 
                                        milliseconds
  --disassemble                         enable augmentation of samples with 
                                        instructions (default if supported)
  --no-disassemble                      disable augmentation of samples with 
                                        instructions
  --kernel                              include events happening in kernel 
                                        space (default)
  --no-kernel                           exclude events happening in kernel 
                                        space

Kernel trace point options:
  -t [ --tracepoint ] arg               enable global recording of a raw 
                                        tracepoint event (usually requires 
                                        root)

Perf metric options:
  -E [ --metric-event ] arg             the name of a perf event to measure
  --metric-leader arg (=ref-cycles)     name of leading perf event
  --metric-count arg                    # of events to elapse by metric leader 
                                        before reading metric buffer
  --metric-frequency arg                metric buffer reads per second

x86_adapt options:
  -x [ --x86-adapt-cpu-knob ] arg       add x86_adapt knobs as recordings. 
                                        Append #accumulated_last for semantics.

Add uname information to otf2

When tracing lots of different kernel versions it would be nice to have some uname information available in the trace.

I suggest to include several uname outputs, most importantly uname -r and uname -v, others won't hurt.

I suggest to use archive_.set_property("LO2S::...") as this is easily accessible in Vampir. It is a bit of "creative use", technically it should probably be some global definition like a system tree node property. But that would be much harder to show in Vampir.
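
A minimal sketch of collecting the uname fields, assuming POSIX uname(2). The key names under "LO2S::UNAME::" are an assumption; the issue only fixes the "LO2S::" prefix. Each pair could then be handed to archive_.set_property as suggested above.

```cpp
#include <sys/utsname.h>

#include <map>
#include <stdexcept>
#include <string>

// Collects the uname(2) fields under illustrative "LO2S::UNAME::*" keys.
inline std::map<std::string, std::string> collect_uname_properties()
{
    struct utsname info;
    if (uname(&info) != 0)
    {
        throw std::runtime_error("uname() failed");
    }
    return {
        { "LO2S::UNAME::SYSNAME", info.sysname },
        { "LO2S::UNAME::NODENAME", info.nodename },
        { "LO2S::UNAME::RELEASE", info.release }, // same as uname -r
        { "LO2S::UNAME::VERSION", info.version }, // same as uname -v
        { "LO2S::UNAME::MACHINE", info.machine },
    };
}
```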

Configurable perf metrics

  • Select perf metrics through command line
  • Add program_option, which prints all available metrics
  • Sample perf metrics also for -a
  • Configurable interrupt source
  • Use PERF_FORMAT_GROUP and reduce number of reads

Split location container in trace

Split the container by uniquely identifiable locations. These can be accessed repeatedly with the identifier:

  • thread_locations_
  • cpu_locations_
  • cpu_metric_locations_

Fuzzy named locations, in contrast, give a new location on every access:

  • named_metric_locations_
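
The difference between the two kinds of containers can be sketched as follows. Location and LocationRegistry are hypothetical stand-ins for illustration and do not reflect the actual lo2s trace classes or OTF2 handles.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Stand-in for a trace location handle (hypothetical type).
struct Location
{
    std::uint64_t id;
    std::string name;
};

class LocationRegistry
{
public:
    // Uniquely identifiable: the same thread id always yields the same location.
    Location thread_location(int tid)
    {
        auto it = thread_locations_.find(tid);
        if (it == thread_locations_.end())
        {
            Location fresh{ next_id_++, "thread " + std::to_string(tid) };
            it = thread_locations_.emplace(tid, fresh).first;
        }
        return it->second;
    }

    // Fuzzy named: every access creates a new location, even for a repeated name.
    Location named_metric_location(const std::string& name)
    {
        named_metric_locations_.push_back(Location{ next_id_++, name });
        return named_metric_locations_.back();
    }

private:
    std::map<int, Location> thread_locations_;
    std::vector<Location> named_metric_locations_;
    std::uint64_t next_id_ = 0;
};
```

cpu_locations_ and cpu_metric_locations_ would follow the same lookup-or-create pattern as thread_location, keyed by CPU id.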

Reduce code duplication in trace::metric_writer

Add lo2s version

  • cmake gets git commit hash or tag
  • add output on help page // --version
  • add info in creator otf2 archive metadata

Unexpected exit in -a mode if /sys/kernel/debug is read-only

Run lo2s:

$ ./lo2s -a -- true
[1521149802129234474][pid: 25222][tid: 140210198903680][ERROR]: Aborting: basic_ios::clear: iostream error

I traced this down to the constructor of lo2s::perf::tracepoint::EventFormat throwing here, potentially here too.

What would be the correct behavior here? Rethrow a meaningful exception instead of std::ios_base::failure? Log and exit? Check lo2s::perf::tracepoint::get_sched_switch_event() before starting the trace?
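
The first option, rethrowing a meaningful exception, could look roughly like this. It is a sketch only: read_first_line is a hypothetical helper, not the EventFormat constructor, and it treats an empty file like an unreadable one.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Reads the first line of a file. A failure is reported together with the
// path instead of the generic std::ios_base::failure text.
inline std::string read_first_line(const std::string& path)
{
    std::ifstream file;
    file.exceptions(std::ifstream::failbit | std::ifstream::badbit);
    try
    {
        file.open(path);
        std::string line;
        std::getline(file, line);
        return line;
    }
    catch (const std::ios_base::failure& e)
    {
        throw std::runtime_error("cannot read '" + path + "': " + e.what());
    }
}
```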

Refactor Monitors

Currently there are sometimes multiple monitors running on the same core. In addition, certain monitors that should be FdMonitors are currently IntervalMonitors.

Open Questions:

  1. Flexibility:
    How configurable should this be at run time, e.g.
    • Configure time intervals for reading x86adapt and x86energy separately?
    • Configure sample reader to be read at interval and/or watermark?
  2. Consolidation:
    Should multiple buffers (e.g. tracepoints and metrics) both be read even if only one triggers the watermark?
    • Pro: Overall fewer switches to userspace
    • Contra: Possibly more overhead from reading almost-empty buffers
  3. Dynamic/static:
    • Static (current): Each specific Monitor class knows its specific reader types. For optional readers: if (foo_reader) { foo_reader->read(); } Indexing for fds (non-consolidated) would be tricky.
    • Dynamic: All Readers need a common base class with a virtual read(). The Monitor would then contain a vector of them. Somewhat simpler, more generic code and easier fd-indexing. Performance cost of virtual dispatch, once per wake-up and Reader. Unholy mix of virtual and CRTP 😱.
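
The dynamic variant, combined with consolidated reads, can be sketched as below. The class names mirror the wording of this issue but the code is illustrative only; CountingReader is a hypothetical reader that just counts its reads.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Common base class: one virtual call per reader and wake-up.
class Reader
{
public:
    virtual ~Reader() = default;
    virtual void read() = 0;
};

// Hypothetical reader that only records how often it was read.
class CountingReader : public Reader
{
public:
    void read() override
    {
        ++reads_;
    }

    std::size_t reads() const
    {
        return reads_;
    }

private:
    std::size_t reads_ = 0;
};

class Monitor
{
public:
    void add_reader(std::shared_ptr<Reader> reader)
    {
        assert(reader != nullptr);
        readers_.push_back(std::move(reader));
    }

    // Consolidated wake-up: read every attached buffer, whichever one triggered.
    void on_wakeup()
    {
        for (auto& reader : readers_)
        {
            reader->read();
        }
    }

private:
    std::vector<std::shared_ptr<Reader>> readers_;
};
```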

In the end, we want as few threads and wake-ups as possible, but as many as necessary to exploit the parallelism. There is also the lingering TODO of splitting the MetricMonitor.

Score-P independent environment variables for metric plugins

Currently lo2s reuses the SCOREP_METRIC_PLUGINS variable to add metric plugins. It would be good to have a Score-P independent environment variable for adding metric plugins, e.g. a LO2S_METRIC_PLUGINS variable.

This would avoid confusion with Score-P metric plugins, as lo2s has a slightly different interface (synchronous plugins are not handled, as far as I know), and it would avoid unintended side effects if Score-P is traced.

Best,

Andreas
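
One possible lookup order, sketched under the assumption that the existing SCOREP_METRIC_PLUGINS variable stays as a fallback for backwards compatibility. The issue does not decide this, and metric_plugin_list is a hypothetical helper name.

```cpp
#include <cstdlib>
#include <string>

// Prefers the proposed LO2S_METRIC_PLUGINS and falls back to
// SCOREP_METRIC_PLUGINS. Returns an empty string if neither is set.
inline std::string metric_plugin_list()
{
    static const char* const names[] = { "LO2S_METRIC_PLUGINS", "SCOREP_METRIC_PLUGINS" };
    for (const char* name : names)
    {
        const char* value = std::getenv(name);
        if (value != nullptr && *value != '\0')
        {
            return value;
        }
    }
    return "";
}
```

Dropping the fallback entirely would avoid the side effects mentioned above when Score-P itself is traced, at the cost of breaking existing setups.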

Overhead introspection

  • Write event records for perf mmap buffer flushes
  • After execution, print number of wakeups if not quiet.

Sampling in global view

The holy grail

  • Record per-cpu call stack samples into the global view (-a)
  • Create a thread-view trace from the global recordings optional, both or through Vampir

Don't record metric leader when there is no valid metric event

Although nothing is not a valid metric event, lo2s will still set up a metric recording with only the group leader:

$ lo2s -E nothing -vv ls
[1518784342110364342][pid: 99654][tid: 22907724121920][ INFO]: caching event 'nothing'.
[1518784342110418762][pid: 99654][tid: 22907724121920][ INFO]: failed to cache event (reason: missing '/' in event description)
[1518784342110437550][pid: 99654][tid: 22907724121920][ WARN]: 'nothing' does not name a known event, ignoring! (reason: missing '/' in event description)
[1518784342119710536][pid: 99654][tid: 22907724121920][DEBUG]: perf::counter::Reader: sample_freq: 10Hz
[1518784342119717759][pid: 99654][tid: 22907724121920][DEBUG]: perf::counter::Reader: leader event: 'ref-cycles'

Disable --clockid / --list-clockids options if built without USE_PERF_CLOCKID

If USE_PERF_CLOCKID==OFF, any execution of lo2s (regardless of arguments) will result in a warning:

[1513092780631385996][pid: 19291][tid:139737018783488][ WARN]: This installation was built without support for setting a perf reference clock.
[1513092780631422019][pid: 19291][tid:139737018783488][ WARN]: Any parameter to -k/--clockid will only affect the local reference clock.

The warning should not appear, the respective options should not be available.

add sampling source as counter

When specifying a different sampling event (e.g., via -e cpu-cycles), the event is not recorded as a counter. Please add it.

Default metric leader selection

On systems where ref-cycles is not available, one has to change the metric leader manually.
We should have a list of suitable metric leaders that we try before nagging the user about it.

Possible list in this sequence:

  • ref-cycles
  • cpu-cycles

@rschoene anything else?
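
Trying the candidates in sequence can be sketched as follows. The is_available callback stands in for whatever probe the tool would use (for example, attempting to open the event); both it and select_metric_leader are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Returns the first candidate accepted by the availability check,
// or std::nullopt if none of them is usable.
inline std::optional<std::string>
select_metric_leader(const std::vector<std::string>& candidates,
                     const std::function<bool(const std::string&)>& is_available)
{
    assert(is_available && "availability check must be callable");
    for (const auto& name : candidates)
    {
        if (is_available(name))
        {
            return name;
        }
    }
    return std::nullopt;
}
```

Only when the function returns std::nullopt would the user need to be asked to pass --metric-leader explicitly.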

Improve error message when x86_energy isn't installed

Starting with d41121e, trying to generate the build files with cmake yields the following message:

CMake Warning at CMakeLists.txt:101 (find_package):
  By not providing "Findx86_energy.cmake" in CMAKE_MODULE_PATH this project
  has asked CMake to find a package configuration file provided by
  "x86_energy", but CMake did not find one.

  Could not find a package configuration file provided by "x86_energy"
  (requested version 2.0) with any of the following names:

    x86_energyConfig.cmake
    x86_energy-config.cmake

  Add the installation prefix of "x86_energy" to CMAKE_PREFIX_PATH or set
  "x86_energy_DIR" to a directory containing one of the above files.  If
  "x86_energy" provides a separate development package or SDK, be sure it has
  been installed.

There is no file cmake/Findx86_energy.cmake, did someone forget to add that to a commit or am I missing something?

Enable all pre-defined events as listed at perf list

Please make the events defined in enum perf_hw_id, enum perf_hw_cache_id, and enum perf_sw_ids available under the naming scheme used in perf list,

e.g.,
lo2s -e minor-faults ...
should map to PERF_COUNT_SW_PAGE_FAULTS_MIN
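
A name table along these lines would cover the requested mapping. This is a partial sketch: it assumes the Linux header linux/perf_event.h is available, lists only a handful of events, and uses the hypothetical names EventId and lookup_predefined_event. Cache events are omitted because their config value combines several enums.

```cpp
#include <linux/perf_event.h>

#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Values intended for perf_event_attr::type and perf_event_attr::config.
struct EventId
{
    std::uint32_t type;
    std::uint64_t config;
};

// Maps a perf-list style name to its predefined event, if known.
inline std::optional<EventId> lookup_predefined_event(const std::string& name)
{
    static const std::map<std::string, EventId> events = {
        { "cpu-cycles", { PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES } },
        { "instructions", { PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS } },
        { "bus-cycles", { PERF_TYPE_HARDWARE, PERF_COUNT_HW_BUS_CYCLES } },
        { "ref-cycles", { PERF_TYPE_HARDWARE, PERF_COUNT_HW_REF_CPU_CYCLES } },
        { "page-faults", { PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS } },
        { "minor-faults", { PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS_MIN } },
        { "major-faults", { PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS_MAJ } },
        { "context-switches", { PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CONTEXT_SWITCHES } },
    };
    const auto it = events.find(name);
    if (it == events.end())
    {
        return std::nullopt;
    }
    return it->second;
}
```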

lo2s fails as event 'ref-cycles' is not available as a metric leader!

I tried to run lo2s on taurus. lo2s fails with the following error:

[1518523527725817000][pid: 14438][tid: 46969806325664][ INFO]: failed to cache event (reason: missing '/' in event description)
[1518523527725831034][pid: 14438][tid: 46969806325664][ERROR]: event 'ref-cycles' is not available as a metric leader!

Setting the flag -e bus-cycles also gives the above error.

Add support for raw metrics

In perf, I can pass raw metrics via the -e flag. This should also be supported by lo2s for -e and -E

Raw metrics are encoded in the following way rNNNN. More information is given in man perf-list
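
Parsing the rNNNN form can be sketched as below, treating NNNN as a hexadecimal number as described in man perf-list. parse_raw_event is a hypothetical helper; the resulting value would presumably be used as the config of an event of type PERF_TYPE_RAW.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Parses "r" followed by one or more hexadecimal digits.
// Returns std::nullopt for anything else, including values beyond 64 bits.
inline std::optional<std::uint64_t> parse_raw_event(const std::string& name)
{
    if (name.size() < 2 || name[0] != 'r')
    {
        return std::nullopt;
    }
    std::uint64_t config = 0;
    for (std::size_t i = 1; i < name.size(); ++i)
    {
        const char c = name[i];
        std::uint64_t digit = 0;
        if (c >= '0' && c <= '9')
        {
            digit = static_cast<std::uint64_t>(c - '0');
        }
        else if (c >= 'a' && c <= 'f')
        {
            digit = static_cast<std::uint64_t>(c - 'a' + 10);
        }
        else if (c >= 'A' && c <= 'F')
        {
            digit = static_cast<std::uint64_t>(c - 'A' + 10);
        }
        else
        {
            return std::nullopt;
        }
        if ((config >> 60) != 0)
        {
            return std::nullopt; // another digit would overflow 64 bits
        }
        config = (config << 4) | digit;
    }
    return config;
}
```

Named events such as ref-cycles are rejected because they contain non-hexadecimal characters, so the parser can run before the regular name lookup.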

Summary size for large traces is wrong

The trace contains a lot of metrics:

$ lo2s -m 1024 -a -t sched/sched_switch -t power/cpu_idle
...
[ lo2s: system mode, monitored processes: 214, 111.666s CPU, 2045.56s total ]
[ lo2s: 69 wakeups, wrote 16777216.00 TiB lo2s_trace_2018-03-16T19-50-50 ]

$ du -sh lo2s_trace_2018-03-16T19-50-50
2.7G lo2s_trace_2018-03-16T19-50-50
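
For reference, 16777216 TiB is exactly 2^64 bytes (2^24 · 2^40), so the reported size looks like an unsigned 64-bit wrap-around rather than a real measurement. The sketch below is a hypothetical formatter, not the lo2s summary code. It only shows that feeding the maximum 64-bit value (for example an error value of -1 converted to unsigned) into a plain binary-prefix formatter yields the same string. Whether that is the actual cause is an assumption.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a byte count with binary prefixes, capped at TiB.
inline std::string format_size(std::uint64_t bytes)
{
    static const std::array<const char*, 5> units = { "B", "KiB", "MiB", "GiB", "TiB" };
    double value = static_cast<double>(bytes);
    std::size_t unit = 0;
    while (value >= 1024.0 && unit + 1 < units.size())
    {
        value /= 1024.0;
        ++unit;
    }
    char buffer[64];
    std::snprintf(buffer, sizeof(buffer), "%.2f %s", value, units[unit]);
    return buffer;
}
```

With this formatter, format_size(static_cast<std::uint64_t>(-1)) returns "16777216.00 TiB".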

Remove default metric channels

I think we shouldn't use any default metrics for various reasons:

  • They are only there because of legacy reasons and laziness
  • There is not this one apparent set of default metrics
  • They introduce overhead by default without the possibility to disable them
  • We haven't documented, which metrics are added by default
  • Passing one additional metric suddenly removes all default metrics, which is a surprising behavior

Support older Linux versions

Currently needs 4.1 (data_offset / data_size). Need to figure out if there are other compatibility issues (time_enabled / time_active bug).

Need this for taurus.

Multi-node and instrumentation/event support

  • create a PMPI library that writes MPI events to an mmap'ed buffer. These events should include, for example: timestamp, pid, rank, MPI function, communication partners
  • read that buffer via lo2s, sort the events with the given sampling events and write them to OTF2
  • do the same for OMPT
  • check whether to use caliper for MPI events and buffer.

Fails with tracing bash

When running "lo2s bash", it crashes
./lo2s bash
[1508939755939220588][pid: 28270][tid:139947869783872][ERROR]: mmap failed. You can decrease the buffer size or try to increase /proc/sys/kernel/perf_event_mlock_kb
[1508939755939303320][pid: 28270][tid:139947869783872][ERROR]: Destructing IntervalMonitor before being stopped. This should not happen, but it's fine anyway.
lo2s: /home/rschoene/tmp/thrift/lo2s/src/monitor/interval_monitor.cpp:44: virtual void lo2s::monitor::IntervalMonitor::stop(): Assertion `thread_.joinable()' failed.
Aborted (core dumped)

When reducing the mmap size to 32, it does not crash immediately, but only when you start another bash within the traced bash:
./lo2s -m 32 bash
$ bash
[1508939940847189757][pid: 28442][tid:140234546366272][ERROR]: mmap failed. You can decrease the buffer size or try to increase /proc/sys/kernel/perf_event_mlock_kb
[1508939940847269042][pid: 28442][tid:140234546366272][ERROR]: Destructing IntervalMonitor before being stopped. This should not happen, but it's fine anyway.
lo2s: /home/rschoene/tmp/thrift/lo2s/src/monitor/interval_monitor.cpp:44: virtual void lo2s::monitor::IntervalMonitor::stop(): Assertion `thread_.joinable()' failed.
Aborted (core dumped)

When reducing the mmap size to 16, it does not crash immediately, but only when you start a bash in a bash in a bash within the traced bash:
./lo2s -m 16 bash
$ bash
$ bash
$ bash
[1508939999239586817][pid: 28592][tid:140343153018688][ERROR]: mmap failed. You can decrease the buffer size or try to increase /proc/sys/kernel/perf_event_mlock_kb
[1508939999239677983][pid: 28592][tid:140343153018688][ERROR]: Destructing IntervalMonitor before being stopped. This should not happen, but it's fine anyway.
lo2s: /home/rschoene/tmp/thrift/lo2s/src/monitor/interval_monitor.cpp:44: virtual void lo2s::monitor::IntervalMonitor::stop(): Assertion `thread_.joinable()' failed.
Aborted (core dumped)

Use PERF_RECORD_COMM

PERF_RECORD_COMM provides information about name changes of processes and threads. This should allow better naming information. Requires Linux 3.16, so probably only optional for now.

Measurement summary

Show a short summary after execution. It is suppressed with -q.

Possible things to display

  • Number of spawned threads/processes during execution
  • Name and arguments of executed binary
  • Execution wall time / CPU time
  • Name of generated trace file
  • Size of generated trace file
  • Number of wakeups

Maybe this is already too much, we need to keep it concise.

Segfault when tracing cmake

In https://github.com/tud-zih-energy/lo2s/blob/master/src/mmap.cpp#L163 it can happen that mapping.dso == nullptr. This causes dead kittens and segfaults. 😿

As stuff goes wrong there anyways, we could just throw something. Though it would be better to fix the null dso in the first place.

Happens when tracing cmake .. in an empty build dir.

Value of ip:

(const lo2s::Address &) @0x2ad3c40009f0: {v_ = 47190393701150}

Contents of the map_:

std::map with 9 elements = {[{start = {v_ = 4194304}, end = {v_ = 5398528}}] = {start = {v_ = 4194304}, end = {v_ = 5398528}, pgoff = {v_ = 0}, dso = @0x1cbd628}, [{start = {v_ = 47190391128064}, end = {
      v_ = 47190391283712}}] = {start = {v_ = 47190391128064}, end = {v_ = 47190391283712}, pgoff = {v_ = 0}, dso = @0x2ad3a800c128}, [{start = {v_ = 47190391296000}, end = {v_ = 47190391304192}}] = {start = {
      v_ = 47190391296000}, end = {v_ = 47190391304192}, pgoff = {v_ = 0}, dso = @0x2ad3a8015ee8}, [{start = {v_ = 47190393389056}, end = {v_ = 47190396817408}}] = {start = {v_ = 47190393389056}, end = {
      v_ = 47190396817408}, pgoff = {v_ = 0}, dso = @0x0}, [{start = {v_ = 47190396817408}, end = {v_ = 47190398930944}}] = {start = {v_ = 47190396817408}, end = {v_ = 47190398930944}, pgoff = {v_ = 0}, 
    dso = @0x0}, [{start = {v_ = 47190398930944}, end = {v_ = 47190402904064}}] = {start = {v_ = 47190398930944}, end = {v_ = 47190402904064}, pgoff = {v_ = 0}, dso = @0x2ad3a8033128}, [{start = {
      v_ = 47190402904064}, end = {v_ = 47190405107712}}] = {start = {v_ = 47190402904064}, end = {v_ = 47190405107712}, pgoff = {v_ = 0}, dso = @0x2ad3a8033128}, [{start = {v_ = 47190405107712}, end = {
      v_ = 47190407278592}}] = {start = {v_ = 47190405107712}, end = {v_ = 47190407278592}, pgoff = {v_ = 0}, dso = @0x2ad3c4000c68}, [{start = {v_ = 18446744073699065856}, end = {
      v_ = 18446744073699069952}}] = {start = {v_ = 18446744073699065856}, end = {v_ = 18446744073699069952}, pgoff = {v_ = 0}, dso = @0xa15708}}

Handle cmd argument on lo2s -a

If a command is passed, as in lo2s -a -- ./command, the command is spawned and lo2s records until it has finished, but does not otherwise monitor it.

Use perf list syntax for tracepoint events

Currently, tracepoint events are specified with slashes e.g., exceptions/page_fault_kernel, perf list names them with colons, e.g., exceptions:page_fault_kernel.

I'd vote for the perf list style. Any other opinions?
@tilsche @bmario @AndreasGocht @phijor @cvonelm

(Vote yes/no/0, everyone has one vote, as soon as there's a majority for "yes" the issue will be assigned, as soon as there's a majority for "no", the issue will be closed, 0 is for abstention)
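
Should the vote pass, one way to switch without breaking existing command lines would be to accept both spellings and convert them to a single internal form. The sketch below assumes the slash form stays the internal representation; normalize_tracepoint_name is a hypothetical helper.

```cpp
#include <string>

// Converts the perf-list style "group:event" into "group/event".
// Names that already contain a slash are returned unchanged.
inline std::string normalize_tracepoint_name(std::string name)
{
    const auto colon = name.find(':');
    if (colon != std::string::npos && name.find('/') == std::string::npos)
    {
        name[colon] = '/';
    }
    return name;
}
```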

Error when not passing an executable to trace

Here:
https://github.com/tud-zih-energy/lo2s/blame/c8b867af83150f612b59fcf084ca525fec755caf/src/config.cpp#L157
config.monitor_type is not yet set, which leads to the following output when running lo2s without any argument:
terminate called after throwing an instance of 'std::out_of_range'
what(): vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
terminate called after throwing an instance of 'std::out_of_range'
what(): vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
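
A guard of the following kind, run before the command vector is indexed, would turn the exception into a readable message. The function and parameter names are illustrative and not taken from config.cpp; the pid default of -1 follows the usage text of the --pid option.

```cpp
#include <string>
#include <vector>

// Returns an error message if the invocation names nothing to monitor,
// or an empty string if the invocation is usable.
inline std::string check_invocation(const std::vector<std::string>& command, bool all_cpus,
                                    int pid)
{
    if (command.empty() && !all_cpus && pid == -1)
    {
        return "No command, PID or system-wide mode (-a) specified.";
    }
    return "";
}
```

On a non-empty message the caller could print the usage text and exit with a non-zero status.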

Add some integration test to travis

Possible tests (within TravisCI):

  • Running without any arguments prints usage
  • --help prints help message
  • --version prints version info
  • Short -a works
  • Short process sampling works (e.g. sleep 1)

Note: Travis is a VM, so there is probably no perf possible

Create TraceMerger

We will someday need a tool, which allows us to merge two or more OTF2 traces.

Add process tree

Somehow embed the process tree in the trace, e.g. by making another system-tree with class "process" and hang the LOCATION_GROUPs there.

Improve help message

  • With #82 merged, we should look once again at the categories and sort once more
  • The useful behavior of -- should also be documented.
  • How to add perf probes
void __attribute__((optimize("O0")))
my_marker(int some_variable)
{
}

sudo perf probe -x ./a.out my_marker some_variable
lo2s -t probe_a:my_marker ...

  • document LO2S_OUTPUT_LINK (once merged)
  • document LO2S_OUTPUT_TRACE

Better time synchronization

  • Make use of use_clockid (since 4.1) to set a good clock for perf
  • Add option to list all available clocks
  • Use time_shift, time_mult, time_offset (since forever) // time_zero (since 3.12) to make time conversion
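
The conversion in the last item can be sketched as follows. It follows the formula given in the comments of linux/perf_event.h for the fields of the perf mmap page, which to my understanding is valid only when the kernel sets the corresponding capability bit (cap_user_time_zero). TimeConversion and cycles_to_perf_time are hypothetical names; the field types match the mmap page.

```cpp
#include <cassert>
#include <cstdint>

// Copy of the conversion parameters from the perf mmap page.
struct TimeConversion
{
    std::uint16_t time_shift;
    std::uint32_t time_mult;
    std::uint64_t time_zero;
};

// Converts a hardware cycle count into a perf timestamp. The value is split
// into quotient and remainder so the multiplication does not overflow early.
inline std::uint64_t cycles_to_perf_time(std::uint64_t cyc, const TimeConversion& c)
{
    assert(c.time_shift < 64);
    const std::uint64_t quot = cyc >> c.time_shift;
    const std::uint64_t rem = cyc & ((std::uint64_t{ 1 } << c.time_shift) - 1);
    return c.time_zero + quot * c.time_mult + ((rem * c.time_mult) >> c.time_shift);
}
```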
