
perfgrind's Introduction

perfgrind

This is 'perfgrind', a set of tools for collecting samples from the Linux performance events subsystem and converting the profiling data to the callgrind format, so that it can be read with KCachegrind.

Because perfgrind uses its own simplified format, containing only the data necessary to create the callgrind profile, the resulting file is usually much smaller. A further reason is that perfgrind explicitly ignores kernel space during profiling.

Note: Perfgrind has a known limitation which is on the TODO list - it currently does not handle separate debug information (neither on disk nor via debuginfod). Compiling with debug info and collecting data from non-stripped binaries will give you useful tracing data; however, especially when calling into system libraries, you may see entries like func_7f2192e087070 in ld-2.31.so and similar.

License

This software is available to everyone under the GPLv2 license.
It uses parts of code derived from:

Usage

Overview

  • collect samples into the perfgrind format using pgcollect
  • convert the collected samples into the callgrind format using pgconvert
  • open the resulting file in KCachegrind

pgcollect - collect samples

Usage: pgcollect filename.pgdata [-F freq] [-s] {-p pid | [--] cmd}

Options to specify output:

  • filename.pgdata name of the output file

Options to adjust profiling:

  • -F freq profile at the given frequency freq
  • -s profile using software events

Options to specify target:

  • -p pid profile running process with PID=pid
  • cmd command to profile, prefix with -- to stop command line parsing
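
Examples (file and program names are placeholders):

  • profile a new process at 1000 Hz using software events
    pgcollect profile.pgdata -F 1000 -s -- ./myprog --some-option
  • attach to an already running process
    pgcollect profile.pgdata -p 1234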

pgconvert - convert collected samples to callgrind format

Usage: pgconvert [-m {flat|callgraph}] [-d {object|symbol|source}] [-i] filename.pgdata [filename.grind]
Note: If no output name is specified, then stdout will be used instead.
Examples:

  • overview showing call stack
    pgconvert -d symbol filename.pgdata overview.grind
  • full data with source annotation and instructions
    pgconvert -i filename.pgdata full.grind

Options to adjust generated callgrind data:

  • -d specify detail level; default is "source"
  • -i dump instructions, only possible with detail level "source"
  • -m specify mode; the default is "callgraph" if the detail level is not "object"

Note: To collect with hardware counters you may have to adjust the kernel parameter perf_event_paranoid as root.
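
For example, as root (the exact value needed depends on the kernel; lower values are less restrictive, -1 lifts the restriction entirely):

  sysctl -w kernel.perf_event_paranoid=1

or write the value directly to /proc/sys/kernel/perf_event_paranoid.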

pginfo - show event count and calculated entries

Usage: pginfo {flat|callgraph} filename.pgdata

  • flat simple calculation; a fast way to show the number of events
  • callgraph full calculation

Building

Dependency elfutils

Either install from source or - preferably - via the package manager, for example by issuing yum install elfutils-devel or apt install libdw-dev.

Building the source

  • optional step: create a site.mak file and set the FLAGS variable with paths to the elfutils headers and libraries (necessary if using a "local" version of elfutils)
    For example:
    FLAGS=-I/usr/local/elfutils/include -L/usr/local/elfutils/lib -O2 -march=native -Wl,-rpath /usr/local/elfutils/lib
  • build it by issuing make
  • optional: run tests with make check
  • optional: install the binaries to enable use by others with make install

perfgrind's People

Contributors

gitmensch, ostash

perfgrind's Issues

bug in pgcollect: `--` not handled correctly; overwriting executables

If pgcollect does not find an output file name, parsing leaks past --, which should always stop command-line parsing.
Example and result:

$> perfgrind/pgcollect -sF 8192 -- myprog myoption
Setting frequency to 8192
Going to profile process with PID 2561483: myoption
Can't exec new process: No such file or directory
Collection stopped.
Waked up 8 times
Sythetic mmap events: 0
Real mmap events: 0
Sample events: 0
Total 0 events written

and myprog is zero bytes afterwards (it was treated as the output file name).
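
A minimal sketch of the expected behaviour (not the actual pgcollect parser): once -- is seen, option parsing must stop, so nothing after it can be picked up as the output file name.

/* Sketch only - illustrates the intended "--" handling, not pgcollect's code. */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
  const char *output = NULL;
  int i;

  for (i = 1; i < argc; ++i) {
    if (strcmp(argv[i], "--") == 0) {
      ++i;                  /* everything from argv[i] on is the command */
      break;
    }
    if (argv[i][0] != '-') {
      if (!output)
        output = argv[i];   /* first non-option argument: the output file */
      continue;
    }
    /* ... -F, -s and -p would be handled here ... */
  }

  if (!output) {
    fprintf(stderr, "missing output file name\n");
    return 1;
  }
  if (i < argc)
    printf("output file: %s, command starts with: %s\n", output, argv[i]);
  else
    printf("output file: %s, no command given\n", output);
  return 0;
}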

FR: GitHub metadata - Releases

To do so, go to https://github.com/ostash/perfgrind/tags, then, starting with the 0.1 version, use the "..." button on the right side -> create release.

Then choose a release title, suggestion "perfgrind 0.1 - initial release", with some release notes, suggestion:

* `pgcollect` records samples for `PERF_COUNT_HW_CPU_CYCLES` events in userspace only, at frequency 1000;
both attaching to an existing process and spawning a new process are supported
* `pgreport` converts collected samples to 'callgrind' format;
only callgraphs are supported; detail is always 'source file/line' (when debug information is available); instruction dumping is not implemented yet.

then the same for the 0.2 tag, suggestion "perfgrind 0.2" with

* new completely rewritten converter `pgconvert`, old `pgreport` is removed
* flat/callgraph profiles are supported
* different levels of details are supported: object, symbol, source
* instruction dumping is supported
* resulting 'callgrind' files are much smaller now, thanks to grouping hits by object/symbol/source file/source line
* `pginfo` utility for getting some stats about the dump file.

and, for now, the last one, "perfgrind 0.3", with

* allow setting an arbitrary frequency in `pgcollect` via the `-F` argument
* handling of hits in PLT added
* properly handle page offset in mmap events

again, for an example of how this would look:

https://github.com/GitMensch/perfgrind/releases and https://github.com/GitMensch/perfgrind (on the right)

For 0.4 you can then try "generate release news" button ;-)

Can't create performance event file descriptor: No such file or directory

Testing on current Debian worked fine, testing on another machine (RHEL 8.5) raised this error.

perfgrind/pgcollect.c

Lines 307 to 314 in 9b1a75f

int fd = perf_event_open(&pe_attr, pid, cpu, -1, 0);
if (fd == -1)
{
  perror("Can't create performance event file descriptor");
  if (state->gogoFD != -1)
    close(state->gogoFD);
  exit(EXIT_FAILURE);
}

As perf record -o perf.data --call-graph dwarf,8192 -z worked fine, I have no idea what to check; do you have any clue why that system call does not work here?
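
For reference, the perf_event_open(2) man page lists ENOENT ("No such file or directory") as the error returned for an unsupported event type/config. A minimal standalone probe like the following sketch (not part of perfgrind; names and structure are illustrative) might help narrow down whether the hardware cycles event is available on that machine at all:

/* Probe whether a perf event type/config can be opened for this process. */
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

static int try_event(__u32 type, __u64 config)
{
  struct perf_event_attr attr;
  memset(&attr, 0, sizeof(attr));
  attr.size = sizeof(attr);
  attr.type = type;
  attr.config = config;
  attr.exclude_kernel = 1;   /* like pgcollect, stay in user space */
  attr.disabled = 1;
  int fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
  if (fd == -1)
    perror(type == PERF_TYPE_HARDWARE ? "hardware event" : "software event");
  else
    close(fd);
  return fd != -1;
}

int main(void)
{
  printf("hw cycles: %s\n", try_event(PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES) ? "ok" : "unavailable");
  printf("sw clock:  %s\n", try_event(PERF_TYPE_SOFTWARE, PERF_COUNT_SW_TASK_CLOCK) ? "ok" : "unavailable");
  return 0;
}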

FR: pgconvert from `perf record` created file

Ideally I'd like to run pgconvert on profiles coming directly from
LD_LIBRARY_PATH=. perf record -o perf.data --call-graph dwarf,8192 -z --aio --sample-cpu ./main

Originally posted by @GitMensch in #1 (comment)

Is this possible, or could it be added? Either directly into pgconvert or as a separate tool like pdatagconvert?

compiler warnings concerning formatting for 64bit integers

gcc -std=gnu99 -Wall -g -D_GNU_SOURCE -o pgcollect pgcollect.c
pgcollect.c: In function 'collectExistingMappings':
pgcollect.c:127:17: warning: format '%lx' expects argument of type 'long unsigned int *', but argument 3 has type '__u64 *' {aka 'long long unsigned int *'} [-Wformat=]
     sscanf(buf, "%"PRIx64"-%"PRIx64" %s %"PRIx64" %*x:%*x %*u %s\n", &event.addr, &event.len, prot, &event.pgoff,
                 ^~~                                                  ~~~~~~~~~~~
In file included from pgcollect.c:4:
/usr/include/inttypes.h:121:34: note: format string is defined here
 # define PRIx64  __PRI64_PREFIX "x"
pgcollect.c:127:17: warning: format '%lx' expects argument of type 'long unsigned int *', but argument 4 has type '__u64 *' {aka 'long long unsigned int *'} [-Wformat=]
     sscanf(buf, "%"PRIx64"-%"PRIx64" %s %"PRIx64" %*x:%*x %*u %s\n", &event.addr, &event.len, prot, &event.pgoff,
                 ^~~                                                               ~~~~~~~~~~
In file included from pgcollect.c:4:
/usr/include/inttypes.h:121:34: note: format string is defined here
 # define PRIx64  __PRI64_PREFIX "x"
pgcollect.c:127:17: warning: format '%lx' expects argument of type 'long unsigned int *', but argument 6 has type '__u64 *' {aka 'long long unsigned int *'} [-Wformat=]
     sscanf(buf, "%"PRIx64"-%"PRIx64" %s %"PRIx64" %*x:%*x %*u %s\n", &event.addr, &event.len, prot, &event.pgoff,
                 ^~~                                                                                 ~~~~~~~~~~~~
In file included from pgcollect.c:4:
/usr/include/inttypes.h:121:34: note: format string is defined here
 # define PRIx64  __PRI64_PREFIX "x"
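
A possible way to silence these warnings (a sketch; mmap_event_stub is a stand-in for pgcollect's real event structure): scan into plain uint64_t temporaries using the SCNx64 scanf macros (PRIx64 is the printf counterpart) and copy the values into the __u64 fields afterwards, so the conversion specifier always matches the argument type.

/* Sketch: parse /proc/<pid>/maps lines with SCNx64 into uint64_t
 * temporaries, then copy them into the event fields. */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

struct mmap_event_stub { uint64_t addr, len, pgoff; };

static int parse_maps_line(const char *buf, struct mmap_event_stub *ev,
                           char prot[5], char path[4096])
{
  uint64_t addr, len, pgoff;
  path[0] = '\0';
  if (sscanf(buf, "%" SCNx64 "-%" SCNx64 " %4s %" SCNx64 " %*x:%*x %*u %4095s",
             &addr, &len, prot, &pgoff, path) < 4)
    return -1;
  ev->addr = addr;   /* assigning a uint64_t to a __u64 field is fine */
  ev->len = len;
  ev->pgoff = pgoff;
  return 0;
}

int main(void)
{
  char buf[4352], prot[5], path[4096];
  struct mmap_event_stub ev;
  FILE *maps = fopen("/proc/self/maps", "r");
  if (!maps)
    return 1;
  while (fgets(buf, sizeof buf, maps))
    if (parse_maps_line(buf, &ev, prot, path) == 0)
      printf("%#" PRIx64 " +%#" PRIx64 " %s %s\n", ev.addr, ev.len, prot, path);
  fclose(maps);
  return 0;
}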

calling "very flat" callgraphs with bad function names

I've pulled current master, then built (no need for site.mak as the dependencies were installed via apt on Debian), then followed the docs further (pgcollect with a manually increased frequency, then running pgconvert) - but ended up with a profile that has all entries named "func_HASH" and a total result of 148%.

Should this still work and produce a callgrind file that contains reasonable function names and percentages?

external addon: script to combine events

This is possibly quite a "special" request: I'd like to combine events that match a prefix/suffix specified on the command line.

By "combine" I mean that the events would be calculated as if they would have been seen in the caller:
Giving the following four stacktraces during the pgcollect run:

main
  func1
    func1_
       intfunc 

main
  func1
     intfunc2

main
  func2
     intfunc

main
  func2
     func2_

And a suggested call of pgconvert -d symbol -c "intfunc*,*_" filename.pgdata > callgrind.out.overview_pgdata
"-c func count as caller, may use an asterisk to match multiple ones"

The "inspected" call stacks would be (after combining everything that starts with "intfunc" or ends with "_"):

main
  func1

main
  func1

main
  func2

main
  func2

When combining, all costs attached to this specific entry would be counted as if they had happened in the caller... maybe that means the combination switch must be handled (and therefore specified) during recording already?
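
A rough sketch of just the matching part (the -c option and its semantics are only proposed above; nothing of this exists in pgconvert): split the argument on commas and test each symbol name with fnmatch(3).

/* Match a symbol name against comma-separated glob patterns such as "intfunc*,*_". */
#include <fnmatch.h>
#include <string.h>
#include <stdio.h>

static int matches_any(const char *symbol, const char *patterns)
{
  char buf[256];
  snprintf(buf, sizeof buf, "%s", patterns);
  for (char *p = strtok(buf, ","); p; p = strtok(NULL, ","))
    if (fnmatch(p, symbol, 0) == 0)
      return 1;
  return 0;
}

int main(void)
{
  const char *pats = "intfunc*,*_";
  const char *names[] = { "main", "func1", "func1_", "intfunc", "intfunc2", "func2", "func2_" };
  for (size_t i = 0; i < sizeof names / sizeof names[0]; ++i)
    printf("%-10s %s\n", names[i], matches_any(names[i], pats) ? "combined into caller" : "kept");
  return 0;
}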

FR pgcollect: add `-a` "all cpus" option - or similar

An option to limit the recording (in perf only "by event" or "by user name/id") would be necessary, too (user name/id would be fine, "by binary" would be extra cool, "by text match in the command line" would be even better).

pgconvert could then use a single "logical" "collection" as the root (if necessary) and then the PIDs as logical sub-nodes - before adding the appropriate stack; pginfo would provide some details on the PIDs collected.

A feature like this is the one part that is missing most compared to "plain perf" or its recording + conversion to callgrind format (which misses details but, most importantly, "eats CPU like an end-boss").

pgcollect: option to use PID and timestamp of started/attached process as part of the output file

Scenario:

there's a "runner script" already that executes the executable after setting up some environment variables; I've added an environment variable in front of it, that allows to "only" export the variable to pgcollect; while this works in general there's again the need to set the variable (in this case possibly the login script) and it is harder than necessary to set the a useful output file name...

Originally posted by @GitMensch in #23 (comment)

To solve this I'd like to have either replacement options in the specified file name or extra flags that say "add the current timestamp and/or PID to the specified output name".

That would mean something like one of those:

pgcollect filename.pgdata -P-T -- command options
# would result in filename.pgdata.PidOfThatProcess.CurrentTimeStamp
pgcollect filename.%T-%P.pgdata -- command options
# would result in filename.CurrentTimeStamp-PidOfThatProcess.pgdata

(and should also work if -p PID is used)
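
A sketch of the proposed expansion (the expand_name helper and the %P/%T placeholders are hypothetical, following the second command line above; nothing like this exists in pgcollect yet):

/* Hypothetical helper: expand %P to the target PID and %T to the current
 * timestamp in the requested output file name. */
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>

static void expand_name(const char *tmpl, pid_t pid, char *out, size_t outsz)
{
  char ts[32], pidstr[16];
  time_t now = time(NULL);
  strftime(ts, sizeof ts, "%Y%m%d-%H%M%S", localtime(&now));
  snprintf(pidstr, sizeof pidstr, "%ld", (long)pid);

  size_t n = 0;
  for (const char *p = tmpl; *p && n + 1 < outsz; ++p) {
    const char *rep = NULL;
    if (p[0] == '%' && p[1] == 'P') rep = pidstr;
    else if (p[0] == '%' && p[1] == 'T') rep = ts;
    if (rep) {
      int w = snprintf(out + n, outsz - n, "%s", rep);
      if (w < 0 || (size_t)w >= outsz - n) break;  /* would not fit: stop */
      n += (size_t)w;
      ++p;                                         /* skip the placeholder letter */
    } else {
      out[n++] = *p;
    }
  }
  out[n] = '\0';
}

int main(void)
{
  char name[256];
  expand_name("filename.%T-%P.pgdata", getpid(), name, sizeof name);
  puts(name);   /* e.g. filename.20240101-120000-1234.pgdata */
  return 0;
}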

FR: create debian package

This may be a "long term" issue, also given that the build itself is relatively easy.

Nonetheless, an "official" Debian package is very likely to reach new users, also because Debian packages are a common "upstream" for Debian-based distributions.

A good starting point is possibly the "Debian Handbook" chapter Learning to Make Packages.

Should PLT hits be counted "special"?

I've seen ab01a43 but am not sure how the "before" and "after" look (I know, I could check out the old version and just test...), so this may or may not have been different in old versions.

In any case, when inspecting generated callgrind files I see quite a number of func@plt entries (which possibly only happens when tracing with a high frequency).
If traced via callgrind those function lookups are not shown; it seems that the collection via perfparser attributes some executions that are done in the real function to a PLT version (but that would need a deeper check to verify).

Question: Should these be shown? I think the ideal version would be to count them as if the call had already gone through (= show the real function/module instead of the caller's func@plt version), but I'm not sure that this is even possible. If not, should they be directly attributed to the caller (which could be chosen by the user if #10 is implemented with --collect=*@plt), or do you consider the current version to be the most reasonable?

FR: GitHub metadata - "About"

Please go to the main page of this GitHub repo, then click right next to "about".

I suggest using the following short description:

perfgrind - tools for collecting samples from Linux performance events subsystem and converting profiling data to callgrind format, allowing it to be read with KCachegrind

and the following tags (you need to actually enter them one by one):

linux-kernel perf profiling callgrind kcachegrind callgraph

and lastly disable "Packages".

For the result see https://github.com/GitMensch/perfgrind on the right side.

FR: add `install` and `uninstall` targets

Should I handle that in #12?
It would only require moving the three binaries into an internal variable PROGRAMS and executing install $(PROGRAMS) $(PREFIX)/bin (with PREFIX defaulting to /usr/local).

The uninstall would then execute an rm in the same place.

Side note: some Makefiles copy the existing binaries during install to the build dir, then restore them on uninstall, but I think that's not needed here. Of course some programs drop the uninstall target completely...

PIE not handled properly in pgconvert

I've just cloned the repo and ran make check, which worked fine, then make open-checkfiles.

cc -std=gnu99  -O2 -Wall -Wextra -g  -D_GNU_SOURCE -o pgcollect  pgcollect.c
g++ -std=c++03 -O2 -Wall -Wextra -g  -o pgconvert  pgconvert.cpp AddressResolver.cpp Profile.cpp -ldw -lelf
g++ -std=c++03 -O2 -Wall -Wextra -g  -o pginfo     pginfo.cpp    AddressResolver.cpp Profile.cpp -ldw -lelf
g++ -std=c++03 -O  -Wall -Wextra -g  -g -fno-omit-frame-pointer -o pginfo_dbg pginfo.cpp    AddressResolver.cpp Profile.cpp -ldw -lelf

collecting some data of ls binary (likely without full symbols)...
./pgcollect check_ls.pgdata -F 2048 -s -- ls -l /usr/bin  1>/dev/null

collecting data of checking that (guaranteed to have symbols for binary pginfo_dbg) ...
./pgcollect check.pgdata    -F 2048 -s -- ./pginfo_dbg callgraph check_ls.pgdata
Setting frequency to 2048
Going to profile process with PID 30725: ./pginfo_dbg callgraph check_ls.pgdata
memory objects: 3
entries: 19

mmap events: 19
good sample events: 15
bad sample events: 0
total sample events: 15
total events: 34
Collection stopped.
Waked up 1 times
Sythetic mmap events: 0
Real mmap events: 43
Sample events: 6
Total 49 events written

checking its infos ...
./pginfo callgraph check.pgdata
memory objects: 5
entries: 14

mmap events: 43
good sample events: 6
bad sample events: 0
total sample events: 6
total events: 49

converting both collections to callgrind format ...
./pgconvert check_ls.pgdata -d object       1> check_ls.grind  # old: stdout
Can't resolve symbol for address 154a1, load base: 55f323480000
Can't resolve symbol for address 15d300, load base: 7feffd84a000
./pgconvert check.pgdata    -d source -i       check.grind     # new: second option

The result looks reasonable so far, but opening it in KCachegrind shows nearly empty source, and most inspection of the machine code prompts an error, which is rooted in the objdump calls that use the addresses from the profile - and those don't exist in the binary.

If I run those manually I get:

objdump -C -d --start-address=0x2F1E --stop-address=0x2F46 /home/build/perfgrind/pginfo_dbg

/home/build/perfgrind/pginfo_dbg:     file format elf64-x86-64

Run without the address range:

 objdump -C -d  /home/build/perfgrind/pginfo_dbg | head -n 20

/home/build/perfgrind/pginfo_dbg:     file format elf64-x86-64


Disassembly of section .init:

0000000000003000 <_init>:
    3000:       48 83 ec 08             sub    $0x8,%rsp
    3004:       48 8b 05 dd 8f 00 00    mov    0x8fdd(%rip),%rax        # bfe8 <__gmon_start__>
    300b:       48 85 c0                test   %rax,%rax
    300e:       74 02                   je     3012 <_init+0x12>
    3010:       ff d0                   callq  *%rax
    3012:       48 83 c4 08             add    $0x8,%rsp
    3016:       c3                      retq

Disassembly of section .plt:

0000000000003020 <.plt>:
    3020:       ff 35 e2 8f 00 00       pushq  0x8fe2(%rip)        # c008 <_GLOBAL_OFFSET_TABLE_+0x8>
    3026:       ff 25 e4 8f 00 00       jmpq   *0x8fe4(%rip)        # c010 <_GLOBAL_OFFSET_TABLE_+0x10

Any idea what's going on here?

FR: add `-z` option to pgconvert (compress on the fly via zstd)

similar to perf record:

       -z, --compression-level[=n]
           Produce compressed trace using specified level n (default: 1
           - fastest compression, 22 - smallest trace)

I think this would mean an (optional?) dependency on zlib, then "just" wrapping the file output in a compression function using the specified/default level, and having the pgconvert tool use zlib to open the file.

The reason for this FR is that the CPU doing the profiling commonly has some spare cycles, and reducing the amount of data to write also leads to an I/O speedup (and long profiling runs always tend to fill the disk fast).
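
A minimal sketch of what the writer side could look like, assuming zlib as suggested above (file name and payload are placeholders; the reader would correspondingly use gzopen/gzread):

/* Sketch only: compress the dump on the fly with zlib; the digit in the
 * gzopen() mode string selects the compression level. Build with -lz. */
#include <stdio.h>
#include <zlib.h>

int main(void)
{
  gzFile out = gzopen("filename.pgdata.gz", "wb6");   /* level 6 */
  if (!out) {
    fprintf(stderr, "gzopen failed\n");
    return 1;
  }
  const char payload[] = "pgdata records would be written here";
  if (gzwrite(out, payload, sizeof payload) == 0) {
    fprintf(stderr, "gzwrite failed\n");
    gzclose(out);
    return 1;
  }
  gzclose(out);
  return 0;
}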
