canonical / etrace
Utility for tracing execution of apps
License: GNU General Public License v3.0
$ git rev-parse HEAD
d97d3d1571dc7c21cb673321ec90a27baa37dc72
$ git log -n1
commit d97d3d1571dc7c21cb673321ec90a27baa37dc72 (HEAD -> master, origin/master, origin/HEAD)
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue Mar 1 03:03:13 2022 +0000
build(deps): bump actions/setup-go from 2.2.0 to 3
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 2.2.0 to 3.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](https://github.com/actions/setup-go/compare/v2.2.0...v3)
---
updated-dependencies:
- dependency-name: actions/setup-go
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <[email protected]>
$ /home/mbana/go/bin/etrace run -s -d gnome-calculator
2023/02/10 17:25:18 could not set process exec attr to unconfined: write /proc/self/attr/exec: invalid argument
$ cat /etc/os-release
NAME="Fedora Linux"
VERSION="37 (Workstation Edition)"
ID=fedora
VERSION_ID=37
VERSION_CODENAME=""
PLATFORM_ID="platform:f37"
PRETTY_NAME="Fedora Linux 37 (Workstation Edition)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:37"
DEFAULT_HOSTNAME="fedora"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f37/system-administrators-guide/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=37
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=37
SUPPORT_END=2023-11-14
VARIANT="Workstation Edition"
VARIANT_ID=workstation
Currently, if a program issues a syscall like this:
20563 1592353877.755357 symlink("some-link", "/home/user/snap/chromium/1193/.config/chromium/SingletonLock") = 0
then we don't correctly track that the resulting file is $CWD/some-link, because we only see "some-link" in the syscall. A few other syscalls that take the special AT_FDCWD value have the same problem: we don't resolve their paths properly because we don't track changes to the current working directory.
The code to do this is not hard. We would probably need one regexp that catches the chdir syscall, so we can keep track of the working directory as the loop progresses through every syscall, and another regexp that matches the set of syscalls that use this pattern, as well as any syscall that uses AT_FDCWD with a non-absolute path as another argument. Then we do the replacement.
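The approach described above could be sketched roughly as follows. This is a minimal illustration, not etrace's actual parser: the regular expressions, the function name resolveOpenatPaths, and the choice of openat as the example syscall are all assumptions, and the strace line format is taken from the example above. It tracks the working directory per PID and ignores inheritance across fork/clone, which a real implementation would need to handle.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
)

var (
	// chdirRE (hypothetical) matches a successful chdir, capturing the PID
	// and the new directory.
	chdirRE = regexp.MustCompile(`^(\d+) [\d.]+ chdir\("([^"]+)"\) = 0$`)
	// openatRE (hypothetical) matches openat calls relative to AT_FDCWD,
	// capturing the PID and the path argument, whatever the return value.
	openatRE = regexp.MustCompile(`^(\d+) [\d.]+ openat\(AT_FDCWD, "([^"]+)"`)
)

// resolveOpenatPaths walks strace lines in order, tracks the working
// directory per PID, and returns an absolute path for every AT_FDCWD openat.
func resolveOpenatPaths(lines []string, initialCwd string) []string {
	cwd := map[string]string{}
	// absolute resolves p against the last known working directory of pid.
	absolute := func(pid, p string) string {
		if filepath.IsAbs(p) {
			return filepath.Clean(p)
		}
		dir, ok := cwd[pid]
		if !ok {
			dir = initialCwd
		}
		return filepath.Join(dir, p)
	}
	var resolved []string
	for _, line := range lines {
		if m := chdirRE.FindStringSubmatch(line); m != nil {
			// A relative chdir is itself resolved against the old directory.
			cwd[m[1]] = absolute(m[1], m[2])
			continue
		}
		if m := openatRE.FindStringSubmatch(line); m != nil {
			resolved = append(resolved, absolute(m[1], m[2]))
		}
	}
	return resolved
}

func main() {
	lines := []string{
		`20563 1592353877.755357 chdir("/home/user") = 0`,
		`20563 1592353877.755400 openat(AT_FDCWD, "some-file", O_RDONLY) = 3`,
	}
	for _, p := range resolveOpenatPaths(lines, "/") {
		fmt.Println(p)
	}
}
```

The same lookup could be reused for any other syscall that takes a relative path, by adding one pattern per syscall family.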
There's a related, similar problem with tracking mount namespace changes and chroots: we want to follow the changes into the mount namespace to see that a strictly confined snap accessing /lib/x86_64-linux-gnu/libc-2.27.so is really accessing the base snap's file, i.e. /snap/core18/current/lib/x86_64-linux-gnu/libc-2.27.so
alan@robot:~$ etrace analyze-snap antstream-arcade
[sudo] password for alan:
original snap size: 280.74 MiB
original compression format is xz
content snap slot dependencies: []
exit status 1
The antstream-arcade window opens, and stays open.
There are pre-existing tools that do the same thing as this one, albeit not so focused on getting a single, potentially misleading, number for application startup. The first application window appearing doesn't necessarily indicate that the application is ready to interact with the user, although that readiness is at least as relevant a startup metric for the user as the window's appearance.
XResponse tool was designed to measure X applications startup and other user interaction times: https://www.freedesktop.org/wiki/Software/xresponse/
Maemo version has several improvements on top of that: https://github.com/maemo-tools-old/xresponse
I think there were also tools that measured startup time and user interaction timings using toolkit introspection / accessibility features.
PS. This could have picked a more original name. Just Googling "etrace github" returns many tools called etrace, and at least one of them (written a decade ago) was even ptrace-based like this one is...
PPS. The file tool should also mention how many times a given file is opened/read. Sometimes apps redundantly parse files many times.
xdotool is unmaintained and doesn't work with XWayland on Wayland as per @bboozzoo
Suggestion was to investigate xdo instead: https://github.com/baskerville/xdo
Telegram Desktop registers a URL / MIME type so that Telegram URLs open with the app. If you use etrace to analyze the snap, a popup appears and the test hangs.
Steps to reproduce:
etrace analyze-snap telegram-desktop
Expect telegram to open.
What actually happens is a popup appears.
alan@robot:~$ etrace analyze-snap telegram-desktop
original snap size: 306.85 MiB
original compression format is lzo
content snap slot dependencies: [gtk-common-themes]
exit status 1
Having to post-process output like Total startup time: 1m10.027260157s is not great.
It would be really helpful to either provide an option to specify the time unit or force the time to always be returned in seconds.
e.g:
fmt.Fprintln(w, "Total startup time:", startup.Seconds())
When the command specified as input is unambiguously a snap command/snap name, we shouldn't require explicitly setting --use-snap-run, i.e. with only libreoffice installed I see this:
$ etrace exec --clean-snap-user-data --reinstall-snap --discard-snap-ns --no-trace libreoffice
cannot use --discard-snap-ns without --use-snap-run
Currently, we verify that the window is X11, but if you are running headless without any session, you see a confusing message like this:
error: graphical session type  is unsupported, only x11 is supported
Notice the double space between "type" and "is": the session type is empty.
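One way to make the headless case clearer is to treat an empty session type separately, as in this sketch. It assumes the session type is read from the XDG_SESSION_TYPE environment variable, which may not be how etrace detects it; checkSessionType and the wording of the new message are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// checkSessionType returns nil for x11 and a descriptive error otherwise.
// An empty value gets its own message instead of being interpolated into
// the generic one, which is what produces the double space.
func checkSessionType(sessionType string) error {
	switch sessionType {
	case "x11":
		return nil
	case "":
		return errors.New("no graphical session detected, only x11 is supported")
	default:
		return fmt.Errorf("graphical session type %s is unsupported, only x11 is supported", sessionType)
	}
}

func main() {
	// Assumption: the session type comes from XDG_SESSION_TYPE.
	if err := checkSessionType(os.Getenv("XDG_SESSION_TYPE")); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println("x11 session detected")
}
```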
Currently all the logic for waiting for windows to appear is X11 specific
Sometimes, when running large tests like the script below in a terminal window in VS Code, VS Code eventually crashes:
#!/bin/bash -ex
datadir=data-testrun-$RANDOM
mkdir $datadir
numAddtIterations=19
for s in gnome-calculator chromium supertuxkart libreoffice; do
    outFile=$datadir/$s-vanilla-startup-notrace.json
    if ! test -f "$outFile.done"; then
        ./etrace --additional-iterations=$numAddtIterations run \
            --prepare-script=prepare-snap.sh \
            --prepare-script-args=$s \
            --prepare-script-args=$HOME/git/etrace/snap-repository/vanilla/$s.snap \
            --output-file=$outFile \
            --no-trace \
            --use-snap-run \
            --discard-snap-ns \
            --json \
            $s
        touch "$outFile.done"
    fi
done
# silly ones that need to have a window name specified
for s in mari0 test-snapd-glxgears; do
    case $s in
        mari0)
            WINDOW_NAME=Mari0;;
        test-snapd-glxgears)
            WINDOW_NAME=glxgears;;
    esac
    outFile=$datadir/$s-vanilla-startup-notrace.json
    if ! test -f "$outFile.done"; then
        ./etrace --additional-iterations=$numAddtIterations run \
            --prepare-script=prepare-snap.sh \
            --prepare-script-args=$s \
            --prepare-script-args=$HOME/git/etrace/snap-repository/vanilla/$s.snap \
            --window-name=$WINDOW_NAME \
            --output-file=$outFile \
            --no-trace \
            --use-snap-run \
            --discard-snap-ns \
            --json \
            $s
        touch "$outFile.done"
    fi
done
Nothing in dmesg or system logs, snapd didn't crash and there were no weird tasks in snapd state either. The crash reporter seems to have been triggered, but it didn't do anything; it just skipped uploading a previous upload.
The tracing package is currently the most complicated part and has very few tests. We should add more tests there and try to refactor it to be easier to follow. The regular expressions in particular need a lot of tests to make sure they match what we expect them to match and don't match what we expect them not to match.
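A table-driven layout is one way to cover both the "should match" and "should not match" sides of each regular expression. The sketch below is a standalone program so it can be run directly; in the repository the same table would more naturally live in a _test.go file. The pattern chdirRE and the helper failedCases are hypothetical and are not taken from the tracing package.

```go
package main

import (
	"fmt"
	"regexp"
)

// chdirRE is a hypothetical pattern for a successful chdir line in strace
// output; it stands in for the real expressions in the tracing package.
var chdirRE = regexp.MustCompile(`^\d+ [\d.]+ chdir\("([^"]+)"\) = 0$`)

// matchCase pairs an input line with whether the pattern should match it.
type matchCase struct {
	line  string
	match bool
}

// failedCases returns the lines whose match result differs from expectation.
func failedCases(re *regexp.Regexp, cases []matchCase) []string {
	var failed []string
	for _, c := range cases {
		if re.MatchString(c.line) != c.match {
			failed = append(failed, c.line)
		}
	}
	return failed
}

func main() {
	cases := []matchCase{
		// A successful chdir must match.
		{`20563 1592353877.755357 chdir("/tmp") = 0`, true},
		// A failed chdir must not match.
		{`20563 1592353877.755357 chdir("/nonexistent") = -1 ENOENT (No such file or directory)`, false},
		// A different syscall with a similar name must not match.
		{`20563 1592353877.755357 fchdir(3) = 0`, false},
	}
	if failed := failedCases(chdirRE, cases); len(failed) > 0 {
		fmt.Println("unexpected results for:", failed)
		return
	}
	fmt.Println("all cases behaved as expected")
}
```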
We should support detecting content interface connections from snaps and reinstalling the connected snaps too, since they can also slow down startup due to decompression by the kernel. The logic would look roughly like:
if slot.Snap != "system" {
	snapshotSnap(slot.Snap)
	removeSnap(slot.Snap)
	snapsToInstall = append(snapsToInstall, slot.Snap)
}
This is in a fully up-to-date groovy (Ubuntu 20.10) amd64 VM:
ubuntu@groovyvm:~$ etrace file --use-snap-run chromium
2020/10/23 21:34:34 xdotool.go:84:
strace-log-merge: /tmp/file-trace438846678/strace.log: strace output not found
2020/10/23 21:34:34 file-tracing.go:296:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x52ba47]
goroutine 1 [running]:
github.com/anonymouse64/etrace/internal/strace.(*ExecvePaths).Display(0x0, 0x5dcaa0, 0xc0000ce000)
/build/etrace/parts/etrace/src/internal/strace/file-tracing.go:173 +0x37
main.(*cmdFile).Execute(0x6ea080, 0xc00005ed20, 0x0, 0x3, 0x0, 0x0)
/build/etrace/parts/etrace/src/cmd/etrace/cmd_file.go:263 +0xfd0
github.com/jessevdk/go-flags.(*Parser).ParseArgs(0xc00008b730, 0xc000010090, 0x3, 0x3, 0x0, 0x0, 0x0, 0x6, 0xc00007e380)
/root/go/pkg/mod/github.com/jessevdk/[email protected]/parser.go:333 +0x8c3
github.com/jessevdk/go-flags.(*Parser).Parse(...)
/root/go/pkg/mod/github.com/jessevdk/[email protected]/parser.go:190
main.main()
/build/etrace/parts/etrace/src/cmd/etrace/main.go:95 +0x19c
On the other hand, etrace exec chromium works as expected in the same VM.
This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.
These updates have all been created already. Click a checkbox below to force a retry/rebase of any.
.github/workflows/go.yml
actions/setup-go v3
actions/checkout v2.4.0
.github/workflows/naming.yml
actions/checkout v2
.github/workflows/snap.yml
actions/checkout v2.4.0
snapcore/action-build v1.0.9
actions/upload-artifact v2
actions/download-artifact v2
go.mod
go 1.13
github.com/jessevdk/go-flags v1.4.1-0.20180927143258-7309ec74f752@7309ec74f752
github.com/snapcore/snapd v0.0.0-20210726143858-26a7ab7b6a92@26a7ab7b6a92
golang.org/x/net v0.0.0-20210405180319-a5a99cb37ef4@a5a99cb37ef4
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15@41f04d3bba15
While analyze-snap is super useful for comparing the result of changing the compression algorithm from xz to lzo, another comparison could be useful: comparing the startup time of an xz/lzo snap with that of the non-snap package. Showing the difference between the snap and the non-snap package could highlight where further debugging is needed.
Perhaps optionally as:
etrace analyze-snap vlc --non-snap /usr/bin/vlc
I'd expect the same 20 runs of the application to be done after the xz/lzo runs.
Could happen with:
etrace exec -s -d -t gnome-characters
Since it only exits successfully with:
etrace exec -s -d -t gnome-characters -w Smiley
If -n was used, it would be better to stop at the first iteration.
We currently duplicate the implementation of many options that are shared between the exec and file subcommands. We should unify the implementation of these as much as possible, including the main iteration loop and option handling.
When a user has both a deb and a snap installed, i.e. has the following setup:
$ which -a vlc
/usr/bin/vlc
/bin/vlc
/snap/bin/vlc
etrace should do the right thing and figure out that, since there are variants of the command on $PATH that are not actually a snap, we need to explicitly pass --use-snap-run too.
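Deciding whether a command name is unambiguously a snap could be sketched as below: collect every match on $PATH, as which -a does, and check how many live in the snap bin directory. The names lookAll and classify, the result strings, and the assumption that snap commands live in /snap/bin are all illustrative, not etrace's implementation.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// snapBinDir is assumed to be where snap commands are exposed.
const snapBinDir = "/snap/bin"

// lookAll mimics `which -a`: it returns every executable called name found
// in dirs, in order.
func lookAll(name string, dirs []string) []string {
	var found []string
	for _, dir := range dirs {
		candidate := filepath.Join(dir, name)
		info, err := os.Stat(candidate)
		if err != nil || info.IsDir() || info.Mode()&0111 == 0 {
			continue
		}
		found = append(found, candidate)
	}
	return found
}

// classify reports whether the matches are all snaps ("snap"), all
// non-snaps ("non-snap"), a mix of both ("ambiguous"), or empty
// ("not-found").
func classify(paths []string) string {
	snaps := 0
	for _, p := range paths {
		if filepath.Dir(p) == snapBinDir {
			snaps++
		}
	}
	switch {
	case len(paths) == 0:
		return "not-found"
	case snaps == len(paths):
		return "snap"
	case snaps == 0:
		return "non-snap"
	default:
		return "ambiguous"
	}
}

func main() {
	name := "vlc"
	if len(os.Args) > 1 {
		name = os.Args[1]
	}
	paths := lookAll(name, filepath.SplitList(os.Getenv("PATH")))
	fmt.Println(paths, classify(paths))
}
```

With a result of "snap", --use-snap-run could be implied automatically; with "ambiguous", as in the vlc setup above, the tool would have to choose a variant explicitly or ask the user to.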
Theoretically, flatpaks should work the same as native apps, but there may be warts here.
We can actually make --delete-snap-user-data much more user-friendly by creating a snapshot before all iterations and then restoring it at the end after all iterations and deleting the snapshot. That means that during each run, we still don't have the data, but we also don't leave the user with a bunch of useless snapshots at the end that they have to dig through to understand and restore their data.
alan@robot:~$ etrace analyze-snap standard-notes
[sudo] password for alan:
original snap size: 63.93 MiB
original compression format is xz
content snap slot dependencies: [gnome-3-28-1804 gtk-common-themes]
exit status 1
No further information. I had been using this snap, so I saved a snapshot and removed it before running this, and it never gets past this point.
Using snap aliases with snap run is currently not supported, observe:
$ snap install test-snapd-glxgears
$ snap run test-snapd-glxgears
error: cannot find app "test-snapd-glxgears" in "test-snapd-glxgears"
$ snap alias test-snapd-glxgears.glxgears test-snapd-glxgears
error: cannot perform the following tasks:
- Setup manual alias "test-snapd-glxgears" => "glxgears" for snap "test-snapd-glxgears" (cannot enable alias "test-snapd-glxgears" for "test-snapd-glxgears", it conflicts with the command namespace of installed snap "test-snapd-glxgears")
$ snap alias test-snapd-glxgears.glxgears glxgears
Added:
- test-snapd-glxgears.glxgears as glxgears
$ snap run glxgears
error: cannot find current revision for snap glxgears: readlink /snap/glxgears/current: no such file or directory
The only way to do this currently is to specify the /snap/bin/ program and omit --use-snap-run.
It would be useful when debugging programs like electron which are using many different kinds of syscalls to filter by what file was accessed, and show the syscall even if that syscall failed with EPERM or something.
Using etrace from edge channel.
alan@robot:~$ etrace analyze-snap intellij-idea-ultimate
unable to install snap intellij-idea-ultimate and analyze: exit status 1
I assume this is because it's a classic snap, because I can't analyze code or datagrip either, both classic.
Using both --cmd-stderr=/dev/null and --cmd-stdout=/dev/null already helps, but that's not enough, as etrace itself can print messages on stderr. It would be great to have a way to output only the total startup time.
$ etrace exec -s -d -t --cmd-stderr=/dev/null --cmd-stdout=/dev/null vlc -n 5
Total startup time: 2.552092291s
2021/03/30 17:08:18 xdotool.go:91: X Error of failed request: BadValue (integer parameter out of range for operation)
Major opcode of failed request: 113 (X_KillClient)
Value in failed request: 0x440000a
Serial number of failed request: 18
Current serial number in output stream: 20
2021/03/30 17:08:18 main.go:120:
Total startup time: 2.548430002s
2021/03/30 17:08:20 xdotool.go:91: X Error of failed request: BadValue (integer parameter out of range for operation)
Major opcode of failed request: 113 (X_KillClient)
Value in failed request: 0x440000a
Serial number of failed request: 18
Current serial number in output stream: 20
2021/03/30 17:08:20 main.go:120:
Total startup time: 2.554694062s