Comments (13)
There's also a regression here in terms of memory usage:
sos-2.2:
  PID USER PR NI VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
27219 root 20  0 299m  50m 6212 S 25.9  5.1 0:01.10 sosreport
git HEAD:
  PID USER PR NI VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
17964 root 20  0 298m  95m 4264 R 97.4  9.6 0:11.05 sosreport
I'm assuming for now that this is down to in-memory tarball generation. On some hosts I've seen RSS climb as high as 200M, roughly four times what previous versions used.
It might be that this is acceptable given the benefits of the single-pass tar approach, but it's something I think we need to keep an eye on and solicit some feedback about.
from sos.
Ah, this is interesting, though not unexpected. I'm sure that we can trim the fat in a couple of places, but we'll never reach the old footprint so long as we spool to an archive. I also suspect that loading every single plugin into memory isn't helping much either.
Maybe we should try an actual profiler to get a very good picture?
from sos.
> Ah, this is interesting, though not unexpected. I'm sure that we can trim the fat
> in a couple of places, but we'll never reach the old footprint so long as we spool
> to an archive. I also suspect that loading every single plugin into memory
> isn't helping much either.
> Maybe we should try an actual profiler to get a very good picture?
I'd be inclined to do both - an external profiler is great for developer
tuning but having something in there to record interval times is helpful
for cases where we have reports of very long runtimes out in the field.
It means we can get useful data off the bat without troubling users with
installing python profiling tools.
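For illustration, built-in interval recording could be as small as the sketch below. It only shows the general idea; the `timed` helper, the `intervals` list and the line format are made up for this example and are not taken from the sos code.

```python
import time
from contextlib import contextmanager

# Hypothetical in-process recorder (not sos code): one tuple per timed step.
intervals = []


@contextmanager
def timed(label, name):
    """Record the wall-clock time spent inside the with-block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        intervals.append((label, name, time.perf_counter() - start))


def format_interval(label, name, seconds):
    """Render one record as a single log line."""
    return "%-6s: %s time: %f" % (label, name, seconds)


# Stand-in for a slow collection step.
with timed("output", "sleep 0.01"):
    time.sleep(0.01)

for entry in intervals:
    print(format_interval(*entry))
```

Because this needs nothing beyond the standard library, the timings could be written into the report itself and read back from a field report without asking the user to install anything.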
from sos.
fwiw my worst results by far are currently on F17. It seems to take forever to run there (and spits out lots of errors due to files in /sys that advertise read perms yet cannot be read...).
It always seems to take ~5m or so with the default plugin set.
from sos.
I've added commit 96323c8 to master, which profiles around the regex substitution methods.
I've also got a branch (bmr-sosreport-profiler) that adds profiling to the main sosreport.py. This turns up some interesting results:
copied: /var/log/rhsm/rhsm.log time: 0.017336
copied: /var/log/rhsm/rhsmcertd.log time: 0.002668
output: /usr/bin/yum -C repolist time: 2.998306
output: subscription-manager list --installed time: 1.527252
output: subscription-manager list --consumed time: 1.139404
done : copy_stuff time: 34.405739
subst : /root/anaconda-ks.cfg time: 12.580447
done : postproc time: 12.583144
I knew repolist and subscription-manager were slow, so no surprise there. copy_stuff is not too far off either, but 12.5s to do a one-line regex sub? This would explain the much worse result I get on Fedora: every time we do the regex sub we're closing, opening, reading (to extract), closing, opening and then extending the tar file with the new content, so the longer run times are another consequence of Issue #86.
from sos.
Running this on F17 gives:
  PID USER PR NI VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
12540 root 20  0 408m 211m 3660 R 99.0  5.6 1:54.02 sosreport
Your sosreport has been generated and saved in:
/tmp/sosreport-hex.usersys.redhat.com-20121212210616.tar.xz
The checksum is: 1afc89e521aa975ab3eed08e550c9a50
Please send this file to your support representative.
real 4m16.628s
user 2m59.156s
sys 0m7.533s
done : copy_stuff time: 65.143614
subst : /root/anaconda-ks.cfg time: 13.416147
subst : /etc/libvirt/qemu/winxp-vm0.xml time: 12.962496
subst : /etc/libvirt/qemu/win7vm-1.xml time: 12.985993
subst : /etc/libvirt/qemu/rhel6-vm1.xml time: 13.220372
subst : /etc/libvirt/qemu/u1210-vm1.xml time: 13.156413
subst : /etc/libvirt/qemu/rhel5vm-1.xml time: 13.303305
subst : /etc/libvirt/qemu/rhel6-vm2.xml time: 13.613682
subst : /etc/libvirt/qemu/rhel6-vm3.xml time: 13.191365
subst : /etc/libvirt/qemu/rhel5vm-0.xml time: 13.389542
subst : /etc/libvirt/qemu/rhel7-vm1.xml time: 13.291241
done : postproc time: 132.536304
So over 2m12 of the 4m17 is spent in postproc, averaging over 13s/file.
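The per-file average can be checked straight from the timing lines. The snippet below is just a quick way to do that arithmetic; the `summarize` helper and the regex are made up for this example, and the only data is the F17 output pasted above.

```python
import re

# Timing lines copied from the F17 run above.
log = """\
done : copy_stuff time: 65.143614
subst : /root/anaconda-ks.cfg time: 13.416147
subst : /etc/libvirt/qemu/winxp-vm0.xml time: 12.962496
subst : /etc/libvirt/qemu/win7vm-1.xml time: 12.985993
subst : /etc/libvirt/qemu/rhel6-vm1.xml time: 13.220372
subst : /etc/libvirt/qemu/u1210-vm1.xml time: 13.156413
subst : /etc/libvirt/qemu/rhel5vm-1.xml time: 13.303305
subst : /etc/libvirt/qemu/rhel6-vm2.xml time: 13.613682
subst : /etc/libvirt/qemu/rhel6-vm3.xml time: 13.191365
subst : /etc/libvirt/qemu/rhel5vm-0.xml time: 13.389542
subst : /etc/libvirt/qemu/rhel7-vm1.xml time: 13.291241
done : postproc time: 132.536304
"""

# "<kind> : <name> time: <seconds>"; the name may contain spaces.
LINE = re.compile(r"^(\w+)\s*:\s*(.+?)\s+time:\s*([\d.]+)$")


def summarize(text):
    """Return (list of per-file subst times, dict of phase totals)."""
    subst, done = [], {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue
        kind, name, seconds = match.group(1), match.group(2), float(match.group(3))
        if kind == "subst":
            subst.append(seconds)
        elif kind == "done":
            done[name] = seconds
    return subst, done


subst, done = summarize(log)
mean = sum(subst) / len(subst)
print("files: %d  total: %.3fs  mean: %.3fs" % (len(subst), sum(subst), mean))
print("postproc: %.3fs" % done["postproc"])
```

The ten substitutions add up to 132.53s, about 13.25s each, which accounts for essentially all of the 132.54s reported for postproc.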
from sos.
That's still over a minute of time to explain but I'm assuming that's in the tar creation itself - that code's also not profiled at the moment.
from sos.
Ugh. That is disgusting. This is more than enough reason to start over on the way we archive the report.
from sos.
I'm sold :)
This isn't a huge report either:
$ ll -h sosreport-hex.usersys.redhat.com-20121212210616.tar.xz
-rw-r--r--. 1 root root 3.3M Dec 12 21:09 sosreport-hex.usersys.redhat.com-20121212210616.tar.xz
$ du -ch sosreport-hex.usersys.redhat.com-20121212210616 | tail -1
84M total
from sos.
Oh, and then there's this:
$ tar tf sosreport-hex.usersys.redhat.com-20121212210616.tar.xz | wc -l
33517
$ tar tf sosreport-hex.usersys.redhat.com-20121212210616.tar.xz | sort | uniq | wc -l
6000
That one's my fault - the code to make sure we have proper tar headers for each parent directory of a collected path isn't checking for dupes.
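A rough sketch of the kind of check that was missing, using the standard `tarfile` module. This is not the sos code; `parent_dirs`, `add_with_parents` and the `seen` set are names invented for the example.

```python
import io
import tarfile


def parent_dirs(path):
    """Yield each parent directory of path, outermost first."""
    parts = path.strip("/").split("/")[:-1]
    for i in range(1, len(parts) + 1):
        yield "/".join(parts[:i])


def add_with_parents(tar, seen, name, data):
    """Add a file member, emitting each parent directory header only once."""
    for directory in parent_dirs(name):
        if directory in seen:
            continue  # header already written for this directory
        seen.add(directory)
        info = tarfile.TarInfo(directory)
        info.type = tarfile.DIRTYPE
        info.mode = 0o755
        tar.addfile(info)
    info = tarfile.TarInfo(name.strip("/"))
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))


buf = io.BytesIO()
seen = set()
with tarfile.open(fileobj=buf, mode="w") as tar:
    add_with_parents(tar, seen, "etc/libvirt/qemu/a.xml", b"<a/>")
    add_with_parents(tar, seen, "etc/libvirt/qemu/b.xml", b"<b/>")

# Same check as the `tar tf | sort | uniq | wc -l` comparison above.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    names = tar.getnames()
print(len(names), len(set(names)))
```

Without the `seen` check the second file would write `etc`, `etc/libvirt` and `etc/libvirt/qemu` again, which is how the member count balloons past the number of unique paths.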
from sos.
Fixing just that dramatically improves the memory use on RHEL: about 26M peak RSS against 50M.
from sos.
Even bigger difference on F17:
  PID USER PR NI VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
18865 root 20  0 251m  54m 3724 R 98.6  1.4 0:20.56 sosreport
Your sosreport has been generated and saved in:
/tmp/sosreport-hex.usersys.redhat.com-20121212221728.tar.xz
The checksum is: 54b8cd033c980a52620b64afafe0ddfc
Please send this file to your support representative.
real 2m39.813s
user 1m10.207s
sys 0m5.643s
from sos.
Closing this out, as the performance problems with streaming tar archives were resolved by the move to FileCacheArchive, and the remaining static profiling support was removed in commit 4553f09 (Issue #244).
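For anyone reading this later, the general idea of a file-cache style archive is to stage everything on disk, do the regex substitutions on the staged files, and only then write the tarball in a single pass. The sketch below illustrates that idea only. It is not the FileCacheArchive implementation, and `build_report` and its arguments are hypothetical.

```python
import os
import re
import tarfile
import tempfile


def build_report(files, substitutions, out_path):
    """Stage files on disk, scrub them in place, then write the archive once.

    files: mapping of relative path -> text (stand-in for the collection step)
    substitutions: list of (relative path, regex, replacement)
    """
    with tempfile.TemporaryDirectory() as stage:
        # Collection: copy everything into the staging directory.
        for rel, text in files.items():
            dest = os.path.join(stage, rel)
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            with open(dest, "w") as f:
                f.write(text)

        # Post-processing: each substitution rewrites one small file on disk
        # and never touches the archive.
        for rel, pattern, repl in substitutions:
            dest = os.path.join(stage, rel)
            with open(dest) as f:
                text = f.read()
            with open(dest, "w") as f:
                f.write(re.sub(pattern, repl, text))

        # Archiving: one pass over the staged tree.
        with tarfile.open(out_path, "w:xz") as tar:
            tar.add(stage, arcname="report")


out = os.path.join(tempfile.mkdtemp(), "report.tar.xz")
build_report(
    {"root/anaconda-ks.cfg": "rootpw secret\n"},
    [("root/anaconda-ks.cfg", r"(rootpw\s+).*", r"\1********")],
    out,
)
with tarfile.open(out) as tar:
    scrubbed = tar.extractfile("report/root/anaconda-ks.cfg").read()
print(scrubbed)
```

The cost of a substitution then depends on the size of the one file being scrubbed rather than on the size of the archive built so far.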
from sos.