Comments (13)

bmr-cymru commented on July 4, 2024

There's also a regression here in terms of memory usage:

sos-2.2

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
27219 root      20   0  299m  50m 6212 S 25.9  5.1   0:01.10 sosreport

git HEAD

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
17964 root      20   0  298m  95m 4264 R 97.4  9.6   0:11.05 sosreport

I'm assuming for now that this is down to in-memory tarball generation. On some hosts I've seen RSS climb as high as 200M, roughly four times the footprint of previous versions.

It might be that this is acceptable given the benefits of the single-pass tar approach, but it's something I think we need to keep an eye on and solicit some feedback on.

from sos.

jhjaggars commented on July 4, 2024

Ah, this is interesting, though not unexpected. I'm sure that we can trim the fat in a couple of places, but we'll never reach the old footprint so long as we spool to an archive. I also suspect that loading every single plugin into memory isn't helping much either.

Maybe we should try an actual profiler to get a clearer picture?

bmr-cymru commented on July 4, 2024

I'd be inclined to do both - an external profiler is great for developer
tuning but having something in there to record interval times is helpful
for cases where we have reports of very long runtimes out in the field.
It means we can get useful data off the bat without troubling users with
installing python profiling tools.
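
For illustration, built-in interval timing of this kind could look roughly like the following. This is a minimal sketch, not the code from sos: the `timed` helper and its log format are assumptions, and `time.perf_counter` needs Python 3.3 or later.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, target, log=print):
    """Time the wrapped block and log one 'label : target time: N' line.

    `timed` is a hypothetical helper; `log` is any callable taking a string.
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        # Log even if the wrapped block raised, so slow failures show up too.
        elapsed = time.perf_counter() - start
        log("%-6s: %-60s time: %f" % (label, target, elapsed))


# Usage: wrap any step whose duration should appear in the report log.
with timed("copied", "/var/log/example.log"):
    time.sleep(0.01)  # stand-in for the real copy
```

Because the timing lines go to the normal log, they are available in any report a user sends in, with no extra tooling installed.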

bmr-cymru commented on July 4, 2024

fwiw my worst results by far are currently on F17. It seems to take forever to run there (and spits lots of errors due to files in /sys that advertise read perms yet cannot be read...).

It always seems to take ~5m or so with the default plugin set.

bmr-cymru commented on July 4, 2024

I've added a commit to master (96323c8) that profiles the regex substitution methods.

I've also got a branch (bmr-sosreport-profiler) that adds profiling to the main sosreport.py. This turns up some interesting results:

copied: /var/log/rhsm/rhsm.log                                                      time: 0.017336
copied: /var/log/rhsm/rhsmcertd.log                                                 time: 0.002668
output: /usr/bin/yum -C repolist                                                    time: 2.998306
output: subscription-manager list --installed                                       time: 1.527252
output: subscription-manager list --consumed                                        time: 1.139404
done  : copy_stuff                                                                  time: 34.405739
subst : /root/anaconda-ks.cfg                                                       time: 12.580447
done  : postproc                                                                    time: 12.583144

I knew repolist and subscription-manager were slow, so no surprise there. The copy_stuff total is not too far off what I'd expect either, but 12.5s to do a one-line regex sub? This would explain the much worse result I get on Fedora: every time we do a regex sub we close the tar file, reopen it, read it (to extract the member), close it, reopen it and then append the new content. So the longer run times are another consequence of Issue #86.

bmr-cymru commented on July 4, 2024

Running this on F17 gives:

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                            
12540 root      20   0  408m 211m 3660 R 99.0  5.6   1:54.02 sosreport 
Your sosreport has been generated and saved in:
  /tmp/sosreport-hex.usersys.redhat.com-20121212210616.tar.xz

The checksum is: 1afc89e521aa975ab3eed08e550c9a50

Please send this file to your support representative.


real    4m16.628s
user    2m59.156s
sys     0m7.533s
done  : copy_stuff                                                                  time: 65.143614
subst : /root/anaconda-ks.cfg                                                       time: 13.416147
subst : /etc/libvirt/qemu/winxp-vm0.xml                                             time: 12.962496
subst : /etc/libvirt/qemu/win7vm-1.xml                                              time: 12.985993
subst : /etc/libvirt/qemu/rhel6-vm1.xml                                             time: 13.220372
subst : /etc/libvirt/qemu/u1210-vm1.xml                                             time: 13.156413
subst : /etc/libvirt/qemu/rhel5vm-1.xml                                             time: 13.303305
subst : /etc/libvirt/qemu/rhel6-vm2.xml                                             time: 13.613682
subst : /etc/libvirt/qemu/rhel6-vm3.xml                                             time: 13.191365
subst : /etc/libvirt/qemu/rhel5vm-0.xml                                             time: 13.389542
subst : /etc/libvirt/qemu/rhel7-vm1.xml                                             time: 13.291241
done  : postproc                                                                    time: 132.536304

So over 2m12 of the 4m17 is spent in postproc, averaging over 13s per file.
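
As a quick check of that arithmetic, the figures can be recomputed from the profile lines above. The numbers are copied from the output; the variable names are just for illustration.

```python
# Per-file "subst" times from the F17 run above, in seconds.
subst_times = [
    13.416147, 12.962496, 12.985993, 13.220372, 13.156413,
    13.303305, 13.613682, 13.191365, 13.389542, 13.291241,
]
postproc_total = 132.536304    # "done  : postproc" line
wall_clock = 4 * 60 + 16.628   # "real 4m16.628s"

subst_total = sum(subst_times)
average = subst_total / len(subst_times)
share = postproc_total / wall_clock

print("subst total: %.2fs, average: %.2fs/file" % (subst_total, average))
print("postproc share of wall clock: %.0f%%" % (share * 100))
```

The substitutions account for essentially all of postproc, at a little over 13s per file, and postproc is roughly half of the wall-clock time.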

bmr-cymru commented on July 4, 2024

That's still over a minute of time to explain but I'm assuming that's in the tar creation itself - that code's also not profiled at the moment.

jhjaggars commented on July 4, 2024

Ug. That is disgusting. This is more than enough reason to start over on the way we archive the report.

bmr-cymru commented on July 4, 2024

I'm sold :)

This isn't a huge report either:

$ ll -h sosreport-hex.usersys.redhat.com-20121212210616.tar.xz
-rw-r--r--. 1 root root 3.3M Dec 12 21:09 sosreport-hex.usersys.redhat.com-20121212210616.tar.xz
$ du -ch sosreport-hex.usersys.redhat.com-20121212210616 | tail -1
84M total

bmr-cymru commented on July 4, 2024

Oh, and then there's this:

$ tar tf sosreport-hex.usersys.redhat.com-20121212210616.tar.xz | wc -l
33517
$ tar tf sosreport-hex.usersys.redhat.com-20121212210616.tar.xz | sort | uniq | wc -l
6000

That one's my fault - the code to make sure we have proper tar headers for each parent directory of a collected path isn't checking for dupes.
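
One way to avoid the duplicate headers is to remember which directories have already been written. The sketch below is an illustration under that assumption, not the actual sos fix: `DedupTarWriter` and its methods are hypothetical names, and it writes an uncompressed tar to a file object.

```python
import io
import os
import tarfile


class DedupTarWriter:
    """Hypothetical writer that emits each parent directory header once."""

    def __init__(self, fileobj):
        self._tar = tarfile.open(fileobj=fileobj, mode="w")
        self._seen_dirs = set()

    def _add_parent_dirs(self, path):
        # Walk upwards until we hit a directory that was already written.
        missing = []
        parent = os.path.dirname(path)
        while parent and parent not in self._seen_dirs:
            missing.append(parent)
            parent = os.path.dirname(parent)
        # Write the missing directories top-down so parents precede children.
        for directory in reversed(missing):
            info = tarfile.TarInfo(name=directory)
            info.type = tarfile.DIRTYPE
            info.mode = 0o755
            self._tar.addfile(info)
            self._seen_dirs.add(directory)

    def add_bytes(self, path, data):
        path = path.lstrip("/")
        self._add_parent_dirs(path)
        info = tarfile.TarInfo(name=path)
        info.size = len(data)
        self._tar.addfile(info, io.BytesIO(data))

    def close(self):
        self._tar.close()
```

With a set lookup per path, two files in the same directory produce one header for each ancestor rather than one per file.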

bmr-cymru commented on July 4, 2024

Fixing just that dramatically improves the memory use on RHEL: about 26M peak RSS against 50M before.

bmr-cymru commented on July 4, 2024

Even bigger difference on F17:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                            
18865 root      20   0  251m  54m 3724 R 98.6  1.4   0:20.56 sosreport
Your sosreport has been generated and saved in:
  /tmp/sosreport-hex.usersys.redhat.com-20121212221728.tar.xz

The checksum is: 54b8cd033c980a52620b64afafe0ddfc

Please send this file to your support representative.


real    2m39.813s
user    1m10.207s
sys     0m5.643s

bmr-cymru commented on July 4, 2024

Closing this out as the perf. problems with streaming tar archives were resolved by the move to FileCacheArchive and the remaining static profiling support was removed in commit 4553f09 (Issue #244).
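
For readers who have not seen the file-cache approach, the general idea is to stage the report in a temporary directory, do any substitutions on plain files there, and build the tarball once at the end. The sketch below shows that idea only. `StagedArchive` and its methods are hypothetical and are not the FileCacheArchive API, and it writes gzip rather than xz to keep the example simple.

```python
import os
import re
import shutil
import tarfile
import tempfile


class StagedArchive:
    """Hypothetical stage-then-archive helper (not the sos implementation)."""

    def __init__(self, name):
        self._name = name
        self._root = tempfile.mkdtemp(prefix="staged-archive-")

    def _dest(self, path):
        return os.path.join(self._root, self._name, path.lstrip("/"))

    def add_string(self, content, path):
        dest = self._dest(path)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        with open(dest, "w") as f:
            f.write(content)

    def substitute(self, path, pattern, repl):
        # Operates on an ordinary file, so the cost does not grow with
        # the size of the archive as it did with the streaming tar.
        dest = self._dest(path)
        with open(dest) as f:
            text = f.read()
        new_text, count = re.subn(pattern, repl, text)
        if count:
            with open(dest, "w") as f:
                f.write(new_text)
        return count

    def finalize(self, out_dir):
        # Single pass over the staged tree; each path is written once.
        tar_path = os.path.join(out_dir, self._name + ".tar.gz")
        with tarfile.open(tar_path, "w:gz") as tar:
            tar.add(os.path.join(self._root, self._name), arcname=self._name)
        shutil.rmtree(self._root)
        return tar_path
```

The trade-off is temporary disk space for the staged copy in exchange for bounded memory use and cheap post-processing.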
