Comments (29)
I never removed the xml or html reports because they were so far down on the list, but I agree with you. I think there might be some value in implementing an HTML concrete class that uses the reporting stuff and dumping the old things.
RE: inverting --report, I think it's a good idea. I think there is an issue around here somewhere, which I never got around to, about making --report on by default.
from sos.
What about dumping a JSON file with certain metadata that external tools could use for their own reporting? I'm not saying rip out the existing reporting (well, maybe XML, because it's just ugh), but something in addition to what's there.
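To make the JSON idea a bit more concrete, here is a minimal sketch of the kind of metadata file external tools could consume. The field names (`version`, `plugins`, `commands`, `files`) and the helpers `build_metadata` and `dump_metadata` are hypothetical, chosen only for illustration; they are not an existing sos format or API.

```python
import json


def build_metadata(plugins):
    """Build a JSON-serialisable summary from per-plugin results.

    `plugins` maps a plugin name to a dict with the hypothetical keys
    "commands" and "files" (lists of strings). This layout is an
    assumption for illustration, not the sos data model.
    """
    return {
        "version": 1,
        "plugins": [
            {
                "name": name,
                "commands": sorted(data.get("commands", [])),
                "files": sorted(data.get("files", [])),
            }
            for name, data in sorted(plugins.items())
        ],
    }


def dump_metadata(plugins):
    """Serialise the summary as indented JSON text."""
    return json.dumps(build_metadata(plugins), indent=2, sort_keys=True)


if __name__ == "__main__":
    example = {
        "networking": {"commands": ["ip addr"], "files": ["/etc/hosts"]},
        "kernel": {"commands": ["uname -a"], "files": []},
    }
    print(dump_metadata(example))
```

Sorting the plugins and their entries keeps the output stable between runs, which makes it easier for external tools to diff two reports.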
from sos.
This is the idea with SOMA (Sos object model archive) - making the archive more discoverable and presenting the data in an abstracted fashion. Discussions about this have been going on since $forever with little actual movement.
from sos.
This is a pretty aggressive time slot for resolving this bug, but I'll try to get it done by 3.1.
from sos.
@bmr-cymru could we set up an IRC meeting to discuss how we want to tackle SOMA and also the dbus interface?
Thanks!
from sos.
For the HTML output generation, should we use a template library like Cheetah or Jinja? Or are we thinking we should manually create the HTML and its elements within an HTML report type class?
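For comparison, a rough sketch of what the "manual" option could look like using only the standard library. Here `string.Template` merely stands in for a real template engine such as Jinja, and `html.escape` handles escaping. The `render_html` helper and the page layout are made up for illustration.

```python
from html import escape
from string import Template

# Page skeleton; string.Template is a stand-in for a real template engine.
PAGE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><ul>$items</ul></body></html>"
)


def render_html(title, entries):
    """Render a list of entry strings as an HTML bullet list."""
    # Escape every value so collected text cannot inject markup.
    items = "".join("<li>{}</li>".format(escape(e)) for e in entries)
    return PAGE.substitute(title=escape(title), items=items)


if __name__ == "__main__":
    print(render_html("sos report", ["ip addr", "uname -a"]))
```

The trade-off is roughly this: a template library takes care of escaping and layout, while the manual route avoids a new dependency but leaves escaping to us.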
from sos.
Moving this to 3.3 as nothing is broken by it and we don't have time to get anything new in for 3.2.
from sos.
augtool dump-xml /files > /tmp/augtool_dump_xml_all_files.xml
augtool could maybe help here, using the lenses that have already been created?
Just a thought.
from sos.
Not really (we've looked at Augeas several times; if we are to use it it'll be via the Python API):
dumping yet-another-cryptic file in an awkward encoding (XML) into the reports does not help anyone.
If we address this it needs to be in a manner that's readily consumable and doesn't just layer on more inconvenience.
Anyone who wants augeas-formatted XML for an sosreport can easily get it right now by just pointing the tool at a report archive.
from sos.
@bmr-cymru @battlemidget is anybody working on this issue?
from sos.
Not yet
from sos.
I would like to work on this issue. Here is what I have in mind:
- First, the report is generated in JSON format and written to a temporary file inside the /tmp directory.
- Create a reporting directory containing html_report.py, xml_report.py and plaintext_report.py scripts.
- Finally, generate the report in sos.html, sos.txt, sos.xml and sos.json formats.
If you have any suggestions, please share.
from sos.
- First, the report is generated in JSON format and written to a temporary file inside the /tmp directory.
Nack; there is no need for this. All the data to be reported is in-memory. Writing it to disk and then reading it back and writing it again is pointless make-work.
- Create a reporting directory containing html_report.py, xml_report.py and plaintext_report.py scripts.
Nack (unless I misunderstood): why do these need to be external scripts? The current project structure uses Python modules to assemble various subsystems that interact via defined interfaces. The only time we use an exec() style of interface is when interacting with truly external components (e.g. commands run by plugins or during policy loading and evaluation).
- Finally, generate the report in sos.html, sos.txt, sos.xml and sos.json formats.
This is an admirable goal to work toward but I do not think it depends on either point (1) or (2).
from sos.
Point 1): a large amount of data is not efficient to store in main memory, so it needs to be written to a temporary file and read back; the data only needs to be read when writing the reports (.html, .txt and .xml).
Point 2): separate reporting scripts are easier to manage and reduce the size of the sosreport.py script. In future, a developer can easily change the look and structure of a report (for example, the HTML style or the plain text style), and it also reduces complexity.
from sos.
- a large amount of data is not efficient to store in main memory
It is already there - look at the current reporting code. It iterates over the set of loaded plugins and interrogates them for the data to be stored in the report fields. If you are making a case that the repetitive formatting (for XML, HTML, text, etc.) is inefficient, that is a different argument and one that I don't see being solved by merely writing the JSON data out to disk.
- separate reporting scripts are easier to manage and reduce the size of the sosreport.py script
So would abstracting this out into sos/report.py (and if necessary xmlreport.py, jsonreport.py etc.). Separate scripts would also drive UP the memory and IO costs that you seem concerned about - each script will start as a new process with a brand new address space. If we are lucky then shared data may reside in the pagecache, but if that is then read in anew by those processes we are unlikely to benefit from sharing unless we use complex IO models like memory-mapping (not at all easy in Python).
from sos.
I think a good first step would be to move all the still-desired reporting functionality out of sosreport.py and into the current report.py - deleting the legacy report code at the same time and re-implementing it using Jesse's classes where it makes sense.
This would help to ensure the interfaces we have are sane and workable and de-clutters the main sosreport.py (another very worthy goal).
I think at this stage making any design decision on the basis of presumed performance improvements is a mistake - Knuth is right - "premature optimisation is the root of all evil". There are known parts of sos that have very suboptimal memory usage right now but the reporting code is certainly not one that I lose any sleep over (PackageManager is a different matter for e.g...).
from sos.
Oh, my bad on the first point.
from sos.
Please correct me if I get this wrong: we add xmlreport.py and htmlreport.py as modules used from inside sosreport.py, with no need to start an extra process.
from sos.
We add xmlreport.py and htmlreport.py as modules used from inside sosreport.py, with no need to start an extra process.
Right - I think for now this is the best approach. It keeps to existing project conventions and it would be a big improvement in the code structure and maintainability. If at the end of all that work there are measurable performance concerns then we can look at optimisations like caching or writing data to the file system.
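As a rough illustration of the module-based approach, the sketch below renders the same in-memory data through several report classes with no temporary files and no extra processes. The class names (`Report`, `PlainTextReport`, `JSONReport`), the `render_all` helper and the `sections` layout are all hypothetical; they are not the classes that exist in sos today.

```python
import json


class Report:
    """Base class: holds in-memory data and defines the render interface."""

    def __init__(self, sections):
        # sections: dict mapping a plugin name to a list of collected items
        self.sections = sections

    def render(self):
        raise NotImplementedError


class PlainTextReport(Report):
    def render(self):
        lines = []
        for name in sorted(self.sections):
            lines.append(name)
            lines.extend("  " + item for item in self.sections[name])
        return "\n".join(lines)


class JSONReport(Report):
    def render(self):
        return json.dumps(self.sections, indent=2, sort_keys=True)


def render_all(sections, report_classes=(PlainTextReport, JSONReport)):
    """Render the same in-memory data once per format."""
    return {cls.__name__: cls(sections).render() for cls in report_classes}


if __name__ == "__main__":
    data = {"kernel": ["uname -a"], "block": ["lsblk"]}
    for name, text in render_all(data).items():
        print("==", name)
        print(text)
```

Adding another format would then mean adding one more subclass in its own module, which is the maintainability gain discussed above.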
from sos.
ok 👍 ..
from sos.
@bmr-cymru, I wrote a small web application to list and browse the reports:
https://github.com/Amitgb14/sosweb
from sos.
Is there any update?
from sos.
Cycling around on this: I just dealt with a situation where a sosreport took over 4 hours to run, with the vast majority of that time (3+ hours) spent on generating the reports. I think the reason this happened was the sheer volume of files that the sosreport created due to it being run on a heavily utilized OCP node - there were just shy of 114k files in the archive.
That is a lot, but is it really expected to take 3 hours at that volume, or is this indicative of a lower level issue? Also, what consumes the HTML and XML reports today? Would it be beneficial to dynamically set reporting to be on or off based on how large the sosreport is by the time we finish running the plugins?
from sos.
- there were just shy of 114k files in the archive.
Do we know why there was such a volume? I.e. is this sane, either in terms of the node configuration, or what we are attempting to collect?
from sos.
It was a fairly heavily used OCP node: 150 running containers, another 130 stopped, and a total of 1100 images on it. That means all the docker plugin bits, but more importantly the cgroups plugin grabbing the /sys/fs/cgroup/* bits for the kubernetes pods, which is where the bulk of this came from:
$ find sys/fs/cgroup/ -type f | wc -l
88516
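For anyone wanting the same breakdown across a whole unpacked archive rather than one directory, a small sketch along these lines should work. The `count_files_by_top_dir` helper is just an illustrative name. Note that, unlike `find -type f`, `os.walk` also counts symlinks and other non-directory entries as files.

```python
import os
from collections import Counter


def count_files_by_top_dir(root):
    """Count non-directory entries under each top-level entry of `root`."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        if not filenames:
            continue
        rel = os.path.relpath(dirpath, root)
        # Files directly in root are grouped under "."
        top = rel.split(os.sep)[0]
        counts[top] += len(filenames)
    return counts


if __name__ == "__main__":
    import sys

    for top, n in count_files_by_top_dir(sys.argv[1]).most_common():
        print("{:>8}  {}".format(n, top))
```

Run against an extracted sosreport directory, this lists the top-level directories with the most files first.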
from sos.
Sorry, that didn't actually answer your question. The volume would be sane for the size of the OpenShift environment it was on, but that is probably in the upper-end of such environments. So I imagine there are other end users running into similarly long run times and just "dealing with it" at the moment.
from sos.
My biggest problem with reports is that it kinda feels like they should be post-processable. We should be able to take an archive, and comprehend it to produce that output, entirely independently of the collection host (it's just pretty printing, effectively).
That way we could turn it off by default and let users do something like:
$ sos report --html --from sosreport-blah-blah.tar.gz
(or whatever)
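A minimal sketch of that post-processing idea, assuming the report is nothing more than a listing of the archive members: it reads an existing tarball with the standard `tarfile` module and builds an HTML index without touching the collection host. The `html_index_from_archive` helper is hypothetical and not part of sos.

```python
import tarfile
from html import escape


def html_index_from_archive(path):
    """Build a simple HTML file listing from an existing tar archive.

    Only member names are read; nothing is extracted to disk.
    """
    with tarfile.open(path) as tar:  # compression is auto-detected
        names = sorted(m.name for m in tar.getmembers() if m.isfile())
    items = "\n".join("<li>{}</li>".format(escape(n)) for n in names)
    return "<html><body><ul>\n{}\n</ul></body></html>".format(items)


if __name__ == "__main__":
    import sys

    print(html_index_from_archive(sys.argv[1]))
```

A real report would need more than file names (command output, plugin grouping), so this only shows that the archive alone is enough to start from.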
from sos.
Since 2018, we've overhauled the actual reports generation mechanisms. A previous informal survey on the RH side also showed that while HTML reports are not ubiquitously used they are consumed to some degree. Given those two points, I wonder if this can be closed?
Or is the post-processing suggestion above still desirable?
from sos.
+1 to close this. The HTML report generation was re-written in #1728, no issues since then.
from sos.