Reporting seems to be in a funny state at the moment. We have the old HTML and XML rep


Reporting review about sos HOT 29 CLOSED

sosreport commented on June 25, 2024

Reporting review

from sos.

Comments (29)

jhjaggars commented on June 25, 2024

I never removed the xml or html reports because they were so far down on the list, but I agree with you. I think there might be some value in implementing an HTML concrete class that uses the reporting stuff and dumping the old things.

RE: inverting --report, I think it's a good idea. I think there is an issue around here somewhere, which I never got around to, about making --report on by default.


adam-stokes commented on June 25, 2024

What about dumping a JSON file with certain metadata that external tools could use for their own reporting? I'm not saying rip out the existing reporting (well, maybe XML, b/c it's just ugh), but something in addition to what's there.
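
A minimal sketch of the kind of JSON metadata dump suggested here, assuming the collected data is already available in memory as plain Python structures. The `build_manifest` and `dump_manifest` helpers and the field names (`plugins`, `commands`, `files`) are hypothetical and are not part of sos.

```python
import json


def build_manifest(plugin_results):
    """Build a JSON-serialisable summary from in-memory plugin results.

    `plugin_results` is a hypothetical mapping of plugin name to the
    commands it ran and the files it copied; it is not a real sos structure.
    """
    manifest = {"plugins": []}
    for name in sorted(plugin_results):
        result = plugin_results[name]
        manifest["plugins"].append({
            "name": name,
            "commands": list(result.get("commands", [])),
            "files": list(result.get("files", [])),
        })
    return manifest


def dump_manifest(plugin_results):
    """Serialise the manifest to a JSON string for external tools."""
    return json.dumps(build_manifest(plugin_results), indent=2, sort_keys=True)


# Illustrative input only; real plugin data would look different.
example = {
    "kernel": {"commands": ["uname -a"], "files": ["/proc/cmdline"]},
    "docker": {"commands": ["docker ps -a"], "files": []},
}
manifest_text = dump_manifest(example)
```

An external tool could then load the result with `json.loads` and build whatever report it likes without parsing HTML or XML.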


bmr-cymru commented on June 25, 2024

This is the idea with SOMA (Sos object model archive) - making the archive more discoverable and presenting the data in an abstracted fashion. Discussions about this have been going on since $forever with little actual movement.


adam-stokes commented on June 25, 2024

This is a pretty aggressive time slot for resolving this bug, but I'll try to get it done by 3.1.


adam-stokes commented on June 25, 2024

@bmr-cymru could we set up an IRC meeting to discuss how we want to tackle SOMA and also the dbus interface?

Thanks!


adam-stokes commented on June 25, 2024

For the HTML output generation, should we use a template library like Cheetah or Jinja? Or are we thinking we should manually create the HTML and its elements within an HTML report type class?


bmr-cymru commented on June 25, 2024

Moving this to 3.3 as nothing is broken by it and we don't have time to get anything new in for 3.2.


prayther commented on June 25, 2024

augtool dump-xml /files > /tmp/augtool_dump_xml_all_files.xml

augtool could maybe help here, using the lenses that have already been created?

just a thought.


bmr-cymru commented on June 25, 2024

Not really (we've looked at Augeas several times; if we are to use it it'll be via the Python API):
dumping yet-another-cryptic file in an awkward encoding (XML) into the reports does not help anyone.

If we address this it needs to be in a manner that's readily consumable and doesn't just layer on more inconvenience.

Anyone who wants augeas-formatted XML for an sosreport can easily get it right now by just pointing the tool at a report archive.


Amitgb14 commented on June 25, 2024

@bmr-cymru @battlemidget is anybody working on this issue?


adam-stokes commented on June 25, 2024

Not yet


Amitgb14 commented on June 25, 2024

I would like to work on this issue. Here is what I have in mind.

First, the report is generated in JSON format and written to a temporary file inside the /tmp directory.
Then create a reporting directory and put html_report.py, xml_report.py and plaintext_report.py scripts in it,
and finally generate the report in sos.html, sos.txt, sos.xml and sos.json formats.

Any suggestions, please share.


bmr-cymru commented on June 25, 2024

> First report is generate in json format and write temporary file inside /tmp directory.

Nack; there is no need for this. All the data to be reported is in-memory. Writing it to disk and then reading it back and writing it again is pointless make-work.

> Create reporting directory and put html_report.py, xml_report.py and plaintext_report.py scripts

Nack (unless I misunderstood): why do these need to be external scripts? The current project structure uses Python modules to assemble various subsystems that interact via defined interfaces. The only time we use an exec() style of interface is when interacting with truly external components (e.g. commands run by plugins or during policy loading and evaluation).

> and finally generate report in sos.html, sos.txt, sos.xml and sos.json format.

This is an admirable goal to work toward but I do not think it depends on either point (1) or (2).


Amitgb14 commented on June 25, 2024

Point 1: storing a large amount of data in main memory is not efficient, so it needs to be written to a temporary file and read back. The data only needs to be read when writing the reports (.html, .txt and .xml).

Point 2: separate reporting scripts are easier to manage and reduce the size of sosreport.py. In future, a developer could easily change the look and structure of a report (for example, the HTML or plain-text style), and this also reduces complexity.


bmr-cymru commented on June 25, 2024

> : large size of data is not efficient store in main-memory

It is already there - look at the current reporting code. It iterates over the set of loaded plugins and interrogates them for the data to be stored in the report fields. If you are making a case that the repetitive formatting (for XML, HTML, text, etc.) is inefficient, that is a different argument, and one that I don't see being solved by merely writing the JSON data out to disk.

> reporting scripts can be easily manage and reduce sosreport.py script size,

So would abstracting this out into sos/report.py (and if necessary xmlreport.py, jsonreport.py etc.). Separate scripts would also drive UP the memory and IO costs that you seem concerned about - each script will start as a new process with a brand new address space. If we are lucky then shared data may reside in the pagecache, but if that is then read in anew by those processes we are unlikely to benefit from sharing unless we use complex IO models like memory-mapping (not at all easy in Python).


bmr-cymru commented on June 25, 2024

I think a good first step would be to move all the still-desired reporting functionality out of sosreport.py and into the current report.py - deleting the legacy report code at the same time and re-implementing it using Jesse's classes where it makes sense.

This would help to ensure the interfaces we have are sane and workable and de-clutters the main sosreport.py (another very worthy goal).

I think at this stage making any design decision on the basis of presumed performance improvements is a mistake - Knuth is right - "premature optimisation is the root of all evil". There are known parts of sos that have very suboptimal memory usage right now, but the reporting code is certainly not one that I lose any sleep over (PackageManager is a different matter, for example).


Amitgb14 commented on June 25, 2024

Ohh, my bad about the first point.


Amitgb14 commented on June 25, 2024

Please correct me if I have this wrong: we add xmlreport.py and htmlreport.py as modules used inside sosreport.py, with no need to call an extra process.


bmr-cymru commented on June 25, 2024

> We add xmlreport.py, htmlreport.py as module don't need to call extra process, inside sosreport.py

Right - I think for now this is the best approach. It keeps to existing project conventions and it would be a big improvement in the code structure and maintainability. If at the end of all that work there are measurable performance concerns then we can look at optimisations like caching or writing data to the file system.
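
A minimal sketch of the in-process approach agreed on here, assuming the data to report is already held in memory as a mapping of plugin name to collected paths. The class names and the `sections` structure are hypothetical illustrations; they are not the actual classes in sos's report.py.

```python
import html
import json


class Report:
    """Base class: holds the in-memory data; subclasses only format it."""

    def __init__(self, sections):
        # sections: hypothetical mapping of plugin name -> list of paths
        self.sections = sections

    def render(self):
        raise NotImplementedError


class PlainTextReport(Report):
    def render(self):
        lines = []
        for name in sorted(self.sections):
            lines.append(name)
            lines.extend("  " + path for path in self.sections[name])
        return "\n".join(lines) + "\n"


class JSONReport(Report):
    def render(self):
        return json.dumps(self.sections, indent=2, sort_keys=True)


class HTMLReport(Report):
    def render(self):
        parts = ["<html><body>"]
        for name in sorted(self.sections):
            parts.append("<h2>%s</h2><ul>" % html.escape(name))
            for path in self.sections[name]:
                parts.append("<li>%s</li>" % html.escape(path))
            parts.append("</ul>")
        parts.append("</body></html>")
        return "\n".join(parts)


# Every format is produced from the same in-memory data, in one process.
sections = {"kernel": ["/proc/cmdline"], "cgroups": ["/sys/fs/cgroup/cpu"]}
outputs = {cls.__name__: cls(sections).render()
           for cls in (PlainTextReport, JSONReport, HTMLReport)}
```

Because each format is just another subclass imported as a module, adding or restyling one does not require a temporary file or a new process.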


Amitgb14 commented on June 25, 2024

ok 👍 ..


Amitgb14 commented on June 25, 2024

@bmr-cymru, I wrote a small web application to list and browse the reports:
https://github.com/Amitgb14/sosweb


Amitgb14 commented on June 25, 2024

Is there any update?


TurboTurtle commented on June 25, 2024

Cycling around on this, I just dealt with a situation where a sosreport took over 4 hours to run, with the vast majority of that time (3+ hours) spent on generating the reports. I think the reason this happened was the sheer volume of files that the sosreport created due to it being run on a heavily utilized OCP node - there were just shy of 114k files in the archive.

That is a lot, but is it really expected to take 3 hours at that volume, or is this indicative of a lower level issue? Also, what consumes the HTML and XML reports today? Would it be beneficial to dynamically set reporting to be on or off based on how large the sosreport is by the time we finish running the plugins?
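
A rough sketch of the dynamic on/off idea, assuming the collected files sit in a directory tree once the plugins have finished. The helper names and the `max_files` threshold are hypothetical; the threshold is an arbitrary illustrative value, not something taken from sos.

```python
import os
import tempfile


def count_files(root):
    """Count non-directory entries below `root`.

    Close to `find root -type f | wc -l`, except that symlinks to files
    are counted too, because os.walk lists them among the filenames.
    """
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def should_generate_reports(root, max_files=50000):
    """Skip report generation when the collection is very large."""
    # max_files is an assumed cut-off for illustration only.
    return count_files(root) <= max_files


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sys", "fs"))
    for name in ("a", "b", "c"):
        with open(os.path.join(root, "sys", "fs", name), "w") as handle:
            handle.write("x")
    file_count = count_files(root)
    small_limit = should_generate_reports(root, max_files=2)
    large_limit = should_generate_reports(root, max_files=10)
```

With the assumed default threshold, an archive of roughly 114k files would skip report generation, while a typical smaller collection would still get reports.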


bmr-cymru commented on June 25, 2024

> there were just shy of 114k files in the archive.

Do we know why there was such a volume? I.e. is this sane, either in terms of the node configuration, or what we are attempting to collect?


TurboTurtle commented on June 25, 2024

It was a fairly heavily used OCP node: 150 running containers, another 130 stopped, and a total of 1100 images on it. All of the docker plugin collection contributes to that, but more importantly the cgroups plugin grabs the /sys/fs/cgroup/* bits for the kubernetes pods, which is where the bulk of this came from:

    $ find sys/fs/cgroup/ -type f | wc -l
    88516


TurboTurtle commented on June 25, 2024

Sorry, that didn't actually answer your question. The volume would be sane for the size of the OpenShift environment it was on, but that is probably in the upper-end of such environments. So I imagine there are other end users running into similarly long run times and just "dealing with it" at the moment.


bmr-cymru commented on June 25, 2024

My biggest problem with reports is that it kinda feels like they should be post-processable. We should be able to take an archive and comprehend it to produce that output, entirely independently of the collection host (it's just pretty printing, effectively).

That way we could turn it off by default and let users do something like:

    $ sos report --html --from sosreport-blah-blah.tar.gz

(or whatever)
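
A minimal sketch of that post-processing idea, using only Python's standard `tarfile` module: it reads an existing archive and produces an HTML file listing without touching the collection host. The `html_index_from_archive` helper is hypothetical, and the `--from` option above is a suggestion in this thread rather than an existing sos flag.

```python
import html
import io
import tarfile


def html_index_from_archive(fileobj):
    """Return a simple HTML listing of the regular files in a tar archive."""
    # "r:*" lets tarfile detect the compression (gz, bz2, xz) by itself.
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        names = sorted(m.name for m in archive.getmembers() if m.isfile())
    items = "\n".join("<li>%s</li>" % html.escape(name) for name in names)
    return "<html><body><ul>\n%s\n</ul></body></html>" % items


# Build a tiny in-memory archive to stand in for a real sosreport tarball.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
    for name, data in (("sos/etc/hostname", b"node1\n"),
                       ("sos/proc/cmdline", b"ro\n")):
        info = tarfile.TarInfo(name)
        info.size = len(data)
        archive.addfile(info, io.BytesIO(data))
buffer.seek(0)
page = html_index_from_archive(buffer)
```

For a real archive, the same function could be given a file opened in binary mode, so the report can be generated later on any machine that has the tarball.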


TurboTurtle commented on June 25, 2024

Since 2018, we've overhauled the actual reports generation mechanisms. A previous informal survey on the RH side also showed that while HTML reports are not ubiquitously used they are consumed to some degree. Given those two points, I wonder if this can be closed?

Or is the post-processing suggestion above still desirable?

@bmr-cymru @pmoravec


pmoravec commented on June 25, 2024

+1 to close this. The HTML report generation was re-written in #1728, no issues since then.

