Comments (8)

latompkins commented on August 19, 2024

Hi Einar,

LK tipped me off to this discussion. IMO it's useful to have the performance information stored in some persistified format (either a performance ntuple which gets written out at the end of a job and can be analyzed, or another collection in the event record). However, I also think it's extremely useful to have both a summary and the option for detailed output in the job log file. This way someone debugging or running test jobs has easy access to the information. My main experience with this comes as a user of some ATLAS tools; here is some relatively old documentation about them: slides, proceedings. In trying to find that reference, I found this article, which has a lot of references (although it has a slightly different focus). Anyway, I hope this is helpful input!

tomeichlersmith commented on August 19, 2024

Here's an idea:

We do include the performance information within the event file if enabled, but it is kept in a separate TTree. This allows us to avoid copying performance information around when re-processing files, and it gives the performance information in the output file a simple interpretation: it reflects the config that was used to generate that specific file. It also means we can define a new "meta-schema" for this performance TTree that is not restricted by the "meta-schema" already defined for the Events TTree. (This has the added benefit of de-coupling the schema evolution of the performance data from the schema evolution of the event data.)

Performance TTree Meta-Schema

I'm imagining a pretty simple Meta-Schema where each processor in the sequence has a branch named after it where we can store the performance information. Maybe we define a new ROOT-serializable class or struct to store the information we find interesting (like time stamps, run time, and, somehow, memory usage), and then each branch holds one of those objects per processor.

We could then include other branches for data that is event-by-event but not processor-specific (like the total event processing time across all processors).

I'm unsure whether making this TTree one-entry-per-event is too restrictive. I know that there is some performance data that is not event-by-event (e.g. total run time including init and de-init), but I think we could probably just have another object (perhaps a TTree) for storing that information. Then we would probably want to put this performance data into a subdirectory of the ROOT file to distinguish it from the event and run trees holding the actual data.
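For concreteness, here is a minimal sketch of that layout in plain ROOT. PerformanceMeasurement and setup_performance_tree are hypothetical names, and a leaf-list branch is used so the struct does not need a ROOT dictionary:

#include <cstddef>
#include <string>
#include <vector>

#include "RtypesCore.h"
#include "TTree.h"

// plain-old-data so it can be described by a leaf list
struct PerformanceMeasurement {
  Long64_t start_time;  // e.g. ns since epoch at processor start
  Long64_t end_time;    // ns since epoch at processor end
  Double_t run_time;    // wall-clock seconds spent in the processor
  Long64_t memory_kb;   // resident memory after the processor, if available
};

// one branch per processor in the sequence, plus event-level branches;
// `buffer` must outlive the tree and must not be resized after this call
void setup_performance_tree(TTree* tree,
                            const std::vector<std::string>& processor_names,
                            std::vector<PerformanceMeasurement>& buffer,
                            Double_t& total_event_time) {
  buffer.resize(processor_names.size());
  for (std::size_t i = 0; i < processor_names.size(); ++i) {
    tree->Branch(processor_names[i].c_str(), &buffer[i],
                 "start_time/L:end_time/L:run_time/D:memory_kb/L");
  }
  // event-by-event but not processor-specific
  tree->Branch("total_event_time", &total_event_time, "total_event_time/D");
}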

EinarElen commented on August 19, 2024

A possible option would be to just run all the measurements in the process, make them accessible from a processor, and then make a dedicated producer that handles writing the corresponding collection to the event if it is used. Otherwise, the measurements would just be discarded. That could potentially also let you do more exotic things if you wanted to.
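For illustration, a hedged sketch of that dedicated producer. It assumes the framework's usual Producer base class with a produce(Event&) callback and an event.add() interface; PerformanceProducer and getMeasurements() are hypothetical names:

#include "Framework/EventProcessor.h"

class PerformanceProducer : public framework::Producer {
 public:
  using framework::Producer::Producer;
  void produce(framework::Event& event) override {
    // the process takes the measurements either way; they are only
    // persisted when this producer is scheduled in the sequence
    auto measurements = getMeasurements();  // hypothetical accessor
    event.add("PerformanceMeasurements", measurements);
  }
};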

EinarElen commented on August 19, 2024

This sounds really good. The one thing I would want is to make sure that it is possible for a processor to register additional measurements to make (thinking of the simulator here, but probably useful elsewhere too). Would it still be possible to make analyzers that would read the second tree?

I think just raw runtime is a good place to start; it's (relatively) straightforward to do and lets us try out some basic things.
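Reading the second tree should not require anything beyond plain ROOT. A hedged sketch, assuming the layout discussed above (a "performance" directory holding an event-level tree named "by_event"; both names hypothetical):

#include "TFile.h"
#include "TTree.h"

void inspect_performance(const char* file_name) {
  TFile* f = TFile::Open(file_name);
  if (!f || f->IsZombie()) return;
  // Get<T> follows subdirectory paths like "performance/by_event"
  auto* tree = f->Get<TTree>("performance/by_event");
  if (tree) {
    // e.g. dump one (hypothetical) processor's per-event run time
    tree->Scan("mySimulator.run_time");
  }
  f->Close();
}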

tomeichlersmith commented on August 19, 2024

We could add another processor callback (e.g. logPerformance) that is only called when performance is requested. This could have an event-bus-like interface to the performance tree.
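A minimal sketch of what that could look like on the processor base class; logPerformance and PerformanceEvent are hypothetical names, not existing framework API:

class PerformanceEvent;  // event-bus-like handle to the performance tree

class EventProcessor {
 public:
  // ... existing callbacks (produce/analyze, onProcessStart, etc.) ...

  // only invoked by Process when performance tracking is enabled;
  // the default implementation does nothing so processors can opt in
  virtual void logPerformance(PerformanceEvent& perf) {}
};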

tomeichlersmith commented on August 19, 2024

A slight modification to my idea as well: the main location for instrumenting the performance is within the Process::run function. This does not have easy handles to the output event file, especially since it accommodates the possibility of there being multiple output event files. For this reason, I think the performance data should be written to a specific directory in the histogram file p.histogramFile, which is always a single file for any single run of fire and has direct handles within Process.

I also think this is somewhat more natural since the histogram file has always been "extra" information that is derived from the event data. Performance data is in some sense "extra" as well.

With this in mind, I think a good idea is to have a specific class that isolates the performance tracking logic so that Process::run doesn't get more cluttered (since it already is pretty cluttered). Then Process would simply create a PerformanceTracker if configured to do so, which then has callbacks for specific points in the Process::run logic. I outline the PerformanceTracker API below since I don't want to take the time to make a compiling/running solution right now.

#include <string>

#include "TDirectory.h"
#include "TTree.h"

class PerformanceTracker {
  // has some handle to the destination for the data
  TDirectory *storage_directory_;
  // has a TTree for event-by-event perf info
  TTree *event_data_;
  // some mechanism for buffering timestamps and other "in-process"
  // measurements goes here, along with some ROOT-serializable object
  // for run-level info (SomeObject is a placeholder)
  SomeObject run_data_;
 public:
  // create it with the destination
  // e.g. with Process::makeHistoDirectory("performance")
  PerformanceTracker(TDirectory *storage_directory);
  // destructor needs to make sure that the trees/objects are written
  // so that Process can just delete it when closing
  ~PerformanceTracker();
  /* begin list of callbacks for various points in Process::run */
  void absolute_start();  // literally first line of Process::run
  void absolute_end();    // literally last line of Process::run
                          // (only called when run completes without errors)
  void begin_onProcessStart();  // before onProcessStart section
  void end_onProcessStart();    // after onProcessStart section
  // before/after a specific processor's onProcessStart
  void begin_onProcessStart(const std::string& processor);
  void end_onProcessStart(const std::string& processor);
  // similar callbacks for the different EventProcessor callbacks
};

This is a really messy solution, but I can't think of another way to make it clear what is happening. We could do some preprocessor-macro and/or lambda-function nonsense to reduce the amount of code in PerformanceTracker, but I fear that would simply make Process::run harder to understand, which I want to avoid.
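For concreteness, a hypothetical sketch of how Process::run could drive these callbacks. The surrounding Process class, event loop, and error handling are elided, and names (performance_, sequence_, getName) are illustrative:

void Process::run() {
  // performance_ is assumed to be a pointer that is only
  // non-null when performance tracking is configured
  if (performance_) performance_->absolute_start();

  if (performance_) performance_->begin_onProcessStart();
  for (auto& proc : sequence_) {
    if (performance_) performance_->begin_onProcessStart(proc->getName());
    proc->onProcessStart();
    if (performance_) performance_->end_onProcessStart(proc->getName());
  }
  if (performance_) performance_->end_onProcessStart();

  // ... event loop with the analogous begin/end calls around each
  // processor's produce/analyze ...

  if (performance_) performance_->absolute_end();
}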

EinarElen commented on August 19, 2024

I've seen messier things to deal with, so I'm not sure the preprocessor/lambda stuff is needed. The Process::run function is (at least according to me) long, yes, but it is relatively straightforward to read, so I don't think your proposal here would make things much worse. If we are worried about the length of Process::run, I think factoring out some distinct functions from it would probably deal with most of it.

tomeichlersmith commented on August 19, 2024

https://stackoverflow.com/a/64166

Looks like we can use some C boilerplate to get memory, CPU usage, and time at any given point. I will want to test how long it takes to actually do these measurements and see if the order matters at all before committing to all of them.
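For example, on POSIX systems getrusage returns CPU time and peak memory in a single call; a minimal sketch of one such approach (an assumption on my part; it may differ from the exact boilerplate in the linked answer):

#include <sys/resource.h>
#include <sys/time.h>

#include <cstdio>

int main() {
  rusage usage{};
  if (getrusage(RUSAGE_SELF, &usage) != 0) return 1;
  // CPU time spent in user space and in the kernel so far
  double user_s = usage.ru_utime.tv_sec + usage.ru_utime.tv_usec / 1e6;
  double sys_s = usage.ru_stime.tv_sec + usage.ru_stime.tv_usec / 1e6;
  // peak resident set size: kilobytes on Linux, bytes on macOS
  long max_rss = usage.ru_maxrss;
  std::printf("user %.3fs sys %.3fs max rss %ld\n", user_s, sys_s, max_rss);
  return 0;
}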
