Comments (8)
Hi Einar,
LK tipped me off to this discussion. IMO it's useful to have the performance information stored in some persistified format (either a performance ntuple that gets written out at the end of a job and can be analyzed, or another collection in the event record). However, I also think it's extremely useful to have both a summary and the option for detailed output in the job log file; that way someone debugging or running test jobs has easy access to the information. My main experience with this comes as a user of some ATLAS tools. Here is some relatively old documentation about them: Slides, proceedings. While trying to find that reference, I found this article, which has a lot of references (although its focus is slightly different). Anyways, I hope this is helpful input!
from framework.
Here's an idea:
We do include the performance information within the event file if enabled, but keep it in a separate TTree. This avoids copying performance information around when re-processing files, and it gives the performance data in an output file a simple interpretation: it reflects the config that was used to generate that specific file. It also means we can define a new "meta-schema" for this performance TTree that is not restricted by the meta-schema already defined for the Events TTree. (This has the added benefit of de-coupling the schema evolution of the performance data from the schema evolution of the event data.)
Performance TTree Meta-Schema
I'm imagining a pretty simple meta-schema where each processor in the sequence has a branch named after it where we can store the performance information. Maybe we define a new ROOT-serializable class or struct to store the information we find interesting (like time stamps, run time, and somehow memory), and then each branch holds one such object per processor.
We could then include other branches for event-by-event data that is not processor-specific (like total event processing time across all processors).
I'm unsure if restricting this TTree to one entry per event is too restrictive. I know there is some performance data that is not event-by-event (e.g. total run time including init and de-init), but I think we could probably just have another object (perhaps another TTree) for storing that information. Then we would probably want to put this performance data into a subdirectory of the ROOT file to distinguish it from the event and run trees holding actual data.
from framework.
A possible option would be to just run all the measurements in the process, make them accessible from a processor, and then make a dedicated producer that handles writing the corresponding collection to the event if it is used. Otherwise, the measurements would just be discarded. That could potentially also let you do more exotic things if you wanted to.
from framework.
This sounds really good. The one thing I would want is to make sure a processor can register additional measurements of its own (thinking of the simulator here, but probably useful elsewhere too). Would it still be possible to write analyzers that read the second tree?
I think just raw runtime is a good place to start; it's (relatively) straightforward to do and lets us try out some basic things.
from framework.
We could add another processor callback (e.g. logPerformance) that is only called when performance tracking is requested. This callback could have an event-bus-like interface to the performance tree.
from framework.
Slight modification to my idea as well: the main location for instrumenting the performance is within the Process::run function. This function does not have easy handles to the output event file, especially since it accommodates the possibility of there being multiple output event files. For this reason, I think the performance data should be written to a specific directory in the histogram file p.histogramFile, which is always a single file for any single run of fire and has direct handles within Process.
I also think this is somewhat more natural since the histogram file has always held "extra" information derived from the event data, and performance data is in some sense "extra" as well.
With this in mind, I think a good idea is to have a specific class that isolates the performance-tracking logic so that Process::run doesn't get any more cluttered (it already is pretty cluttered). Process would then simply create a PerformanceTracker if configured to do so, which has callbacks for specific points in the Process::run logic. I outline the PerformanceTracker API below since I don't want to take the time to make a compiling/running solution right now.
class PerformanceTracker {
  // handle to the destination for the data
  TDirectory *storage_directory_;
  // TTree for event-by-event perf info
  TTree *event_data_;
  // some mechanism for buffering timestamps and other "in-process" measurements
  // ROOT-serializable object for the other (non-event) info
  SomeObject run_data_;
 public:
  // create it with the destination,
  // e.g. with Process::makeHistoDirectory("performance")
  PerformanceTracker(TDirectory *storage_directory);
  // destructor needs to make sure the trees/objects are written
  // so that Process can just delete it when closing
  ~PerformanceTracker();
  /* begin list of callbacks for various points in Process::run */
  void absolute_start();  // literally first line of Process::run
  void absolute_end();    // literally last line of Process::run (only called when run completes without errors)
  void begin_onProcessStart();  // before the onProcessStart section
  void end_onProcessStart();    // after the onProcessStart section
  void begin_onProcessStart(const std::string& processor);  // before a processor-specific onProcessStart
  void end_onProcessStart(const std::string& processor);    // after a processor-specific onProcessStart
  // similar callbacks for the other EventProcessor callbacks
};
This is a really messy solution, but I can't think of another way to make it clear what is happening. We could do some preprocessor-macro and/or lambda-function nonsense to reduce the amount of code in PerformanceTracker, but I fear that would simply make Process::run harder to understand, which I want to avoid.
from framework.
I've seen messier things to deal with, so I'm not sure the preprocessor/lambda stuff is needed. The Process::run function is long, yes (at least according to me), but it is relatively straightforward to read, so I don't think your proposal here would make things much worse. If we are worried about the length of Process::run, factoring some distinct functions out of it would probably deal with most of that.
from framework.
https://stackoverflow.com/a/64166
Looks like we can use some C boilerplate to get memory, CPU usage, and time at any given point. I will want to test how long these measurements actually take, and whether their order matters at all, before committing to all of them.
from framework.