This file contains information about using GPTL. For information on building
and installing GPTL, see the file INSTALL.

GPTL is the "General Purpose Timing Library". It can be used to manually
instrument application codes with an arbitrary set of "regions" (or "timers")
over which statistics such as wallclock time and CPU time are gathered and
subsequently printed. If the target application is built with the GNU
compilers (gcc or gfortran), Pathscale (pathcc or pathf95), or PGI compilers,
GPTL can also be used to automatically instrument regions which are defined
by function entry and exit points. This is an easy way to generate a dynamic
call tree. See Auto-Instrumentation below for a description of how to use
this feature.

Similar to compiler-generated auto-instrumentation, GPTL can intercept and
auto-profile MPI calls made by the application if the target MPI library
supports the PMPI profiling layer. In this case an estimate of bytes
transferred by each MPI call is presented in the printed output.

If the PAPI library is installed (http://icl.cs.utk.edu/papi), GPTL
also provides a convenient mechanism to access all available PAPI events. In
addition to PAPI preset and native events, GPTL defines derived events which
are based on PAPI counters. See gptl.h for a list of available derived events.
These events can be enabled only if the PAPI counters they require are
available on the target architecture.

Installing GPTL
---------------

To see all the available configure options:

./configure --help

To build and install GPTL:

./configure --prefix=/usr/local
make check
make install

Set the environment variables CPPFLAGS and LDFLAGS to name any additional
directories that configure should search for include files or library files.

Using GPTL 
----------

C codes making GPTL library calls should #include <gptl.h>. Fortran codes can
"use gptl", or pull in gptl.inc with either a preprocessor #include or a
Fortran include 'gptl.inc' statement. The C and Fortran interfaces are
identical, except that the C interface uses mixed case. All
user-accessible functions return either 0 (success) or -1 (failure). Example
codes that use the library can be found in subdirectories ctests/ and
ftests/.
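
Because every call reports success or failure through its return value, the
checks can be funneled through one small macro. The following is only a
sketch: CHECK_RET, check_failures, fake_ok and fake_fail are hypothetical
names that are not part of GPTL, and the two stand-in functions merely mimic
the 0 / -1 convention so that the example compiles without the library.

```c
#include <stdio.h>

/* Number of failing calls seen so far (hypothetical bookkeeping, not GPTL). */
static int check_failures = 0;

/* Evaluate an int-returning call exactly once and report a non-zero result.
   With the real library, `call` would be something like GPTLstart ("total"). */
#define CHECK_RET(call)                                   \
  do {                                                    \
    int ret_ = (call);                                    \
    if (ret_ != 0) {                                      \
      fprintf (stderr, "%s:%d: %s returned %d\n",         \
               __FILE__, __LINE__, #call, ret_);          \
      ++check_failures;                                   \
    }                                                     \
  } while (0)

/* Stand-ins that mimic the 0 (success) / -1 (failure) convention. */
static int fake_ok (void)   { return 0; }
static int fake_fail (void) { return -1; }
```

Writing CHECK_RET (fake_fail ()) prints one diagnostic line to stderr and
increments check_failures; CHECK_RET (fake_ok ()) does neither.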

Code instrumentation to utilize GPTL involves zero or more calls to
GPTLsetoption(), then a single call to GPTLinitialize(), then an arbitrary
sequence of calls to GPTLstart() and GPTLstop(), and finally a call to
GPTLpr() or GPTLpr_file(). See "Example" below for a sample calling
sequence. Calls to GPTLstart() and GPTLstop() are thread-safe, with per-thread
statistics printed by GPTLpr() or GPTLpr_file().

The purpose of GPTLsetoption() is to enable or disable various library
options. For example, to enable the PAPI counter for total cycles, do this:

ret = GPTLsetoption (PAPI_TOT_CYC, 1);

The "1" says "enable". Use "0" for "disable". See the man pages for complete
documentation on function usage and arguments. The list of available GPTL
options is contained in gptl.h, and a complete list of available PAPI-based
events can be found by running "ctests/avail".

GPTLinitialize() initializes the GPTL library.

There can be an arbitrary number of start/stop pairs before GPTLpr() or
GPTLpr_file() is called to print the results, and regions can be nested to an
arbitrary depth. The printed results are indented to indicate the level of
nesting for each region.
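
To make the link between nesting and indentation concrete, here is a toy
model that keeps a stack of open region names and indents each name by its
depth. It is not GPTL's implementation: toy_start, toy_stop, toy_depth and
MAX_DEPTH are hypothetical, no timing is done, and the requirement that
regions be closed innermost-first is a simplification of this sketch rather
than a statement about GPTL.

```c
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 64

/* Stack of currently open region names (toy model only). */
static const char *open_regions[MAX_DEPTH];
static int toy_depth = 0;

/* Open a region: print its name indented by the current nesting depth. */
static int toy_start (const char *name)
{
  if (toy_depth >= MAX_DEPTH)
    return -1;
  printf ("%*s%s\n", 2 * toy_depth, "", name);
  open_regions[toy_depth++] = name;
  return 0;
}

/* Close a region: in this toy it must be the innermost open one. */
static int toy_stop (const char *name)
{
  if (toy_depth == 0 || strcmp (open_regions[toy_depth - 1], name) != 0)
    return -1;
  --toy_depth;
  return 0;
}
```

Calling toy_start ("total") and then toy_start ("do_work") prints "total" on
one line and "do_work" indented by two spaces on the next.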

GPTLpr() prints the results to a file named timing.<number>, where the single
argument <number> is an integer. For MPI jobs, it is most convenient to use
the MPI rank of the invoking task for <number>. Alternatively, function
GPTLpr_file() can be called. Its input argument is a character string
giving the name of the output file to be written. It is up to the user to
ensure that these print functions write to uniquely-named files, so that one
task or run does not overwrite the output of another.
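
One way to keep file names unique is to build them from a run-specific prefix
plus the task number, then hand the result to GPTLpr_file(). The helper below
is a sketch under that assumption; make_timing_name and its arguments are
hypothetical and not part of GPTL.

```c
#include <stdio.h>

/* Build "<prefix>.<rank>" in buf.
   Returns 0 on success, -1 if the name does not fit in len bytes. */
static int make_timing_name (char *buf, size_t len, const char *prefix, int rank)
{
  int n = snprintf (buf, len, "%s.%d", prefix, rank);

  return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

With prefix "timing" and rank 3 the buffer holds "timing.3", the same name
GPTLpr (3) would use; a different prefix per run keeps successive runs from
overwriting each other's output.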

GPTLfinalize() can be called to clean up the GPTL environment.  All space
malloc'ed by the GPTL library will be freed by this call.


Example 
-------

From "man GPTLstart", a simple example calling sequence to time a couple of
code regions and print the results is:

(void) GPTLsetoption (GPTLcpu, 1);      /* enable cpu timings */
(void) GPTLsetoption (GPTLwall, 0);     /* disable wallclock timings */
(void) GPTLsetoption (PAPI_TOT_CYC, 1); /* enable counting of total cycles */
...
(void) GPTLinitialize();                /* initialize the GPTL library */
(void) GPTLstart ("total");             /* start a timer */
...
(void) GPTLstart ("do_work");           /* start another timer */

do_work();                              /* do some work */

(void) GPTLstop ("do_work");            /* stop a timer */
(void) GPTLstop ("total");              /* stop a timer */
...
(void) GPTLpr (mympitaskid);            /* print the results to timing.<mympitaskid> */


Auto-instrumentation 
--------------------

If the regions to be timed are defined by function entry and exit points, and
the application to be profiled is built with the GNU, Pathscale, or PGI
compilers, you might find it convenient to use the auto-instrumentation
feature of GPTL. Here's how:

1) Add the flag -finstrument-functions (-Minstrument:functions under PGI)
when compiling the routines you'd like to profile.

2) Add calls to GPTLsetoption() (if desired), and GPTLinitialize() to the main
program before any other routines are invoked.

3) Add a call to GPTLpr() or GPTLpr_file() wherever appropriate prior to where
the code terminates.

4) Link with -lgptl (and -lpapi if PAPI is enabled).

5) Run the code.

6) Run "hex2name.pl <a.out> <timing.0> | less", where
<a.out> is the name of the executable, and <timing.0> is the name of the
timing file to be converted.

The result should be a dynamic call tree with timings and (if enabled) PAPI
counts and derived event statistics for each region, where regions are defined
by function entry and exit points.

Here's what's happening under the covers:

The -finstrument-functions flag tells the compiler to insert calls to
__cyg_profile_func_enter (void *this_fn, void *call_site) at function start,
and __cyg_profile_func_exit (void *this_fn, void *call_site) at function
exit. GPTL defines these functions as calls to (effectively) GPTLstart() and
GPTLstop(), where the address of the function is used as the input sentinel to
these routines.
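
The hook mechanism can be tried without GPTL. The sketch below defines the two
hooks so that they only track call depth and print the function address; GPTL
would instead start and stop a timer keyed by that address. hook_depth is a
hypothetical variable, and the no_instrument_function attribute is a GCC
extension assumed to be available.

```c
#include <stdio.h>

/* Current call depth as seen by the hooks (illustrative state only). */
static int hook_depth = 0;

/* The hooks themselves must not be instrumented, or they would recurse. */
void __cyg_profile_func_enter (void *this_fn, void *call_site)
  __attribute__ ((no_instrument_function));
void __cyg_profile_func_exit (void *this_fn, void *call_site)
  __attribute__ ((no_instrument_function));

/* Called on function entry: a timing library would start a timer here. */
void __cyg_profile_func_enter (void *this_fn, void *call_site)
{
  (void) call_site;
  fprintf (stderr, "%*senter %p\n", 2 * hook_depth, "", this_fn);
  ++hook_depth;
}

/* Called on function exit: a timing library would stop the timer here. */
void __cyg_profile_func_exit (void *this_fn, void *call_site)
{
  (void) call_site;
  if (hook_depth > 0)
    --hook_depth;
  fprintf (stderr, "%*sexit  %p\n", 2 * hook_depth, "", this_fn);
}
```

When the rest of the program is compiled with -finstrument-functions and
linked against this file, each instrumented function prints an enter and an
exit line carrying its address. These are the raw addresses that a
name-lookup step then has to translate.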

Running hex2name.pl converts the function addresses back to human-readable
function names. It uses the UNIX "nm" utility to do this.

When using MPI auto-profiling, steps 2) and 3) above can be omitted. In this
case GPTL auto-generates calls to GPTLinitialize() and GPTLpr() from MPI_Init
and MPI_Finalize, respectively.

Multi-processor instrumented codes 
----------------------------------

With rev. 4.3 of GPTL, function GPTLpr_summary(mpi_communicator) was
rewritten from scratch for scalability and the presentation of additional
statistical information.  Max, min, mean, and standard deviation of region
timings, along with the process and thread index responsible for max and min,
are presented in a single output file named timing.summary. With this
rewrite, this is now the preferred method (over parsegptlout.pl) for
gathering summary statistics across threads and tasks.  See example3 in the
web documentation for further information.
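
As a rough illustration of the statistics involved, the sketch below reduces
an array of per-task timings for one region to max, min, mean, and variance,
and records which task produced the max and the min. It is not the
GPTLpr_summary() algorithm: struct region_summary and summarize are
hypothetical, the timings are assumed to be already gathered into one array,
and population variance is an assumption (the standard deviation is its
square root).

```c
#include <stddef.h>

/* Summary of one region's timings across tasks (hypothetical layout). */
struct region_summary {
  double max, min, mean, variance;
  size_t imax, imin;            /* indices of the tasks holding max and min */
};

/* Reduce n per-task timings into a summary; returns -1 if n is zero. */
static int summarize (const double *t, size_t n, struct region_summary *s)
{
  size_t i;
  double sum = 0.0, sq = 0.0;

  if (n == 0)
    return -1;

  s->max = s->min = t[0];
  s->imax = s->imin = 0;
  for (i = 0; i < n; ++i) {
    sum += t[i];
    if (t[i] > s->max) { s->max = t[i]; s->imax = i; }
    if (t[i] < s->min) { s->min = t[i]; s->imin = i; }
  }
  s->mean = sum / (double) n;

  /* Second pass: mean of squared deviations (population variance). */
  for (i = 0; i < n; ++i)
    sq += (t[i] - s->mean) * (t[i] - s->mean);
  s->variance = sq / (double) n;
  return 0;
}
```

For the timings 1.0, 3.0, 2.0, 2.0 this gives max 3.0 from task 1, min 1.0
from task 0, mean 2.0, and variance 0.5.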
