C Interface
#include <papi.h>
int PAPI_profil(void * buf, unsigned bufsiz, unsigned long offset,
unsigned scale, int EventSet, int EventCode, int threshold,
int flags);
Fortran Interface
The profiling routines have no Fortran interface.
buf -- pointer to a buffer of bufsiz bytes in which the histogram counts are stored as an array of
unsigned short, unsigned int, or unsigned long long values, called buckets. The size of the buckets
is selected by bits in the flags argument.
bufsiz -- the size of the histogram buffer in bytes. It is computed from the length of the code region to be
profiled, the size of the buckets, and the scale factor as discussed below.
offset -- the start address of the region to be profiled.
scale -- broadly and historically speaking, a contraction factor that indicates how much
smaller the histogram buffer is than the region to be profiled. More
precisely, scale is interpreted as an unsigned 16-bit fixed-point
fraction with the binary point implied on the left. Its value is the
reciprocal of the number of addresses in the profiled region that map to
each counter of the histogram buffer. Below is a table of representative values for scale:
Representative values for the scale variable
HEX | DECIMAL | DEFINITION |
0x20000 | 131072 | Maps precisely one instruction address to a unique bucket in buf. |
0x10000 | 65536 | Maps precisely two instruction addresses to a unique bucket in buf. |
0xFFFF | 65535 | Maps approximately two instruction addresses to a unique bucket in buf. |
0x8000 | 32768 | Maps every four instruction addresses to a bucket in buf. |
0x4000 | 16384 | Maps every eight instruction addresses to a bucket in buf. |
0x0002 | 2 | Maps all instruction addresses to the same bucket in buf. |
0x0001 | 1 | Undefined. |
0x0000 | 0 | Undefined. |
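The table above implies that each bucket covers 131072/scale addresses. A minimal sketch of that mapping follows. The helper name profil_bucket_index is hypothetical and is not part of PAPI; the arithmetic is derived from the table and is not taken from the library's source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a PAPI call): index of the bucket that an
 * address pc falls into, assuming 131072/scale addresses per bucket
 * as listed in the scale table. */
static size_t profil_bucket_index(unsigned long pc, unsigned long offset,
                                  unsigned scale)
{
    assert(pc >= offset);                   /* pc must lie in the region */
    assert(scale >= 2 && scale <= 131072);  /* 0 and 1 are undefined */

    /* Multiply in 64 bits first, then divide by 131072 (2^17). */
    return (size_t)(((unsigned long long)(pc - offset) * scale) >> 17);
}
```

With scale set to 65536, for example, the addresses offset+4 and offset+5 both land in bucket 2, which matches the "two addresses per bucket" row.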
Historically, the scale factor was introduced to allow the allocation of buffers smaller than
the code size to be profiled. Data and instruction sizes were assumed to be multiples of 16 bits.
These assumptions are no longer necessarily true.
PAPI_profil has preserved the traditional definition of scale where appropriate,
but has deprecated the definitions for 0 and 1 (which disabled scaling) and extended
the range of scale to include 65536 and 131072, which allow for exactly two
addresses and exactly one address per profiling bucket, respectively.
The value of bufsiz is computed as follows:
bufsiz = (end - start)*(bucket_size/2)*(scale/65536) where
bufsiz - the size of the buffer in bytes
end, start - the ending and starting addresses of the profiled region
bucket_size - the size of each bucket in bytes; 2, 4, or 8 as defined in
flags
scale - as defined above
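Evaluated literally in C integer arithmetic, the term scale/65536 truncates to 0 for any scale below 65536. The sketch below multiplies first and divides last to avoid that. The helper name profil_bufsiz is hypothetical (not a PAPI call), and rounding up to a whole number of buckets is an assumption added here, not something the formula states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a PAPI call): buffer size in bytes for
 * bufsiz = (end - start)*(bucket_size/2)*(scale/65536),
 * computed without intermediate truncation. */
static size_t profil_bufsiz(unsigned long start, unsigned long end,
                            unsigned bucket_size, unsigned scale)
{
    assert(end >= start);
    assert(bucket_size == 2 || bucket_size == 4 || bucket_size == 8);
    assert(scale >= 2 && scale <= 131072);

    unsigned long long len = (unsigned long long)(end - start);

    /* Number of buckets = len * scale / 131072, rounded up so that a
     * partial trailing bucket still gets storage (an assumption). */
    unsigned long long buckets = (len * scale + 131071ULL) / 131072ULL;

    return (size_t)(buckets * bucket_size);
}
```

For a 4096-byte region with 16-bit buckets and scale 65536 this yields 4096 bytes, the same value the formula gives.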
EventSet -- The PAPI EventSet to profile. This EventSet is marked as profiling-ready, but profiling
doesn't actually start until a
PAPI_start() call is issued.
EventCode -- Code of the Event in the EventSet to profile. This event must already be a member of the EventSet.
threshold -- minimum number of events that must occur before the PC is sampled. If hardware overflow
is supported for your substrate, this threshold will trigger an interrupt when reached.
Otherwise, the counters will be sampled periodically and the PC will be recorded for the
first sample that exceeds the threshold. If the value of threshold is 0, profiling will be
disabled for this event.
flags -- bit pattern to control profiling behavior. Defined values are shown in the table below:
Defined bits for the flags variable
PAPI_PROFIL_POSIX | Default type of profiling, similar to |
PAPI_PROFIL_RANDOM | Drop a random 25% of the samples. |
PAPI_PROFIL_WEIGHTED | Weight the samples by their value. |
PAPI_PROFIL_COMPRESS | Ignore samples as values in the hash buckets get big. |
PAPI_PROFIL_BUCKET_16 | Use unsigned short (16 bit) buckets. This is the default bucket. |
PAPI_PROFIL_BUCKET_32 | Use unsigned int (32 bit) buckets. |
PAPI_PROFIL_BUCKET_64 | Use unsigned long long (64 bit) buckets. |
PAPI_PROFIL_FORCE_SW | Force software overflow in profiling. |
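#include <stdlib.h>   /* malloc */
#include <string.h>   /* memset */
#include <papi.h>

int retval;
unsigned long start, length;
const PAPI_exe_info_t *prginfo;
unsigned short *profbuf;

/* Look up the text segment of the running executable. */
if ((prginfo = PAPI_get_executable_info()) == NULL)
    handle_error(1);
start = (unsigned long)prginfo->text_start;
length = (unsigned long)(prginfo->text_end - prginfo->text_start);

/* With scale 65536 and 16-bit buckets, bufsiz equals the region length. */
profbuf = (unsigned short *)malloc(length);
if (profbuf == NULL)
    handle_error(1);
memset(profbuf, 0x00, length);
.
.
.
/* EventSet must already contain PAPI_FP_INS; sampling begins at PAPI_start(). */
if ((retval = PAPI_profil(profbuf, length, start, 65536, EventSet,
        PAPI_FP_INS, 1000000, PAPI_PROFIL_POSIX | PAPI_PROFIL_BUCKET_16)) != PAPI_OK)
    handle_error(retval);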
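Once the profiled run has finished, the histogram in the buffer can be read back. The following is a minimal sketch of one way to do that for 16-bit buckets, assuming 131072/scale addresses per bucket as in the scale table. The helper name profil_dump16 is hypothetical and is not part of PAPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a PAPI call): print every non-empty 16-bit
 * bucket as "first_address<TAB>count" and return how many were printed. */
static size_t profil_dump16(const unsigned short *buf, size_t bufsiz,
                            unsigned long offset, unsigned scale, FILE *out)
{
    assert(buf != NULL && out != NULL);
    assert(scale >= 2 && scale <= 131072);

    size_t nbuckets = bufsiz / sizeof buf[0];
    size_t shown = 0;

    for (size_t i = 0; i < nbuckets; i++) {
        if (buf[i] == 0)
            continue;
        /* Lowest address mapped to bucket i: offset + ceil(i*131072/scale). */
        unsigned long long first = offset +
            ((unsigned long long)i * 131072ULL + scale - 1) / scale;
        fprintf(out, "0x%llx\t%u\n", first, (unsigned)buf[i]);
        shown++;
    }
    return shown;
}
```

With the values from the example above, a call could look like profil_dump16(profbuf, length, start, 65536, stdout). Mapping the printed addresses back to functions or source lines is left to external tools.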