#include <perfmon/pfmlib.h>
#include <perfmon/pfmlib_itanium.h>

int pfm_ita_is_ear(unsigned int i);
int pfm_ita_is_dear(unsigned int i);
int pfm_ita_is_dear_tlb(unsigned int i);
int pfm_ita_is_dear_cache(unsigned int i);
int pfm_ita_is_iear(unsigned int i);
int pfm_ita_is_iear_tlb(unsigned int i);
int pfm_ita_is_iear_cache(unsigned int i);
int pfm_ita_is_btb(unsigned int i);
int pfm_ita_support_opcm(unsigned int i);
int pfm_ita_support_iarr(unsigned int i);
int pfm_ita_support_darr(unsigned int i);
int pfm_ita_get_event_maxincr(unsigned int i, unsigned int *maxincr);
int pfm_ita_get_event_umask(unsigned int i, unsigned long *umask);
The Itanium-specific functions presented here are mostly used to retrieve the characteristics of an event. Given an opaque event descriptor, obtained by pfm_find_event or its derivatives, they return a boolean value indicating whether the event supports a given feature or is of a particular kind.
The pfm_ita_is_ear() function returns 1 if the event designated by i corresponds to an EAR event, i.e., an Event Address Register type of event. Otherwise 0 is returned. For instance, DATA_EAR_CACHE_LAT4 is an EAR event, but CPU_CYCLES is not. An EAR event can be a data or an instruction EAR event.
The pfm_ita_is_dear() function returns 1 if the event designated by i corresponds to a Data EAR event. Otherwise 0 is returned. It can be a cache or TLB EAR event.
The pfm_ita_is_dear_tlb() function returns 1 if the event designated by i corresponds to a Data EAR TLB event. Otherwise 0 is returned.
The pfm_ita_is_dear_cache() function returns 1 if the event designated by i corresponds to a Data EAR cache event. Otherwise 0 is returned.
The pfm_ita_is_iear() function returns 1 if the event designated by i corresponds to an instruction EAR event. Otherwise 0 is returned. It can be a cache or TLB instruction EAR event.
The pfm_ita_is_iear_tlb() function returns 1 if the event designated by i corresponds to an instruction EAR TLB event. Otherwise 0 is returned.
The pfm_ita_is_iear_cache() function returns 1 if the event designated by i corresponds to an instruction EAR cache event. Otherwise 0 is returned.
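Because all of these predicates share the signature int f(unsigned int), they can be driven from a small table. The following is a minimal sketch, not part of the library: ita_pred_t, ita_kind_t, ita_classify and the demo_* stand-ins are hypothetical names, and the demo event numbers are made up so that the sketch compiles without libpfm. In real code the table entries would point at pfm_ita_is_dear_tlb() and its siblings.

```c
#include <stddef.h>

/* Signature shared by the pfm_ita_is_*() predicates in the synopsis. */
typedef int (*ita_pred_t)(unsigned int);

/* Hypothetical table entry: a label and the predicate that selects it. */
typedef struct {
    const char *label;
    ita_pred_t  pred;
} ita_kind_t;

/* Return the label of the first entry whose predicate is non-zero for
 * event i, or fallback when no predicate matches. */
const char *ita_classify(const ita_kind_t *kinds, size_t n,
                         unsigned int i, const char *fallback)
{
    for (size_t k = 0; k < n; k++)
        if (kinds[k].pred(i))
            return kinds[k].label;
    return fallback;
}

/* Stand-in predicates with made-up event numbers (assumption). Replace
 * them with pfm_ita_is_dear_tlb(), pfm_ita_is_iear_cache(), etc. */
int demo_is_dear_tlb(unsigned int i)   { return i == 10; }
int demo_is_iear_cache(unsigned int i) { return i == 20; }

const ita_kind_t demo_kinds[] = {
    { "D-EAR TLB",   demo_is_dear_tlb   },
    { "I-EAR cache", demo_is_iear_cache },
};
```

Ordering the table from most specific to least specific kind keeps the first match meaningful, since an event that is a Data EAR TLB event is also a Data EAR event and an EAR event.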
The pfm_ita_support_opcm() function returns 1 if the event designated by i supports opcode matching, i.e., if the event can be measured accurately when opcode matching via PMC8/PMC9 is active. Otherwise 0 is returned. Not all events support this feature.
The pfm_ita_support_iarr() function returns 1 if the event designated by i supports code address range restrictions, i.e., if the event can be measured accurately when code range restriction is active. Otherwise 0 is returned. Not all events support this feature.
The pfm_ita_support_darr() function returns 1 if the event designated by i supports data address range restrictions, i.e., if the event can be measured accurately when data range restriction is active. Otherwise 0 is returned. Not all events support this feature.
The pfm_ita_get_event_maxincr() function returns in maxincr the maximum number of occurrences per cycle for the event designated by i. Certain Itanium events can occur more than once per cycle. When an event occurs more than once per cycle, the PMD counter will be incremented accordingly. It is possible to restrict measurement to cycles in which the event occurs more than a given number of times by programming a threshold. For instance, NOPS_RETIRED can happen up to 6 times/cycle, which means that the threshold can be adjusted between 0 and 5, where 5 would mean that the PMD counter would be incremented by 1 only when the nop instruction is executed more than 5 times/cycle. The value returned in maxincr is therefore the non-inclusive upper bound for the threshold to program in the PMC register.
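The bound described above can be checked before a threshold is programmed. The following is a minimal sketch, not part of the library: ita_thres_is_valid and ita_clamp_thres are hypothetical helper names, and maxincr is assumed to be the value previously obtained from pfm_ita_get_event_maxincr().

```c
/* Hypothetical helpers (not part of libpfm): validate a threshold against
 * the maxincr value reported by pfm_ita_get_event_maxincr(). */

/* Return 1 if thres lies in [0, maxincr), i.e. strictly below the
 * non-inclusive upper bound, and 0 otherwise. */
int ita_thres_is_valid(unsigned int thres, unsigned int maxincr)
{
    return thres < maxincr;
}

/* Return thres unchanged when it is valid, otherwise the largest valid
 * threshold (maxincr - 1). A maxincr of 0 yields 0. */
unsigned int ita_clamp_thres(unsigned int thres, unsigned int maxincr)
{
    if (maxincr == 0)
        return 0;
    return thres < maxincr ? thres : maxincr - 1;
}
```

With the NOPS_RETIRED figures quoted above (maxincr of 6), thresholds 0 through 5 are accepted and 6 is rejected.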
The pfm_ita_get_event_umask() function returns in umask the umask for the event designated by i.
When Itanium-specific features are needed to support a measurement, their descriptions must be passed as model-specific input arguments to the pfm_dispatch_events call. The Itanium-specific input arguments are described in the pfmlib_ita_input_param_t structure and the output parameters in pfmlib_ita_output_param_t. They are defined as follows:
typedef enum {
    PFMLIB_ITA_ISM_BOTH=0,
    PFMLIB_ITA_ISM_IA32=1,
    PFMLIB_ITA_ISM_IA64=2
} pfmlib_ita_ism_t;

typedef struct {
    unsigned int     flags;
    unsigned int     thres;
    pfmlib_ita_ism_t ism;
} pfmlib_ita_counter_t;

typedef struct {
    unsigned char opcm_used;
    unsigned long pmc_val;
} pfmlib_ita_opcm_t;

typedef struct {
    unsigned char btb_used;
    unsigned char btb_tar;
    unsigned char btb_tac;
    unsigned char btb_bac;
    unsigned char btb_tm;
    unsigned char btb_ptm;
    unsigned char btb_ppm;
    unsigned int  btb_plm;
} pfmlib_ita_btb_t;

typedef enum {
    PFMLIB_ITA_EAR_CACHE_MODE = 0,
    PFMLIB_ITA_EAR_TLB_MODE   = 1,
} pfmlib_ita_ear_mode_t;

typedef struct {
    unsigned char         ear_used;
    pfmlib_ita_ear_mode_t ear_mode;
    pfmlib_ita_ism_t      ear_ism;
    unsigned int          ear_plm;
    unsigned long         ear_umask;
} pfmlib_ita_ear_t;

typedef struct {
    unsigned int  rr_plm;
    unsigned long rr_start;
    unsigned long rr_end;
} pfmlib_ita_input_rr_desc_t;

typedef struct {
    unsigned long rr_soff;
    unsigned long rr_eoff;
} pfmlib_ita_output_rr_desc_t;

typedef struct {
    unsigned int               rr_flags;
    pfmlib_ita_input_rr_desc_t rr_limits[4];
    unsigned char              rr_used;
} pfmlib_ita_input_rr_t;

typedef struct {
    unsigned int                rr_nbr_used;
    pfmlib_ita_output_rr_desc_t rr_infos[4];
    pfmlib_reg_t                rr_br[8];
} pfmlib_ita_output_rr_t;

typedef struct {
    pfmlib_ita_counter_t  pfp_ita_counters[PMU_ITA_NUM_COUNTERS];
    unsigned long         pfp_ita_flags;
    pfmlib_ita_opcm_t     pfp_ita_pmc8;
    pfmlib_ita_opcm_t     pfp_ita_pmc9;
    pfmlib_ita_ear_t      pfp_ita_iear;
    pfmlib_ita_ear_t      pfp_ita_dear;
    pfmlib_ita_btb_t      pfp_ita_btb;
    pfmlib_ita_input_rr_t pfp_ita_drange;
    pfmlib_ita_input_rr_t pfp_ita_irange;
} pfmlib_ita_input_param_t;

typedef struct {
    pfmlib_ita_output_rr_t pfp_ita_drange;
    pfmlib_ita_output_rr_t pfp_ita_irange;
} pfmlib_ita_output_param_t;
The Itanium processor provides two additional per-event features for counters: thresholding and instruction set selection. They can be set for each event using the pfp_ita_counters data structure. The ism field can be set to one of PFMLIB_ITA_ISM_BOTH, PFMLIB_ITA_ISM_IA32, or PFMLIB_ITA_ISM_IA64.
If ism has a value of zero, it will default to PFMLIB_ITA_ISM_BOTH.
The thres field indicates the threshold for the event. A threshold of n means that the counter will be incremented by one only when the event occurs more than n times per cycle.
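The effect of a threshold can be illustrated with a small software model. The following is a minimal sketch, not a description of the hardware: ita_model_threshold_count is a hypothetical name, the per-cycle occurrence counts are made-up inputs, and the function applies the rule exactly as worded above without making any claim about how the processor treats a threshold of 0.

```c
#include <stddef.h>

/* Software model of the thresholding rule stated in the text: the counter
 * advances by one for each cycle in which the event occurred more than
 * thres times. per_cycle[c] is the number of occurrences in cycle c. */
unsigned long ita_model_threshold_count(const unsigned int *per_cycle,
                                        size_t ncycles, unsigned int thres)
{
    unsigned long count = 0;

    for (size_t c = 0; c < ncycles; c++)
        if (per_cycle[c] > thres)
            count++;
    return count;
}
```

For six cycles with 0, 1, 6, 3, 6 and 2 occurrences, a threshold of 5 counts only the two cycles with 6 occurrences, while a threshold of 2 also counts the cycle with 3.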
The flags field contains event-specific flags. The currently defined flags are:
The pfp_ita_pmc8 and pfp_ita_pmc9 fields of type pfmlib_ita_opcm_t describe what to do with the opcode matchers. Itanium supports opcode matching via PMC8 and PMC9. When this feature is used, the opcm_used field must be set to 1, otherwise the entry is ignored by the library. The pmc_val field contains the raw value to store in PMC8 or PMC9. The library does not modify the values for PMC8 and PMC9; they are stored as-is in the pfp_pmcs table of the generic output parameters.
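Filling in an opcode matcher entry therefore comes down to two assignments. The following is a minimal sketch under stated assumptions: the typedef is a local copy of the declaration in the listing above so that the fragment compiles without <perfmon/pfmlib_itanium.h>, and ita_opcm_enable and ita_opcm_disable are hypothetical helper names. The raw value itself must be built by the caller from the processor documentation.

```c
/* Local copy of the declaration shown in the listing above, so that this
 * sketch compiles on its own. Drop it when including the real header. */
typedef struct {
    unsigned char opcm_used;
    unsigned long pmc_val;
} pfmlib_ita_opcm_t;

/* Hypothetical helper: request that raw be stored in the matcher (PMC8 or
 * PMC9, depending on which field of the input parameters m points to). */
void ita_opcm_enable(pfmlib_ita_opcm_t *m, unsigned long raw)
{
    m->opcm_used = 1;   /* without this the library ignores the entry */
    m->pmc_val   = raw; /* passed through unmodified by the library   */
}

/* Hypothetical helper: mark the matcher as unused. */
void ita_opcm_disable(pfmlib_ita_opcm_t *m)
{
    m->opcm_used = 0;
    m->pmc_val   = 0;
}
```

In a real program m would be the address of the pfp_ita_pmc8 or pfp_ita_pmc9 field of a pfmlib_ita_input_param_t.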
The pfp_ita_iear field of type pfmlib_ita_ear_t describes what to do with the instruction Event Address Registers (I-EARs). Again, if this feature is used, ear_used must be set to 1, otherwise the entry is ignored by the library. The ear_mode field must be set to either PFMLIB_ITA_EAR_TLB_MODE or PFMLIB_ITA_EAR_CACHE_MODE to indicate the type of EAR to program. The umask to store into PMC10 must be in ear_umask. The privilege level mask at which the I-EAR will be monitored must be set in ear_plm, which can be any combination of PFM_PLM0, PFM_PLM1, PFM_PLM2, PFM_PLM3. If ear_plm is 0 then the default privilege level mask in pfp_dfl_plm is used. Finally, the instruction set to monitor is given in ear_ism and can be any one of PFMLIB_ITA_ISM_BOTH, PFMLIB_ITA_ISM_IA32, or PFMLIB_ITA_ISM_IA64.
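The fields above can be filled in with a small helper. The following is a minimal sketch under stated assumptions: the typedefs are local copies of the declarations in the listing above, the PFM_PLM0 and PFM_PLM3 values are placeholders for the constants that <perfmon/pfmlib.h> provides, and ita_ear_setup and ita_effective_plm are hypothetical names. The second helper only mirrors the documented fallback to pfp_dfl_plm; the library performs that step itself.

```c
#include <string.h>

/* Local copies of the declarations shown in the listing above, so that
 * this sketch compiles on its own. Drop them with the real header. */
typedef enum {
    PFMLIB_ITA_ISM_BOTH = 0,
    PFMLIB_ITA_ISM_IA32 = 1,
    PFMLIB_ITA_ISM_IA64 = 2
} pfmlib_ita_ism_t;

typedef enum {
    PFMLIB_ITA_EAR_CACHE_MODE = 0,
    PFMLIB_ITA_EAR_TLB_MODE   = 1
} pfmlib_ita_ear_mode_t;

typedef struct {
    unsigned char         ear_used;
    pfmlib_ita_ear_mode_t ear_mode;
    pfmlib_ita_ism_t      ear_ism;
    unsigned int          ear_plm;
    unsigned long         ear_umask;
} pfmlib_ita_ear_t;

/* Placeholder values (assumption): use the ones from <perfmon/pfmlib.h>. */
#ifndef PFM_PLM0
#define PFM_PLM0 0x1
#define PFM_PLM3 0x8
#endif

/* Hypothetical helper: describe one EAR (instruction or data). */
void ita_ear_setup(pfmlib_ita_ear_t *ear, pfmlib_ita_ear_mode_t mode,
                   unsigned long umask, unsigned int plm,
                   pfmlib_ita_ism_t ism)
{
    memset(ear, 0, sizeof(*ear));
    ear->ear_used  = 1;     /* without this the library ignores the entry */
    ear->ear_mode  = mode;  /* cache or TLB EAR */
    ear->ear_umask = umask; /* value destined for the EAR's PMC register */
    ear->ear_plm   = plm;   /* 0 selects the default mask */
    ear->ear_ism   = ism;
}

/* Mirrors the documented rule: a zero ear_plm falls back to the default. */
unsigned int ita_effective_plm(unsigned int ear_plm, unsigned int dfl_plm)
{
    return ear_plm != 0 ? ear_plm : dfl_plm;
}
```

The same helper serves both pfp_ita_iear and pfp_ita_dear, since the two fields share the pfmlib_ita_ear_t type. The umask argument is assumed to come from a call such as pfm_ita_get_event_umask().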
The pfp_ita_dear field of type pfmlib_ita_ear_t describes what to do with data Event Address Registers (D-EARs). The description is identical to the I-EARs except that it applies to PMC11.
In general, there are four different methods to program the EAR (data or instruction):
There are 4 methods to program the BTB and they are as follows:
Range restriction is implemented using the debug registers. There is a limited number of debug registers and they come in pairs. With 8 data debug registers, a maximum of 4 distinct ranges can be specified. The same applies to code range restrictions. Moreover, there are some severe constraints on the alignment and size of the range. Given that the size of the range is specified using a bitmask, there can be situations where the actual range is larger than the requested range. The library will make the best effort to cover only what is requested. It will never cover less than what is requested. The algorithm uses more than one pair of debug registers to get a more precise range if necessary. Hence, up to 4 pairs can be used to describe a single range. The library returns the start and end offsets of the actual range compared to the requested range.
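The reason the actual range can exceed the requested one is easiest to see with a single register pair. The following is a minimal sketch, not the library's algorithm: it assumes one pair can only describe a block whose size is a power of two and whose start is aligned on that size, it treats the end address as exclusive, and the names ita_cover_t and ita_cover_range are hypothetical. The soff and eoff members are modelled on rr_soff and rr_eoff, with their exact meaning assumed here.

```c
#include <stdint.h>

/* Hypothetical result type for the single-pair model below. */
typedef struct {
    uint64_t start; /* aligned start of the block actually covered */
    uint64_t size;  /* power-of-two size of that block             */
    uint64_t soff;  /* requested start minus covered start         */
    uint64_t eoff;  /* covered end minus requested end             */
} ita_cover_t;

/* Find the smallest aligned power-of-two block containing [start, end).
 * Requires start < end and end <= 2^62 so the arithmetic cannot overflow.
 * Returns 0 on success and -1 on invalid input. */
int ita_cover_range(uint64_t start, uint64_t end, ita_cover_t *out)
{
    const uint64_t limit = (uint64_t)1 << 62;
    uint64_t size = 1;

    if (start >= end || end > limit)
        return -1;

    /* Double the block size until the aligned block that contains
     * start also reaches the requested end. */
    while ((start & ~(size - 1)) + size < end)
        size <<= 1;

    out->start = start & ~(size - 1);
    out->size  = size;
    out->soff  = start - out->start;
    out->eoff  = out->start + size - end;
    return 0;
}
```

A request for 0x1000 to 0x1800 is covered exactly, whereas 0x1100 to 0x1900 needs a 0x1000-byte block starting at 0x1000, leaving 0x100 extra bytes before and 0x700 after the requested range. Splitting such a request across several pairs is what allows a tighter fit.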
If range restriction is to be used, the rr_used field must be set to 1, otherwise the settings will be ignored. The ranges are described by the pfmlib_ita_input_rr_t structure. Up to 4 ranges can be defined. Each range is described by an entry in rr_limits.
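Describing a range on the input side can be sketched as follows. This is a minimal illustration under stated assumptions: the typedefs are local copies of the declarations in the structure listing so that the fragment compiles without the library header, ita_rr_init and ita_rr_set are hypothetical helper names, and the rejection of an empty or inverted range is a sanity check added here rather than a documented library rule.

```c
#include <string.h>

/* Local copies of the declarations shown in the structure listing, so
 * that this sketch compiles on its own. Drop them with the real header. */
typedef struct {
    unsigned int  rr_plm;
    unsigned long rr_start;
    unsigned long rr_end;
} pfmlib_ita_input_rr_desc_t;

typedef struct {
    unsigned int               rr_flags;
    pfmlib_ita_input_rr_desc_t rr_limits[4];
    unsigned char              rr_used;
} pfmlib_ita_input_rr_t;

/* Hypothetical helper: start from a fully cleared description. */
void ita_rr_init(pfmlib_ita_input_rr_t *rr)
{
    memset(rr, 0, sizeof(*rr));
}

/* Hypothetical helper: fill entry idx of rr_limits and mark range
 * restriction as used. Returns 0 on success, -1 when idx is out of
 * bounds or the range is empty or inverted (check added by this sketch). */
int ita_rr_set(pfmlib_ita_input_rr_t *rr, unsigned int idx,
               unsigned long start, unsigned long end, unsigned int plm)
{
    if (idx >= 4 || start >= end)
        return -1;

    rr->rr_limits[idx].rr_start = start;
    rr->rr_limits[idx].rr_end   = end;
    rr->rr_limits[idx].rr_plm   = plm;
    rr->rr_used = 1; /* without this the library ignores the settings */
    return 0;
}
```

In a real program rr would be the address of the pfp_ita_drange or pfp_ita_irange field of the input parameters, for data and code ranges respectively.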
The pfmlib_ita_input_rr_desc_t structure is defined as follows:
The library will provide the values for the debug registers, as well as some information about the actual ranges, in the output parameters, more precisely in the pfmlib_ita_output_rr_t structure for each range. The structure is defined as follows: