i#6643: Add finalize_interval_snapshots API to analysis_tool_t (#6664)
Adds a new finalize_interval_snapshots API to analysis_tool_t. This is
invoked with the list of shard-local interval state snapshots for each
shard separately in the parallel mode, or the whole-trace ones in serial
mode. This allows the tool the opportunity to make any required holistic
adjustments to the snapshots since now all of the snapshots can be
observed together; e.g., computing diffs with the prior snapshot. This
is invoked before the shard-local snapshots are possibly combined to
create whole-trace snapshots, and before the snapshots are passed to
print_interval_results.
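
As a rough illustration of the "computing diffs with the prior snapshot" use case, the sketch below overrides finalize_interval_snapshots in a toy tool. It does not include the real drmemtrace headers: interval_state_snapshot_t and analysis_tool_t here are simplified stand-ins that keep only the new callback's signature, and count_snapshot_t, count_tool_t, and finalize_and_get_deltas are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Simplified stand-in for analysis_tool_t::interval_state_snapshot_t.
// The real class also carries framework-set fields behind accessors.
struct interval_state_snapshot_t {
    virtual ~interval_state_snapshot_t() = default;
};

// Stand-in for the tool base class, reduced to the new optional callback.
class analysis_tool_t {
public:
    virtual ~analysis_tool_t() = default;
    virtual bool
    finalize_interval_snapshots(
        std::vector<interval_state_snapshot_t *> &interval_snapshots)
    {
        return true; // Default: leave the list unmodified.
    }
};

// Hypothetical tool-specific snapshot holding a cumulative counter.
struct count_snapshot_t : public interval_state_snapshot_t {
    explicit count_snapshot_t(uint64_t cumulative)
        : cumulative_count(cumulative)
    {
    }
    uint64_t cumulative_count;
    uint64_t delta_count = 0; // Filled in during finalization.
};

// Hypothetical tool that converts cumulative counts into per-interval deltas
// once all snapshots of a shard (or of the whole trace) are visible together.
class count_tool_t : public analysis_tool_t {
public:
    bool
    finalize_interval_snapshots(
        std::vector<interval_state_snapshot_t *> &interval_snapshots) override
    {
        uint64_t prev = 0;
        for (interval_state_snapshot_t *base : interval_snapshots) {
            auto *snap = dynamic_cast<count_snapshot_t *>(base);
            if (snap == nullptr)
                return false; // Not a snapshot this tool generated.
            snap->delta_count = snap->cumulative_count - prev;
            prev = snap->cumulative_count;
        }
        return true;
    }
};

// Test helper: builds snapshots from cumulative counts, finalizes them, and
// returns the computed deltas (empty on failure).
std::vector<uint64_t>
finalize_and_get_deltas(const std::vector<uint64_t> &cumulative_counts)
{
    std::vector<std::unique_ptr<count_snapshot_t>> owned;
    std::vector<interval_state_snapshot_t *> snapshots;
    for (uint64_t count : cumulative_counts) {
        owned.push_back(std::make_unique<count_snapshot_t>(count));
        snapshots.push_back(owned.back().get());
    }
    count_tool_t tool;
    std::vector<uint64_t> deltas;
    if (!tool.finalize_interval_snapshots(snapshots))
        return deltas;
    for (interval_state_snapshot_t *base : snapshots)
        deltas.push_back(static_cast<count_snapshot_t *>(base)->delta_count);
    return deltas;
}
```

The sketch only touches tool-owned fields (cumulative_count, delta_count), in line with the documented rule that framework-set data in the base snapshot must not be modified by the tool.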

Adds unit tests for the new API to the existing
tool.drcacheoff.trace_interval_analysis_unit_tests tests.

Refactors some existing code to accumulate the interval snapshots in an
std::vector instead of an std::queue. This adds some more complexity to
the merge_shard_interval_results implementation, but is better because
now we have more usages where an std::vector is needed (and we want to
avoid a back-and-forth conversion between a queue and a vector).
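
To give a feel for the extra bookkeeping a vector implies, here is a minimal sketch of a merge over per-shard vectors that tracks a cursor per shard instead of popping from a queue. This is not the actual merge_shard_interval_results code: snapshots are reduced to their interval end timestamps, and merge_by_end_timestamp is a hypothetical name used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Each inner vector holds one shard's interval end timestamps in ascending
// order. Returns the distinct end timestamps seen across all shards, in order.
std::vector<uint64_t>
merge_by_end_timestamp(const std::vector<std::vector<uint64_t>> &shards)
{
    // With a queue we would pop the front; with a vector we keep an index.
    std::vector<size_t> cursor(shards.size(), 0);
    std::vector<uint64_t> merged;
    while (true) {
        bool found = false;
        uint64_t earliest = 0;
        // Find the earliest not-yet-consumed end timestamp across shards.
        for (size_t i = 0; i < shards.size(); ++i) {
            if (cursor[i] == shards[i].size())
                continue;
            uint64_t ts = shards[i][cursor[i]];
            if (!found || ts < earliest) {
                earliest = ts;
                found = true;
            }
        }
        if (!found)
            break; // Every shard is fully consumed.
        merged.push_back(earliest);
        // Advance every shard whose current snapshot ends at that timestamp.
        for (size_t i = 0; i < shards.size(); ++i) {
            if (cursor[i] < shards[i].size() && shards[i][cursor[i]] == earliest)
                ++cursor[i];
        }
    }
    return merged;
}
```

The input vectors are left intact, so the same shard-local lists remain available afterwards, for example to be passed to another callback.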

Augments various documentation to provide more details about the intended
usage of the interval APIs. Notably, documents the new
finalize_interval_snapshots API, and notes that modifications made after the
combine_interval_snapshots API has been invoked do not have any effect.

Issue: #6643, #6020
abhinav92003 authored Feb 22, 2024
1 parent 4bf1163 commit 41b55f2
Showing 5 changed files with 227 additions and 120 deletions.
24 changes: 14 additions & 10 deletions api/docs/release.dox
@@ -142,11 +142,11 @@ changes:
refers to timestamps and direct switches, which is what most users should want.
- Rename the macro INSTR_CREATE_mul_sve to INSTR_CREATE_mul_sve_imm to
differentiate it from the other SVE MUL instructions.
- Added a new drmemtrace analyzer option \p -interval_instr_count that enables trace
analyzer interval results for every given count of instrs in each shard. This mode
does not support merging the shard interval snapshots to output the whole-trace
interval snapshots. Instead, the print_interval_results() API is called separately
for each shard with the interval state snapshots of that shard.
- Renamed a protected data member in #dynamorio::drmemtrace::analyzer_tmpl_t from
merged_interval_snapshots_ to whole_trace_interval_snapshots_ (may be relevant for
users sub-classing analyzer_tmpl_t).
- Converted #dynamorio::drmemtrace::analysis_tool_tmpl_t::interval_state_snapshot_t
into a class with all its data members marked private with public accessor functions.

Further non-compatibility-affecting changes include:
- Added DWARF-5 support to the drsyms library by linking in 4 static libraries
@@ -203,11 +203,15 @@ Further non-compatibility-affecting changes include:
- Added #dynamorio::drmemtrace::TRACE_MARKER_TYPE_VECTOR_LENGTH marker to indicate the
current vector length for architectures with a hardware defined or runtime changeable
vector length (such as AArch64's SVE scalable vectors).
- Renamed a protected data member in #dynamorio::drmemtrace::analyzer_tmpl_t from
merged_interval_snapshots_ to whole_trace_interval_snapshots_ (may be relevant for
users sub-classing analyzer_tmpl_t).
- Converted #dynamorio::drmemtrace::analysis_tool_tmpl_t::interval_state_snapshot_t
into a class with all its data members marked private with public accessor functions.
- Added a new drmemtrace analyzer option \p -interval_instr_count that enables trace
analyzer interval results for every given count of instrs in each shard. This mode
does not support merging the shard interval snapshots to output the whole-trace
interval snapshots. Instead, the print_interval_results() API is called separately
for each shard with the interval state snapshots of that shard.
- Added a new finalize_interval_snapshots() API to
#dynamorio::drmemtrace::analysis_tool_t to allow the tool to make holistic
adjustments to the interval snapshots after all have been generated, and before
they are used for merging across shards (potentially), and printing the results.

**************************************************
<hr>
137 changes: 90 additions & 47 deletions clients/drcachesim/analysis_tool.h
@@ -189,12 +189,13 @@ template <typename RecordType> class analysis_tool_tmpl_t {
print_results() = 0;

/**
* Struct that stores details of a tool's state snapshot at an interval. This is
* Type that stores details of a tool's state snapshot at an interval. This is
* useful for computing and combining interval results. Tools should inherit from
* this struct to define their own state snapshot structs. Tools do not need to
* supply any values to construct this base struct; they can simply use the
* this type to define their own state snapshot types. Tools do not need to
* supply any values to construct this base class; they can simply use the
* default constructor. The members of this base class will be set by the
* framework automatically.
* framework automatically, and must not be modified by the tool at any point.
* XXX: Perhaps this should be a class with private data members.
*/
class interval_state_snapshot_t {
// Allow the analyzer framework access to private data members to set them
@@ -220,6 +221,10 @@ template <typename RecordType> class analysis_tool_tmpl_t {
, instr_count_delta_(instr_count_delta)
{
}
// This constructor should be used by tools that subclass
// interval_state_snapshot_t. The data members will be set by the framework
// automatically when the tool returns a pointer to their created object from
// generate_*interval_snapshot or combine_interval_snapshots.
interval_state_snapshot_t()
{
}
@@ -257,8 +262,9 @@ template <typename RecordType> class analysis_tool_tmpl_t {
// The following fields are set automatically by the analyzer framework after
// the tool returns the interval_state_snapshot_t* in the
// generate_*interval_snapshot APIs. So they'll be available to the tool in
// the combine_interval_snapshots (for the parameter snapshots) and
// print_interval_results APIs via the above public accessor functions.
// the finalize_interval_snapshots(), combine_interval_snapshots() (for the
// parameter snapshots), and print_interval_results() APIs via the above
// public accessor functions.

// Identifier for the shard to which this interval belongs. Currently, shards
// map only to threads, so this is the thread id. Set to WHOLE_TRACE_SHARD_ID
@@ -280,23 +286,26 @@ template <typename RecordType> class analysis_tool_tmpl_t {
};
/**
* Notifies the analysis tool that the given trace \p interval_id has ended so
* that it can generate a snapshot of its internal state in a struct derived
* that it can generate a snapshot of its internal state in a type derived
* from \p interval_state_snapshot_t, and return a pointer to it. The returned
* pointer will be provided to the tool in later combine_interval_snapshots()
* pointer will be provided to the tool in later finalize_interval_snapshots(),
* and print_interval_result() calls.
*
* \p interval_id is a positive ordinal of the trace interval that just ended.
* Trace intervals have a length equal to the \p -interval_microseconds specified
* to the framework. Trace intervals are measured using the value of the
* #TRACE_MARKER_TYPE_TIMESTAMP markers. The provided \p interval_id
* values will be monotonically increasing but may not be continuous,
* i.e. the tool may not see some \p interval_id if the trace did not have
* any activity in that interval.
* Trace intervals have a length equal to either \p -interval_microseconds or
* \p -interval_instr_count. Time-based intervals are measured using the value
* of the #TRACE_MARKER_TYPE_TIMESTAMP markers. Instruction count intervals are
* measured in terms of shard-local instrs.
*
* The returned \p interval_state_snapshot_t* will be passed to the
* combine_interval_snapshots() API which is invoked by the framework to merge
* multiple \p interval_state_snapshot_t from different shards in the parallel
* mode of the analyzer.
* The provided \p interval_id values will be monotonically increasing. For
* \p -interval_microseconds intervals, these values may not be continuous,
* i.e. the tool may not see some \p interval_id if the trace did not have any
* activity in that interval.
*
* After all interval state snapshots are generated, the list of all returned
* \p interval_state_snapshot_t* is passed to finalize_interval_snapshots()
* to allow the tool the opportunity to make any holistic adjustments to the
* snapshots.
*
* Finally, the print_interval_result() API is invoked with a list of
* \p interval_state_snapshot_t* representing interval snapshots for the
@@ -313,6 +322,40 @@ template <typename RecordType> class analysis_tool_tmpl_t {
{
return nullptr;
}
/**
* Finalizes the interval snapshots in the given \p interval_snapshots list.
* This callback provides an opportunity for tools to make any holistic
* adjustments to the snapshot list now that we have all of them together. This
* may include, for example, computing the diff with the previous snapshot.
*
* Tools can modify the individual snapshots and also the list of snapshots itself.
* If some snapshots are removed, release_interval_snapshot() will not be invoked
* for them and the tool is responsible to de-allocate the resources. Adding new
* snapshots to the list is undefined behavior; tools should operate only on the
* provided snapshots which were generated in prior generate_*interval_snapshot
* calls.
*
* Tools cannot modify any data set by the framework in the base
* \p interval_state_snapshot_t; note that only read-only access is allowed anyway
* to those private data members via public accessor functions.
*
* In the parallel mode, this is invoked for each list of shard-local snapshots
* before they are possibly merged to create whole-trace snapshots using
* combine_interval_snapshots() and passed to print_interval_result(). In the
* serial mode, this is invoked with the list of whole-trace snapshots before it
* is passed to print_interval_results().
*
* This is an optional API. If a tool chooses to not override this, the snapshot
* list will simply continue unmodified.
*
* Returns whether it was successful.
*/
virtual bool
finalize_interval_snapshots(
std::vector<interval_state_snapshot_t *> &interval_snapshots)
{
return true;
}
/**
* Invoked by the framework to combine the shard-local \p interval_state_snapshot_t
* objects pointed at by \p latest_shard_snapshots, to create the combined
@@ -338,6 +381,10 @@ template <typename RecordType> class analysis_tool_tmpl_t {
* \p interval_end_timestamp)
* - or if the tool mixes cumulative and delta metrics: some field-specific logic that
* combines the above two strategies.
*
* Note that after the given snapshots have been combined to create the whole-trace
* snapshot using this API, any change made by the tool to the snapshot contents will
* not have any effect.
*/
virtual interval_state_snapshot_t *
combine_interval_snapshots(
@@ -350,14 +397,14 @@ template <typename RecordType> class analysis_tool_tmpl_t {
* Prints the interval results for the given series of interval state snapshots in
* \p interval_snapshots.
*
* This is currently invoked with the list of whole-trace interval snapshots (for
* the parallel mode, these are the snapshots created by merging the shard-local
* snapshots).
* This is invoked with the list of whole-trace interval snapshots (for the
* parallel mode, these are the snapshots created by merging the shard-local
* snapshots). For the \p -interval_instr_count snapshots in parallel mode, this is
* invoked separately for the snapshots of each shard.
*
* The framework should be able to invoke this multiple times, possibly with a
* different list of interval snapshots. So it should avoid free-ing memory or
* changing global state. This is to keep open the possibility of the framework
* printing interval results for each shard separately in future.
* changing global state.
*/
virtual bool
print_interval_results(
@@ -370,6 +417,10 @@ template <typename RecordType> class analysis_tool_tmpl_t {
* by \p interval_snapshot is no longer needed by the framework. The tool may
* de-allocate it right away or later, as it needs. Returns whether it was
* successful.
*
* Note that if the tool removed some snapshot from the list passed to
* finalize_interval_snapshots(), then release_interval_snapshot() will not be
* invoked for that snapshot.
*/
virtual bool
release_interval_snapshot(interval_state_snapshot_t *interval_snapshot)
@@ -476,10 +527,10 @@ template <typename RecordType> class analysis_tool_tmpl_t {
/**
* Notifies the analysis tool that the given trace \p interval_id in the shard
* represented by the given \p shard_data has ended, so that it can generate a
* snapshot of its internal state in a struct derived from \p
* snapshot of its internal state in a type derived from \p
* interval_state_snapshot_t, and return a pointer to it. The returned pointer will
* be provided to the tool in later combine_interval_snapshots() and
* print_interval_result() calls.
* be provided to the tool in later combine_interval_snapshots(),
* finalize_interval_snapshots(), and print_interval_result() calls.
*
* Note that the provided \p interval_id is local to the shard that is
* represented by the given \p shard_data, and not the whole-trace interval. The
@@ -488,30 +539,22 @@ template <typename RecordType> class analysis_tool_tmpl_t {
* shard-local \p interval_state_snapshot_t corresponding to that whole-trace
* interval.
*
* \p interval_id is a positive ordinal of the trace interval that just ended.
* Trace intervals have a length equal to the \p -interval_microseconds specified
* to the framework. Trace intervals are measured using the value of the
* #TRACE_MARKER_TYPE_TIMESTAMP markers. The provided \p interval_id
* values will be monotonically increasing but may not be continuous,
* i.e. the tool may not see some \p interval_id if the trace shard did not
* have any activity in that interval.
* The \p interval_id field is defined similar to the same field in
* generate_interval_snapshot().
*
* The returned \p interval_state_snapshot_t* will be passed to the
* combine_interval_snapshot() API which is invoked by the framework to merge
* multiple \p interval_state_snapshot_t from different shards in the parallel
* mode of the analyzer.
* The returned \p interval_state_snapshot_t* is treated in the same manner as
* the same in generate_interval_snapshot(), with the following additions:
*
* Finally, the print_interval_result() API is invoked with a list of
* \p interval_state_snapshot_t* representing interval snapshots for the
* whole trace. In the parallel mode of the analyzer, this list is computed by
* combining the shard-local \p interval_state_snapshot_t using the tool's
* combine_interval_snapshot() API.
* In case of \p -interval_microseconds in the parallel mode: after
* finalize_interval_snapshots() has been invoked, the \p interval_state_snapshot_t*
* objects generated at the same time period across different shards are passed to
* the combine_interval_snapshot() API by the framework to merge them to create the
* whole-trace interval snapshots. The print_interval_result() API is then invoked
* with the list of whole-trace \p interval_state_snapshot_t* thus obtained.
*
* The tool must not de-allocate the state snapshot until
* release_interval_snapshot() is invoked by the framework.
*
* An example use case of this API is to create a time series of some output
* metric over the whole trace.
* In case of \p -interval_instr_count in the parallel mode: no merging across
* shards is done, and the print_interval_results() API is invoked for each list
* of shard-local \p interval_state_snapshot_t*.
*/
virtual interval_state_snapshot_t *
generate_shard_interval_snapshot(void *shard_data, uint64_t interval_id)