i#6934: Initial commit for Memory trace synthesis framework from Instruction Replay, useful for kernel memtracing and beyond #6931

Open
wants to merge 56 commits into base: master
Changes from 29 commits
Commits
96f4003
Adding an access_region tool
iansseijelly Jul 30, 2024
41666ef
FIX: default access region values
iansseijelly Aug 16, 2024
f74c632
FORMAT: Access Region tools
iansseijelly Aug 16, 2024
3590f1f
ADD: Reuse Pattern Tool Initial Commit
iansseijelly Aug 16, 2024
048ddc0
ADD: options to create reuse pattern tool
iansseijelly Aug 16, 2024
6014432
ADD: Mirgae Initial commit
iansseijelly Aug 16, 2024
c3d03a4
ADD: Fully supporting all ADD modes
iansseijelly Aug 17, 2024
06b01cb
RENAME: mir_op.h -> mir_opc.h
iansseijelly Aug 17, 2024
6070f3e
FIX: slightly smarter get type
iansseijelly Aug 17, 2024
0c97302
ADD: insn and opnd definition
iansseijelly Aug 17, 2024
6c530a8
ADD: using linked list for chained mir in a single drir
iansseijelly Aug 17, 2024
b94fb08
Merge branch 'DynamoRIO:master' into access_pattern
iansseijelly Aug 17, 2024
34b7d22
ADD: linked list implementation
iansseijelly Aug 18, 2024
3e922fd
ADD: major refactor for individualized register operation in gen_ops
iansseijelly Aug 18, 2024
0f9d8b7
FIX: CMake include
iansseijelly Aug 19, 2024
7e95684
ADD: development on generating mir_insn for memref types
iansseijelly Aug 20, 2024
de2bdb7
ADD: temporarily adding print for debugging
iansseijelly Aug 20, 2024
b7d7643
ADD: mir memref composition
iansseijelly Aug 20, 2024
1b68d54
FIX: removing redundant header def
iansseijelly Aug 20, 2024
8fb4f2a
FIX: strdup for correct insn print
iansseijelly Aug 20, 2024
69358fc
ADD: proper handling for store, subject to change
iansseijelly Aug 20, 2024
a5f385d
FIX: setting correct opcodes to addr calc
iansseijelly Aug 20, 2024
d755ba0
FIX: changing the store representation to save 1 instruction
iansseijelly Aug 20, 2024
dba42fe
ADD: Generalizing arith ops
iansseijelly Aug 20, 2024
e91e70f
ADD: temp reg allocation and fixing pc-relative addressing
iansseijelly Aug 21, 2024
0d4f797
ADD: change store encoding + adding push op (WIP)
iansseijelly Aug 21, 2024
fe51787
ADD: pop operation
iansseijelly Aug 21, 2024
a08951b
REFACTOR: Use static mir_opnd_t instead of dynamically allocation
iansseijelly Aug 23, 2024
76c6ece
Merge branch 'DynamoRIO:master' into access_pattern
iansseijelly Aug 23, 2024
8327781
FIX: renaming fe to frontend for clarity
iansseijelly Aug 26, 2024
b88f9a8
FIX: minor fix to update CMake include path
iansseijelly Aug 26, 2024
55ed02b
ADD: mov and call
iansseijelly Sep 3, 2024
e123028
FIX: print hex for mir imm opnds if large
iansseijelly Sep 3, 2024
b792ce0
ADD: lea opcode
iansseijelly Sep 3, 2024
016fc28
ADD: support for flag setting and supressing jump
iansseijelly Sep 3, 2024
95858a6
ADD: support for shift left and right
iansseijelly Sep 3, 2024
8570c13
FIX: printing flags properly
iansseijelly Sep 8, 2024
0e0f3bc
ADD: nop for unwanted instructions
iansseijelly Sep 8, 2024
659bc21
ADD: basic replayer backend
iansseijelly Sep 10, 2024
10fc010
FIX: adding stubs to call replayer from tools
iansseijelly Sep 10, 2024
be94e8a
ADD: basic replayer test
iansseijelly Sep 10, 2024
ed31e38
ADD: more tests for movs and different adds
iansseijelly Sep 11, 2024
33447cf
ADD: basic reg testing, and testplans
iansseijelly Sep 11, 2024
550f38e
ADD: tmp regfile tests
iansseijelly Sep 11, 2024
93205a2
FIX: reverting the flag register logic
iansseijelly Sep 11, 2024
1e90f2a
ADD: test for memory operations
iansseijelly Sep 11, 2024
4b5fe4d
ADD: logging load store in a hardcoded file
iansseijelly Sep 11, 2024
b48e640
ADD: handling variable length registers
iansseijelly Sep 11, 2024
70f1930
FIX: wrong lea opcode format
iansseijelly Sep 11, 2024
0148792
ADD: APIs in frontend to handle variable register length alias
iansseijelly Sep 11, 2024
25b98f4
ADD: support RET operation
iansseijelly Sep 11, 2024
148fddd
TEXT: check off the test plan todo list
iansseijelly Sep 11, 2024
0121325
FIX: renaming abstract_backend
iansseijelly Sep 12, 2024
0836c4a
ADD: basic register dependency analyzer
iansseijelly Sep 12, 2024
94bc3bc
FIX: minor replayer fixes
iansseijelly Sep 12, 2024
9cf3f86
ADD: addr with index and scale for stride
iansseijelly Sep 23, 2024
18 changes: 16 additions & 2 deletions clients/drcachesim/CMakeLists.txt
@@ -168,6 +168,11 @@ add_exported_library(drmemtrace_func_view STATIC tools/func_view.cpp)
add_exported_library(drmemtrace_invariant_checker STATIC tools/invariant_checker.cpp)
Contributor

I think this PR is meant to give a feel for all the pieces of this prototype rather than being sent for merging.

To help with future PRs intended for merging: if this PR were sent as-is for merging, our first reaction would be that it is too big. The unified patch
https://patch-diff.githubusercontent.com/raw/DynamoRIO/dynamorio/pull/6931.patch is over 6K lines, while https://dynamorio.org/page_code_reviews.html#autotoc_md114 suggests "Review diffs larger than about 1500 lines should be avoided." Generally, large changes should be split into smaller pieces, each sent separately. If possible, PRs should be sent early during development; sometimes an end-to-end stage has to be reached before designs can be settled, in which case the code has to be kept in pieces or split later for multiple reviews.

For merging, is there a logical, natural way to split up the code into pieces? If some pieces need others, their separate PRs won't fully run and won't be able to have end-to-end tests, but they could have unit tests, or possibly no tests in the same PR with a message pointing to tests in later PRs that fill in the other pieces. For starters, I see an access_region tool and a reuse_pattern tool: each of those should be its own PR.

Contributor

Please see https://dynamorio.org/page_code_reviews.html#sec_commit_messages for the conventions on the PR title and description body.

Author

I see. So should I open a new issue to acquire an issue number?

Contributor

Yes, as I don't think one exists yet.

Contributor

It would be a lot easier to review this prototype code with a good overview and description. Please edit the PR description to describe the components being added here and what each one does.

add_exported_library(drmemtrace_schedule_stats STATIC tools/schedule_stats.cpp)
add_exported_library(drmemtrace_schedule_file STATIC common/schedule_file.cpp)
add_exported_library(drmemtrace_access_region STATIC tools/access_region.cpp)
add_exported_library(drmemtrace_reuse_pattern STATIC tools/reuse_pattern.cpp)

add_subdirectory(mirage)
target_link_libraries(drmemtrace_reuse_pattern mirage)

target_link_libraries(drmemtrace_invariant_checker drdecode drmemtrace_schedule_file)

@@ -213,6 +218,7 @@ if (BUILD_PT_POST_PROCESSOR)
add_subdirectory(drpt2trace)
endif (BUILD_PT_POST_PROCESSOR)


set(raw2trace_srcs
tracer/raw2trace.cpp
tracer/raw2trace_shared.cpp
@@ -284,7 +290,7 @@ target_link_libraries(drmemtrace_launcher drmemtrace_simulator drmemtrace_reuse_
drmemtrace_histogram drmemtrace_reuse_time drmemtrace_basic_counts
drmemtrace_opcode_mix drmemtrace_syscall_mix drmemtrace_view drmemtrace_func_view
drmemtrace_raw2trace directory_iterator drmemtrace_invariant_checker
drmemtrace_schedule_stats drmemtrace_record_filter)
drmemtrace_schedule_stats drmemtrace_record_filter drmemtrace_access_region drmemtrace_reuse_pattern)
if (UNIX)
target_link_libraries(drmemtrace_launcher dl)
endif ()
@@ -316,6 +322,7 @@ include_directories(${CMAKE_CURRENT_SOURCE_DIR})
if (BUILD_PT_POST_PROCESSOR)
include_directories(${CMAKE_CURRENT_SOURCE_DIR}/drpt2trace)
endif (BUILD_PT_POST_PROCESSOR)
include_directories(${CMAKE_CURRENT_SOURCE_DIR}/mirage)

add_exported_library(drmemtrace_analyzer STATIC
analyzer.cpp
@@ -366,6 +373,8 @@ install_client_nonDR_header(drmemtrace tools/view_create.h)
install_client_nonDR_header(drmemtrace tools/func_view_create.h)
install_client_nonDR_header(drmemtrace tools/filter/record_filter_create.h)
install_client_nonDR_header(drmemtrace tools/filter/record_filter.h)
install_client_nonDR_header(drmemtrace tools/access_region_create.h)
install_client_nonDR_header(drmemtrace tools/reuse_pattern_create.h)
# TODO i#6412: Create a separate directory for non-tracer headers so that
# we can more cleanly separate tracer and raw2trace code.
install_client_nonDR_header(drmemtrace tracer/raw2trace.h)
@@ -596,6 +605,8 @@ restore_nonclient_flags(drmemtrace_analyzer)
restore_nonclient_flags(drmemtrace_invariant_checker)
restore_nonclient_flags(drmemtrace_schedule_stats)
restore_nonclient_flags(drmemtrace_schedule_file)
restore_nonclient_flags(drmemtrace_access_region)
restore_nonclient_flags(drmemtrace_reuse_pattern)

# We need to pass /EHsc and we pull in libcmtd into drcachesim from a dep lib.
# Thus we need to override the /MT with /MTd.
@@ -664,6 +675,8 @@ add_win32_flags(drmemtrace_invariant_checker)
add_win32_flags(drmemtrace_schedule_stats)
add_win32_flags(drmemtrace_schedule_file)
add_win32_flags(directory_iterator)
add_win32_flags(drmemtrace_access_region)
add_win32_flags(drmemtrace_reuse_pattern)
add_win32_flags(test_helpers)
if (WIN32 AND DEBUG)
get_target_property(sim_srcs drmemtrace_launcher SOURCES)
@@ -849,7 +862,8 @@ if (BUILD_TESTS)
drmemtrace_histogram drmemtrace_reuse_time drmemtrace_basic_counts
drmemtrace_opcode_mix drmemtrace_syscall_mix drmemtrace_view drmemtrace_func_view
drmemtrace_raw2trace directory_iterator drmemtrace_invariant_checker
drmemtrace_schedule_stats drmemtrace_analyzer drmemtrace_record_filter)
drmemtrace_schedule_stats drmemtrace_analyzer drmemtrace_record_filter drmemtrace_access_region
drmemtrace_reuse_pattern)
if (UNIX)
target_link_libraries(tool.drcachesim.core_sharded dl)
endif ()
10 changes: 10 additions & 0 deletions clients/drcachesim/analyzer_multi.cpp
@@ -67,6 +67,8 @@
#include "tools/loader/external_config_file.h"
#include "tools/loader/external_tool_creator.h"
#include "tools/filter/record_filter_create.h"
#include "tools/access_region_create.h"
#include "tools/reuse_pattern_create.h"

namespace dynamorio {
namespace drmemtrace {
@@ -270,6 +272,14 @@ analyzer_multi_t::create_analysis_tool_from_options(const std::string &tool)
} else if (tool == SCHEDULE_STATS) {
return schedule_stats_tool_create(op_schedule_stats_print_every.get_value(),
op_verbose.get_value());
} else if (tool == ACCESS_REGION) {
return access_region_tool_create(op_access_region_stack_start.get_value(),
op_access_region_stack_end.get_value(),
op_access_region_heap_start.get_value(),
op_access_region_heap_end.get_value());

} else if (tool == REUSE_PATTERN) {
return reuse_pattern_tool_create();
} else {
auto ext_tool = create_external_tool(tool);
if (ext_tool == nullptr) {
19 changes: 19 additions & 0 deletions clients/drcachesim/common/options.cpp
@@ -1053,5 +1053,24 @@ droption_t<bool> op_abort_on_invariant_error(
"total invariant error count is printed at the end; a non-zero error count does not "
"affect the exit code of the analyzer.");

droption_t<uint64_t> op_access_region_stack_start(
DROPTION_SCOPE_ALL, "access_region_stack_start", 0x8000000000000000,
"Start of the stack region to analyze",
"Specifies the start of the stack region");

droption_t<uint64_t> op_access_region_stack_end(
DROPTION_SCOPE_ALL, "access_region_stack_end", 0x7000000000000000,
"End of the stack region to analyze",
"Specifies the end of the stack region");

droption_t<uint64_t> op_access_region_heap_start(
DROPTION_SCOPE_ALL, "access_region_heap_start", 0x5000000000000000,
"Start of the heap region to analyze",
"Specifies the start of the heap region");

droption_t<uint64_t> op_access_region_heap_end(
DROPTION_SCOPE_ALL, "access_region_heap_end", 0x6000000000000000,
"End of the heap region to analyze",
"Specifies the end of the heap region");
} // namespace drmemtrace
} // namespace dynamorio
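
Note on the four `access_region_*` options above: the tool's implementation is not part of this excerpt, so the following is only a minimal sketch of how start/end values like these might be used to label an address. `addr_range_t` and `classify_address` are hypothetical names, not identifiers from the PR, and treating each range as half-open and accepting the bounds in either order are assumptions (the default stack start is numerically above the default stack end).

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Hypothetical half-open address range [lo, hi); not a type from the PR.
struct addr_range_t {
    uint64_t lo;
    uint64_t hi;
    // Accept the bounds in either order (assumption): a downward-growing
    // stack may be described with its "start" above its "end".
    addr_range_t(uint64_t a, uint64_t b)
        : lo(std::min(a, b))
        , hi(std::max(a, b))
    {
    }
    bool
    contains(uint64_t addr) const
    {
        return addr >= lo && addr < hi;
    }
};

// Hypothetical classifier: labels an address "stack", "heap", or "other".
// The stack range is checked first, so it wins if the ranges overlap.
inline std::string
classify_address(uint64_t addr, const addr_range_t &stack, const addr_range_t &heap)
{
    if (stack.contains(addr))
        return "stack";
    if (heap.contains(addr))
        return "heap";
    return "other";
}
```

Under these assumptions and with the default option values, an address such as 0x7800000000000000 would be labeled "stack" and 0x5800000000000000 would be labeled "heap"; anything outside both ranges falls through to "other".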
7 changes: 6 additions & 1 deletion clients/drcachesim/common/options.h
@@ -53,7 +53,8 @@
#define INVARIANT_CHECKER "invariant_checker"
#define SCHEDULE_STATS "schedule_stats"
#define RECORD_FILTER "record_filter"

#define ACCESS_REGION "access_region"
#define REUSE_PATTERN "reuse_pattern"
// Constants used by specific tools.
#define REPLACE_POLICY_NON_SPECIFIED ""
#define REPLACE_POLICY_LRU "LRU"
@@ -224,6 +225,10 @@ extern dynamorio::droption::droption_t<uint64_t> op_trim_before_timestamp;
extern dynamorio::droption::droption_t<uint64_t> op_trim_after_timestamp;
extern dynamorio::droption::droption_t<bool> op_abort_on_invariant_error;

extern dynamorio::droption::droption_t<uint64_t> op_access_region_stack_start;
extern dynamorio::droption::droption_t<uint64_t> op_access_region_stack_end;
extern dynamorio::droption::droption_t<uint64_t> op_access_region_heap_start;
extern dynamorio::droption::droption_t<uint64_t> op_access_region_heap_end;
} // namespace drmemtrace
} // namespace dynamorio

40 changes: 40 additions & 0 deletions clients/drcachesim/mirage/CMakeLists.txt
@@ -0,0 +1,40 @@
cmake_minimum_required(VERSION 3.7)
Contributor

Every file should have a header with copyright and license information.


include (../../../make/policies.cmake NO_POLICY_SCOPE)

if (NOT LINUX OR NOT X86 OR NOT X64)
message(FATAL_ERROR "This is only for Linux x86_64.")
endif ()

add_dr_defines()

set(mirage_srcs
dr_mir_api.cpp
# common - shared code
./common/list.cpp
./common/bitmap.cpp
# frontend - translating DRIR to MIR
./fe/gen_ops.cpp
./fe/gen_opnd_api.cpp
./fe/translate_context.cpp
# backend - interpreting MIR
# ./be/*.cpp
# interpreter - MIR specification
./ir/mir_insn.cpp
)

add_library(mirage STATIC ${mirage_srcs})

target_include_directories(mirage PUBLIC
$<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/common>
$<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/fe>
$<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/ir>
$<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/be>
)

configure_DynamoRIO_decoder(mirage)
add_dependencies(mirage api_headers)
target_link_libraries(mirage)
install_client_nonDR_header(drmemtrace dr_mir_api.h)
DR_export_target(mirage)
install_exported_target(mirage ${INSTALL_CLIENTS_LIB})
21 changes: 21 additions & 0 deletions clients/drcachesim/mirage/common/bitmap.cpp
@@ -0,0 +1,21 @@
#include "bitmap.h"

struct bitmap_t *bitmap_create(uint32_t size) {
struct bitmap_t* b = (struct bitmap_t*)malloc(sizeof(struct bitmap_t));
assert(b != NULL);
b->size = size;
b->bits = (bool*)calloc(size, sizeof(bool));
assert(b->bits != NULL);
return b;
}

int bitmap_acquire(struct bitmap_t *bitmap) {
// find the first available bit
for (uint32_t i = 0; i < bitmap->size; i++) {
if (bitmap->bits[i] == false) {
bitmap->bits[i] = true;
return i;
}
}
return -1;
}
19 changes: 19 additions & 0 deletions clients/drcachesim/mirage/common/bitmap.h
@@ -0,0 +1,19 @@
#ifndef __BITMAP_H__
#define __BITMAP_H__

// a simple bitmap implementation
#include <stdint.h>
#include <stdbool.h>
#include <stdlib.h>
#include "assert.h"


struct bitmap_t {
uint32_t size;
bool *bits;
};

struct bitmap_t *bitmap_create(uint32_t size);
int bitmap_acquire(struct bitmap_t *bitmap);

#endif // __BITMAP_H__
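
Note on the bitmap above: `bitmap_acquire()` hands out the lowest free index and returns -1 when every slot is taken, but the excerpt shows no way to give a slot back or to free the bitmap. The sketch below is one possible C++ counterpart, assuming a release operation is wanted (for example for temporary register allocation); `slot_bitmap_t` and its `release()` method are hypothetical and do not appear in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical C++ counterpart of bitmap_t; names are illustrative only.
class slot_bitmap_t {
public:
    explicit slot_bitmap_t(uint32_t size)
        : bits_(size, false)
    {
    }

    // Marks the lowest free slot as used and returns its index, or -1 if
    // all slots are taken (same contract as bitmap_acquire() in the diff).
    int
    acquire()
    {
        for (std::size_t i = 0; i < bits_.size(); i++) {
            if (!bits_[i]) {
                bits_[i] = true;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Returns a previously acquired slot to the pool. This operation is an
    // assumption; the diff defines no equivalent function.
    void
    release(int index)
    {
        assert(index >= 0 && static_cast<std::size_t>(index) < bits_.size());
        assert(bits_[index]);
        bits_[index] = false;
    }

private:
    // std::vector owns the storage, so no explicit destroy call is needed.
    std::vector<bool> bits_;
};
```

With a two-slot bitmap, two `acquire()` calls yield 0 and 1, a third yields -1, and after `release(0)` the next `acquire()` yields 0 again.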