[PAL] Do not try to describe RIP location on syscall instruction
Commit b6a2d79 ("[PAL/{Linux,Linux-SGX}] Add trace log for raw syscalls")
added a debug print for every encountered raw syscall instruction. Each print
describes which system call number is invoked and at which address in the binary.
The address is fed to the helper function `pal_describe_location(uc->rip)` which
tries to find the binary + function name and put them in human-readable form
into the provided buffer.

For some reason, the snippet used in that commit (allocating a 128-byte buffer
on the signal-handling stack and calling `pal_describe_location()`) leads to
non-deterministic memory corruption on some workloads. The bug is hard to
reproduce, and it is not fixed by e.g. increasing the signal stack size inside
Gramine. The `pal_describe_location()` control path also seems correct.

As the true root cause of this bug has not yet been found, this commit
introduces a workaround: temporarily removing the stack-allocated buffer and
the corresponding `pal_describe_location()` call, and instead printing the raw
RIP value.

Signed-off-by: Adarsh Anand <[email protected]>
adarshan-intel committed Oct 14, 2024
1 parent 988e6b8 commit 9de0ec2
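
The difference between the two logging styles described in the commit message can be sketched in standalone C. This is only an illustration under stated assumptions, not Gramine code: `describe_location()` is a hypothetical stand-in for `pal_describe_location()`, the symbol table and the `format_trace_*` helpers are invented for the example, and the address is printed with `PRIxPTR` rather than `%p` so that the output does not depend on the C library.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define LOCATION_BUF_SIZE 128 /* the 128-byte buffer mentioned in the commit message */

/* Hypothetical symbol-table entry, invented for this sketch. */
struct sym_entry {
    uintptr_t start;      /* first address of the function */
    uintptr_t end;        /* one past the last address */
    const char* binary;   /* name of the binary containing the function */
    const char* function; /* function name */
};

static const struct sym_entry g_symbols[] = {
    {0x401000, 0x401080, "app", "do_write"},
    {0x401080, 0x401200, "app", "main"},
};

/* Hypothetical stand-in for pal_describe_location(): writes "func+0xoff (binary)"
 * into the caller-provided buffer, or the raw address if no symbol matches. */
static void describe_location(uintptr_t addr, char* buf, size_t buf_size) {
    assert(buf && buf_size > 0);
    for (size_t i = 0; i < sizeof(g_symbols) / sizeof(g_symbols[0]); i++) {
        const struct sym_entry* s = &g_symbols[i];
        if (addr >= s->start && addr < s->end) {
            snprintf(buf, buf_size, "%s+0x%" PRIxPTR " (%s)", s->function,
                     addr - s->start, s->binary);
            return;
        }
    }
    snprintf(buf, buf_size, "0x%" PRIxPTR, addr);
}

/* Style used before the workaround: a stack-allocated buffer is filled with a
 * human-readable location and then embedded in the trace message. */
static int format_trace_symbolized(char* out, size_t out_size, unsigned long sysno,
                                   uintptr_t rip) {
    char buf[LOCATION_BUF_SIZE];
    describe_location(rip, buf, sizeof(buf));
    return snprintf(out, out_size,
                    "Emulating raw syscall instruction with number %lu at address %s",
                    sysno, buf);
}

/* Style used by the workaround: no intermediate buffer and no symbol lookup,
 * only the raw instruction pointer is formatted. */
static int format_trace_raw(char* out, size_t out_size, unsigned long sysno,
                            uintptr_t rip) {
    return snprintf(out, out_size,
                    "Emulating raw syscall instruction with number %lu at address 0x%" PRIxPTR,
                    sysno, rip);
}
```

With the made-up table above, calling `format_trace_symbolized()` for syscall number 1 at address `0x401010` yields a location of the form `do_write+0x10 (app)`, while `format_trace_raw()` yields `0x401010`. The raw form is less readable but involves no extra stack buffer and no lookup code in the signal-handling path, which is the trade-off the workaround accepts.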
Showing 2 changed files with 8 additions and 7 deletions.
7 changes: 4 additions & 3 deletions pal/src/host/linux-sgx/pal_exception.c
@@ -232,9 +232,10 @@ static bool handle_ud(sgx_cpu_context_t* uc, int* out_event_num) {
         log_always("Emulating a raw syscall instruction. This degrades performance, consider"
                    " patching your application to use Gramine syscall API.");
     }
-    char buf[LOCATION_BUF_SIZE];
-    pal_describe_location(uc->rip, buf, sizeof(buf));
-    log_trace("Emulating raw syscall instruction with number %lu at address %s", uc->rax, buf);
+    /* FIXME: better to use `pal_describe_location()` (see example below), but it leads to a
+     * non-deterministic bug on some workloads */
+    log_trace("Emulating raw syscall instruction with number %lu at address %p",
+              uc->rax, (void*)uc->rip);
     return false;
 } else if (is_in_out(instr) && !has_lock_prefix(instr)) {
     /*
8 changes: 4 additions & 4 deletions pal/src/host/linux/pal_exception.c
@@ -91,10 +91,10 @@ static void handle_sync_signal(int signum, siginfo_t* info, struct ucontext* uc)
         log_always("Emulating a raw system/supervisor call. This degrades performance, consider"
                    " patching your application to use Gramine syscall API.");
     }
-    char buf[LOCATION_BUF_SIZE];
-    pal_describe_location(ucontext_get_ip(uc), buf, sizeof(buf));
-    log_trace("Emulating raw syscall instruction with number %d at address %s",
-              info->si_syscall, buf);
+    /* FIXME: better to use `pal_describe_location()` (see example below), but it leads to a
+     * non-deterministic bug on some workloads */
+    log_trace("Emulating raw syscall instruction with number %d at address %p", info->si_syscall,
+              info->si_call_addr);
 }

 enum pal_event event = signal_to_pal_event(signum);