[FATAL src/PerfCounters.cc:404:check_for_ioc_period_bug() errno: EINVAL] ioctl(PERF_EVENT_IOC_PERIOD) failed #3848

Open
Rodrigodd opened this issue Oct 13, 2024 · 1 comment


When I run rr record ls (or record any other program), I get the following error:

$ rr record ls
rr: Saving execution to trace directory `/home/rodrigodd/.local/share/rr/ls-18'.
[FATAL src/PerfCounters.cc:404:check_for_ioc_period_bug() errno: EINVAL] ioctl(PERF_EVENT_IOC_PERIOD) failed
=== Start rr backtrace:
rr(+0xc75a9) [0x5c1300c875a9]
rr(_ZN2rr12FatalOstreamD1Ev+0x6a) [0x5c1300c8a0ba]
rr(_ZN2rr12PerfCounters5startEPNS_4TaskEl+0x2343) [0x5c1300cb5223]
rr(_ZN2rr4Task16resume_executionENS_13ResumeRequestENS_11WaitRequestENS_12TicksRequestEi+0xa3b) [0x5c1300db4eab]
rr(_ZN2rr13RecordSession13task_continueERKNS0_9StepStateE+0x94c) [0x5c1300cc479c]
rr(_ZN2rr13RecordSession11record_stepEv+0x6fc) [0x5c1300ccf3ec]
rr(_ZN2rr13RecordCommand3runERSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x1ce1) [0x5c1300cbd791]
rr(main+0x37e) [0x5c1300c92d4e]
/usr/lib/libc.so.6(+0x25e08) [0x7a1152a34e08]
/usr/lib/libc.so.6(__libc_start_main+0x8c) [0x7a1152a34ecc]
rr(_start+0x25) [0x5c1300bf6615]
=== End rr backtrace
fish: Job 1, 'rr record ls' terminated by signal SIGABRT (Abort)

In src/PerfCounters.cc:404 the overflow period is being changed to 1, but that does not appear to be a valid period on my machine. To find out which periods are accepted, I changed the code to:

diff --git a/src/PerfCounters.cc b/src/PerfCounters.cc
index 116d957e..cd665530 100644
--- a/src/PerfCounters.cc
+++ b/src/PerfCounters.cc
@@ -399,10 +399,17 @@ static void check_for_ioc_period_bug(perf_event_attrs &perf_attr) {
   attr.exclude_kernel = 1;
   ScopedFd bug_fd = start_counter(0, -1, &attr);
 
-  uint64_t new_period = 1;
-  if (ioctl(bug_fd, PERF_EVENT_IOC_PERIOD, &new_period)) {
-    FATAL() << "ioctl(PERF_EVENT_IOC_PERIOD) failed";
+  bool ok = true;
+  for (uint64_t i = 0; i < 1000100; ++i) {
+    bool now_ok = ioctl(bug_fd, PERF_EVENT_IOC_PERIOD, &i) == 0;
+    if (now_ok != ok) {
+      printf("ioctl(PERF_EVENT_IOC_PERIOD) changed behavior at i=%" PRIu64
+             " from %d to %d\n",
+             i, ok, now_ok);
+      ok = now_ok;
+    }
   }
+  FATAL() << "ioctl(PERF_EVENT_IOC_PERIOD) failed";
 
   struct pollfd poll_bug_fd = {.fd = bug_fd, .events = POLL_IN, .revents = 0 };
   poll(&poll_bug_fd, 1, 0);

With this change, I can see that the minimum period my machine supports is 32:

$ bin/rr record ls
rr: Saving execution to trace directory `/home/rodrigodd/.local/share/rr/ls-25'.
ioctl(PERF_EVENT_IOC_PERIOD) changed behavior at i=0 from 1 to 0
ioctl(PERF_EVENT_IOC_PERIOD) changed behavior at i=32 from 0 to 1
[FATAL src/PerfCounters.cc:430:check_for_ioc_period_bug() errno: EINVAL] ioctl(PERF_EVENT_IOC_PERIOD) failed

So changing the period to 32 makes it work correctly.
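
The probe above issues about a million ioctl calls to find a single transition. Its output suggests that, within the tested range, periods below 32 are rejected and periods from 32 upward are accepted. If that monotonic behavior is assumed (it is an assumption, not something the kernel guarantees), the threshold can be found with a binary search instead. The sketch below is not rr code; `find_min_accepted_period` and its `accepted` callback are hypothetical names, and in practice the callback would wrap the `ioctl(bug_fd, PERF_EVENT_IOC_PERIOD, &period)` call from the diff.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical helper, not part of rr.
// Returns the smallest period in [lo, hi] for which `accepted` returns true,
// or 0 if `hi` itself is rejected or the range is empty.
// ASSUMPTION: acceptance is monotonic, i.e. once some period is accepted,
// every larger period up to `hi` is accepted as well.
uint64_t find_min_accepted_period(
    const std::function<bool(uint64_t)>& accepted, uint64_t lo, uint64_t hi) {
  if (lo > hi || !accepted(hi)) {
    return 0;
  }
  // Invariant: `hi` is accepted; everything below `lo` is rejected.
  while (lo < hi) {
    uint64_t mid = lo + (hi - lo) / 2;
    if (accepted(mid)) {
      hi = mid;
    } else {
      lo = mid + 1;
    }
  }
  return lo;
}
```

With a callback that behaves like the machine in this report (rejecting everything below 32), the search over [1, 1000100] needs about 20 probes rather than a million. If the kernel's behavior is not monotonic for some event, the linear scan from the diff remains the safer check.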

System Info

Not sure which information is useful, but below is the kernel version and CPU info.

$ uname -a
Linux nb-rodrigo 6.11.1-zen1-1-zen #1 ZEN SMP PREEMPT_DYNAMIC Mon, 30 Sep 2024 23:49:48 +0000 x86_64 GNU/Linux
$ lscpu
Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          39 bits physical, 48 bits virtual
  Byte Order:             Little Endian
CPU(s):                   4
  On-line CPU(s) list:    0-3
Vendor ID:                GenuineIntel
  Model name:             Intel(R) Core(TM) i5-4200U CPU @ 1.60GHz
    CPU family:           6
    Model:                69
    Thread(s) per core:   2
    Core(s) per socket:   2
    Socket(s):            1
    Stepping:             1
    CPU(s) scaling MHz:   89%
    CPU max MHz:          2600.0000
    CPU min MHz:          800.0000
    BogoMIPS:             4588.94
    Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb pti tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts vnmi
Virtualization features:
  Virtualization:         VT-x
Caches (sum of all):
  L1d:                    64 KiB (2 instances)
  L1i:                    64 KiB (2 instances)
  L2:                     512 KiB (2 instances)
  L3:                     3 MiB (1 instance)
NUMA:
  NUMA node(s):           1
  NUMA node0 CPU(s):      0-3
Vulnerabilities:
  Gather data sampling:   Not affected
  Itlb multihit:          KVM: Mitigation: VMX disabled
  L1tf:                   Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
  Mds:                    Vulnerable: Clear CPU buffers attempted, no microcode; SMT vulnerable
  Meltdown:               Mitigation; PTI
  Mmio stale data:        Unknown: No mitigations
  Reg file data sampling: Not affected
  Retbleed:               Not affected
  Spec rstack overflow:   Not affected
  Spec store bypass:      Vulnerable
  Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:             Mitigation; Retpolines; STIBP disabled; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
  Srbds:                  Vulnerable: No microcode
  Tsx async abort:        Not affected

@Rodrigodd (Author)

Later on, when debugging an application in GDB (using rr replay), I got the error below, which appears to have the same root cause.

[FATAL src/PerfCounters.cc:953:start() errno: EINVAL] ioctl(PERF_EVENT_IOC_PERIOD) failed with period 13
=== Start rr backtrace:
rr(+0xc75a9) [0x5eba8ae595a9]
rr(_ZN2rr12FatalOstreamD1Ev+0x6a) [0x5eba8ae5c0ba]
rr(_ZN2rr12PerfCounters5startEPNS_4TaskEl+0x2343) [0x5eba8ae87223]
rr(_ZN2rr4Task16resume_executionENS_13ResumeRequestENS_11WaitRequestENS_12TicksRequestEi+0xa3b) [0x5eba8af86eab]
rr(_ZN2rr13ReplaySession16continue_or_stepEPNS_10ReplayTaskERKNS0_15StepConstraintsENS_12TicksRequestENS_13ResumeRequestE+0x119) [0x5eba8af092c9]
rr(_ZN2rr13ReplaySession18try_one_trace_stepEPNS_10ReplayTaskERKNS0_15StepConstraintsE+0x119) [0x5eba8af10789]
rr(_ZN2rr13ReplaySession11replay_stepERKNS0_15StepConstraintsE+0x27c) [0x5eba8af1222c]
rr(_ZN2rr14ReplayTimeline33run_forward_to_intermediate_pointERKNS0_4MarkENS0_13ForceProgressE+0x7f4) [0x5eba8af31664]
rr(_ZN2rr14ReplayTimeline16reverse_continueERKSt8functionIFbPNS_10ReplayTaskERKNS_11BreakStatusEEERKS1_IFbvEE+0x3ed9) [0x5eba8af36309]
rr(_ZN2rr9GdbServer14debug_one_stepERNS_10GdbRequestE+0xcb9) [0x5eba8ae3dba9]
rr(_ZN2rr9GdbServer12serve_replayESt10shared_ptrINS_13ReplaySessionEERKNS0_6TargetEPVbRKNS0_15ConnectionFlagsE+0x9eb) [0x5eba8ae3f66b]
rr(_ZN2rr13ReplayCommand3runERSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x21f9) [0x5eba8af02999]
rr(main+0x37e) [0x5eba8ae64d4e]
/usr/lib/libc.so.6(+0x25e08) [0x752bd7834e08]
/usr/lib/libc.so.6(__libc_start_main+0x8c) [0x752bd7834ecc]
rr(_start+0x25) [0x5eba8adc8615]
=== End rr backtrace
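
Both failures (period 1 during recording, period 13 during replay) could be checked outside rr with a small standalone probe. The following is a minimal sketch, not rr's implementation: `open_probe_counter`, `period_accepted`, and `report_periods` are hypothetical names, and the counter uses the generic `PERF_COUNT_HW_INSTRUCTIONS` event as a stand-in for whatever event rr actually programs, so the accepted range may differ from what rr sees.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

// Opens a user-space-only sampling counter on the calling thread, any CPU.
// Returns the fd, or -1 if perf_event_open fails (for example when
// restricted by perf_event_paranoid).
// ASSUMPTION: the event choice is illustrative, not the event rr uses.
int open_probe_counter() {
  struct perf_event_attr attr;
  std::memset(&attr, 0, sizeof(attr));
  attr.type = PERF_TYPE_HARDWARE;
  attr.size = sizeof(attr);
  attr.config = PERF_COUNT_HW_INSTRUCTIONS;
  attr.sample_period = 1000000;  // nonzero, so the event is a sampling event
  attr.exclude_kernel = 1;
  attr.disabled = 1;
  return static_cast<int>(syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0));
}

// True if the kernel accepts `period` for this counter.
bool period_accepted(int fd, uint64_t period) {
  return ioctl(fd, PERF_EVENT_IOC_PERIOD, &period) == 0;
}

// Prints the result for the periods mentioned in this report.
void report_periods(int fd) {
  const uint64_t periods[] = {1, 13, 32};
  for (uint64_t p : periods) {
    std::printf("period %" PRIu64 ": %s\n", p,
                period_accepted(fd, p) ? "accepted" : "rejected");
  }
}
```

Calling `report_periods(open_probe_counter())` from a `main` function, after checking that the returned fd is not -1, would show whether the rejection of small periods reproduces without rr on the affected kernel and CPU.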
