Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trace divergence on recording of Firefox (Zen4?) #3778

Open
mgaudet opened this issue Jul 9, 2024 · 4 comments
Open

Trace divergence on recording of Firefox (Zen4?) #3778

mgaudet opened this issue Jul 9, 2024 · 4 comments

Comments

@mgaudet
Copy link

mgaudet commented Jul 9, 2024

rr built from 77f88f4.

Trace was happily humming away before failing with Assertion 'ticks_now == trace_ticks' failed to hold. ticks mismatch for 'SYSCALL: epoll_wait'; expected 77360501, got 77360395

Tail of trace dump ``` [FATAL src/ReplaySession.cc:1239:check_ticks_consistency()] (task 995006 (rec:983756) at time 1906187) -> Assertion `ticks_now == trace_ticks' failed to hold. ticks mismatch for 'SYSCALL: epoll_wait'; expected 77360501, got 77360395 Tail of trace dump: { real_time:41871.222892 global_time:1906167, event:`SYSCALLBUF_RESET' tid:985150, ticks:813092284 } { real_time:41871.222909 global_time:1906168, event:`SYSCALL: futex' (state:EXITING_SYSCALL) tid:983743, ticks:20576026562 rax:0x0 rbx:0x681fffa0 rcx:0xffffffffffffffff rdx:0x0 rsi:0x189 rdi:0x7676a0958644 rbp:0x7fffe89abe70 rsp:0x681ffe00 r8:0x0 r9:0xffffffff r10:0x0 r11:0x246 r12:0x0 r13:0x0 r14:0x7676a0958500 r15:0x7676a0958644 rip:0x70000002 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xca fs_base:0x7676b74fe0c0 gs_base:0x0 } { real_time:41871.223927 global_time:1906169, event:`SYSCALL: restart_syscall' (state:EXITING_SYSCALL) tid:983769, ticks:1129484 rax:0xfffffffffffffe00 rbx:0x7676993fffa0 rcx:0xffffffffffffffff rdx:0x0 rsi:0x189 rdi:0x76769fd4ea30 rbp:0x76769e480760 rsp:0x7676993ffe00 r8:0x0 r9:0xffffffff r10:0x0 r11:0x246 r12:0x0 r13:0x0 r14:0x76769fd4e9a0 r15:0x76769fd4ea30 rip:0x70000002 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xca fs_base:0x76769e4816c0 gs_base:0x0 } { real_time:41871.224018 global_time:1906170, event:`SIGNAL: SIGCHLD(async)' tid:983769, ticks:1129484 rax:0xfffffffffffffe00 rbx:0x7676993fffa0 rcx:0xffffffffffffffff rdx:0x0 rsi:0x189 rdi:0x76769fd4ea30 rbp:0x76769e480760 rsp:0x7676993ffe00 r8:0x0 r9:0xffffffff r10:0x0 r11:0x246 r12:0x0 r13:0x0 r14:0x76769fd4e9a0 r15:0x76769fd4ea30 rip:0x70000002 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xca fs_base:0x76769e4816c0 gs_base:0x0 st0:0x0 st1:0x0 st2:0x0 st3:0x0 st4:0x0 st5:0x0 st6:0x403ade0b6b3a76400012 st7:0x403ade0b6b3a76400000 ymm0:0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ymm1:0xffffffffffffffffffffffffffffffff ymm2:0x7676a093a16000007676a093a340 ymm3:0x10000000000007676a0908fe0 ymm4:0x7676a0908fe000007676a093a160 ymm5:0x7676a09e7d7000007676a0904b10 ymm6:0x7676a09049d00000000000000000 ymm7:0x7676a093a16000007676a0908fb0 ymm8:0x7474616d2f656d6f682f3a6e69622f70 ymm9:0xff000000ffff00000000000000 ymm10:0x0 ymm11:0x99999999999999999999999999999999 ymm12:0x20202020202020202020202020202020 ymm13:0x7676b2e3bbcb00007676b2e3bbf1 ymm14:0x7676b2e3bc1800007676b2e3bba8 ymm15:0x0 } { real_time:41871.224099 global_time:1906171, event:`SIGNAL_HANDLER: SIGCHLD(async)' tid:983769, ticks:1129484 rax:0x0 rbx:0x7676993fffa0 rcx:0xffffffffffffffff rdx:0x7676993ff200 rsi:0x7676993ff330 rdi:0x11 rbp:0x76769e480760 rsp:0x7676993ff1f8 r8:0x0 r9:0xffffffff r10:0x0 r11:0x246 r12:0x0 r13:0x0 r14:0x76769fd4e9a0 r15:0x76769fd4ea30 rip:0x7676a7c53480 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xca fs_base:0x76769e4816c0 gs_base:0x0 st0:0x0 st1:0x0 st2:0x0 st3:0x0 st4:0x0 st5:0x0 st6:0x0 st7:0x0 ymm0:0x0 ymm1:0x0 ymm2:0x0 ymm3:0x0 ymm4:0x0 ymm5:0x0 ymm6:0x0 ymm7:0x0 ymm8:0x0 ymm9:0x0 ymm10:0x0 ymm11:0x0 ymm12:0x0 ymm13:0x0 ymm14:0x0 ymm15:0x0 { tid:983769, addr:0x7676993ff1f8, length:0xe88 } } { real_time:41871.224208 global_time:1906172, event:`SYSCALLBUF_FLUSH' tid:983769, ticks:1129522 { syscall:'write', ret:0x1, size:0x10, desched:1 } } { real_time:41871.224218 global_time:1906173, event:`SYSCALL: rt_sigreturn' (state:ENTERING_SYSCALL) tid:983769, ticks:1129522 rax:0xffffffffffffffda rbx:0x7676993fffa0 rcx:0xffffffffffffffff rdx:0x1 rsi:0x7676993ff1d7 rdi:0x2 rbp:0x76769e480760 rsp:0x7676993ff200 r8:0x2 r9:0xffffffff r10:0x0 r11:0x246 r12:0x0 r13:0x0 r14:0x76769fd4e9a0 r15:0x76769fd4ea30 rip:0x7676b6e45329 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xf fs_base:0x76769e4816c0 gs_base:0x0 } { real_time:41871.224227 global_time:1906174, event:`SYSCALLBUF_RESET' tid:983769, ticks:1129522 } { real_time:41871.224259 global_time:1906175, event:`SYSCALL: rt_sigreturn' (state:EXITING_SYSCALL) tid:983769, ticks:1129522 rax:0xca rbx:0x7676993fffa0 rcx:0xffffffffffffffff rdx:0x0 rsi:0x189 rdi:0x76769fd4ea30 rbp:0x76769e480760 rsp:0x7676993ffe00 r8:0x0 r9:0xffffffff r10:0x0 r11:0x246 r12:0x0 r13:0x0 r14:0x76769fd4e9a0 r15:0x76769fd4ea30 rip:0x70000000 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xffffffffffffffff fs_base:0x76769e4816c0 gs_base:0x0 st0:0x0 st1:0x0 st2:0x0 st3:0x0 st4:0x0 st5:0x0 st6:0x403ade0b6b3a76400012 st7:0x403ade0b6b3a76400000 ymm0:0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ymm1:0xffffffffffffffffffffffffffffffff ymm2:0x7676a093a16000007676a093a340 ymm3:0x10000000000007676a0908fe0 ymm4:0x7676a0908fe000007676a093a160 ymm5:0x7676a09e7d7000007676a0904b10 ymm6:0x7676a09049d00000000000000000 ymm7:0x7676a093a16000007676a0908fb0 ymm8:0x7474616d2f656d6f682f3a6e69622f70 ymm9:0xff000000ffff00000000000000 ymm10:0x0 ymm11:0x99999999999999999999999999999999 ymm12:0x20202020202020202020202020202020 ymm13:0x7676b2e3bbcb00007676b2e3bbf1 ymm14:0x7676b2e3bc1800007676b2e3bba8 ymm15:0x0 } { real_time:41871.224328 global_time:1906176, event:`SYSCALL: restart_syscall' (state:ENTERING_SYSCALL) tid:983769, ticks:1129522 rax:0xffffffffffffffda rbx:0x7676993fffa0 rcx:0xffffffffffffffff rdx:0x0 rsi:0x189 rdi:0x76769fd4ea30 rbp:0x76769e480760 rsp:0x7676993ffe00 r8:0x0 r9:0xffffffff r10:0x0 r11:0x246 r12:0x0 r13:0x0 r14:0x76769fd4e9a0 r15:0x76769fd4ea30 rip:0x70000002 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xca fs_base:0x76769e4816c0 gs_base:0x0 } { real_time:41871.224446 global_time:1906177, event:`SYSCALL: epoll_wait' (state:EXITING_SYSCALL) tid:983756, ticks:77359468 rax:0x1 rbx:0x76769c5fffa0 rcx:0xffffffffffffffff rdx:0x20 rsi:0x7676a1584200 rdi:0x15 rbp:0x0 rsp:0x76769c5ffe00 r8:0x7676a08f49f0 r9:0x7676a08f49f0 r10:0xffffffff r11:0x246 r12:0xffffffff r13:0x70904098 r14:0x709040a8 r15:0x70904228 rip:0x70000002 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xe8 fs_base:0x7676a08f56c0 gs_base:0x0 { tid:983756, addr:0x7676a1584200, length:0xc } } { real_time:41871.224558 global_time:1906178, event:`SYSCALLBUF_FLUSH' tid:983756, ticks:77359854 { syscall:'clock_gettime', ret:0x0, size:0x20 } { syscall:'read', ret:0x1, size:0x11, desched:1 } { syscall:'read', ret:0xfffffffffffffff5, size:0x10, desched:1 } } { real_time:41871.224578 global_time:1906179, event:`SYSCALL: waitid' (state:ENTERING_SYSCALL) tid:983756, ticks:77359854 rax:0xffffffffffffffda rbx:0x76769c5fffa0 rcx:0xffffffffffffffff rdx:0x7676a08f4780 rsi:0xf0386 rdi:0x1 rbp:0x7676a08f46a0 rsp:0x76769c5ffe00 r8:0x0 r9:0x0 r10:0x1000005 r11:0x246 r12:0x7676b6db4e30 r13:0x1 r14:0x1000005 r15:0x7676a08f4780 rip:0x70000002 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xf7 fs_base:0x7676a08f56c0 gs_base:0x0 } { real_time:41871.224587 global_time:1906180, event:`SYSCALLBUF_RESET' tid:983756, ticks:77359854 } { real_time:41871.224633 global_time:1906181, event:`SYSCALL: waitid' (state:EXITING_SYSCALL) tid:983756, ticks:77359854 rax:0x0 rbx:0x76769c5fffa0 rcx:0xffffffffffffffff rdx:0x7676a08f4780 rsi:0xf0386 rdi:0x1 rbp:0x7676a08f46a0 rsp:0x76769c5ffe00 r8:0x0 r9:0x0 r10:0x1000005 r11:0x246 r12:0x7676b6db4e30 r13:0x1 r14:0x1000005 r15:0x7676a08f4780 rip:0x70000002 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xf7 fs_base:0x7676a08f56c0 gs_base:0x0 { tid:983756, addr:0x7676a08f4780, length:0x80 } } { real_time:41871.224734 global_time:1906182, event:`SYSCALL: waitid' (state:ENTERING_SYSCALL) tid:983756, ticks:77359880 rax:0xffffffffffffffda rbx:0x76769c5fffa0 rcx:0xffffffffffffffff rdx:0x7676a08f4780 rsi:0xf0386 rdi:0x1 rbp:0x7676a08f46a0 rsp:0x76769c5ffe00 r8:0x0 r9:0x0 r10:0x5 r11:0x246 r12:0x7676b6db4e30 r13:0x1 r14:0x7676a08f4780 r15:0x1 rip:0x70000002 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xf7 fs_base:0x7676a08f56c0 gs_base:0x0 } { real_time:41871.224965 global_time:1906183, event:`SYSCALL: waitid' (state:EXITING_SYSCALL) tid:983756, ticks:77359880 rax:0x0 rbx:0x76769c5fffa0 rcx:0xffffffffffffffff rdx:0x7676a08f4780 rsi:0xf0386 rdi:0x1 rbp:0x7676a08f46a0 rsp:0x76769c5ffe00 r8:0x0 r9:0x0 r10:0x5 r11:0x246 r12:0x7676b6db4e30 r13:0x1 r14:0x7676a08f4780 r15:0x1 rip:0x70000002 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xf7 fs_base:0x7676a08f56c0 gs_base:0x0 { tid:983756, addr:0x7676a08f4780, length:0x80 } } { real_time:41871.225075 global_time:1906184, event:`SYSCALL: rt_sigaction' (state:ENTERING_SYSCALL) tid:983756, ticks:77359923 rax:0xffffffffffffffda rbx:0x11 rcx:0xffffffffffffffff rdx:0x0 rsi:0x7676a08f4680 rdi:0x11 rbp:0x7676a08f47c0 rsp:0x7676a08f4680 r8:0x0 r9:0x0 r10:0x8 r11:0x246 r12:0x0 r13:0x1 r14:0x76766d72c980 r15:0x0 rip:0x7676b6e4540a eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xd fs_base:0x7676a08f56c0 gs_base:0x0 } { real_time:41871.225115 global_time:1906185, event:`SYSCALL: rt_sigaction' (state:EXITING_SYSCALL) tid:983756, ticks:77359923 rax:0x0 rbx:0x11 rcx:0xffffffffffffffff rdx:0x0 rsi:0x7676a08f4680 rdi:0x11 rbp:0x7676a08f47c0 rsp:0x7676a08f4680 r8:0x0 r9:0x0 r10:0x8 r11:0x246 r12:0x0 r13:0x1 r14:0x76766d72c980 r15:0x0 rip:0x7676b6e4540a eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xd fs_base:0x7676a08f56c0 gs_base:0x0 } { real_time:41871.225215 global_time:1906186, event:`SYSCALLBUF_FLUSH' tid:983756, ticks:77360501 { syscall:'epoll_wait', ret:0x0, size:0x10 } } { real_time:41871.225226 global_time:1906187, event:`SYSCALL: epoll_wait' (state:ENTERING_SYSCALL) tid:983756, ticks:77360501 rax:0xffffffffffffffda rbx:0x76769c5fffa0 rcx:0xffffffffffffffff rdx:0x20 rsi:0x7676a1584200 rdi:0x15 rbp:0x0 rsp:0x76769c5ffe00 r8:0x2 r9:0x7676a08f49f0 r10:0xffffffff r11:0x246 r12:0xffffffff r13:0x70904020 r14:0x70904030 r15:0x709041b0 rip:0x70000002 eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0xe8 fs_base:0x7676a08f56c0 gs_base:0x0 } { real_time:41871.225236 global_time:1906188, event:`SYSCALLBUF_RESET' tid:983756, ticks:77360501 } ```

Discovered after a Pernosco trace failed to process (mozilla/user/[email protected]/1720541975/FIaAP0DAj3A)

Machine is a AMD Ryzen Threadripper PRO 7975WX 32-Cores. Zen workaround was set using the python script.

@mgaudet
Copy link
Author

mgaudet commented Jul 9, 2024

(Re-running rr replay -p 983862 -a also fails with same assert and same tick failures)

@mgaudet
Copy link
Author

mgaudet commented Jul 9, 2024

weird. I managed to complete an interactive replay tho.

@khuey
Copy link
Collaborator

khuey commented Jul 9, 2024

Yeah we see occasional divergences on Pernosco with traces uploaded from Zen 4 users. There's some unexplained issue here.

@mgaudet
Copy link
Author

mgaudet commented Jul 9, 2024

Yeah, I wasn't sure this would be a helpful report -- but thought I should have it on file. Let me know if you want anything from me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants