Minitube starts normally, and videos are searched and played correctly, but the screen freezes completely (without artifacts) after a variable amount of time. When the computer freezes, the mouse stops moving, the keyboard no longer interacts with the graphical environment, audio and video stop playing, and I cannot switch to a text console (e.g. with CTRL+ALT+F1). Fortunately, the Magic SysRq key [1] still works, so I can get kernel stack traces and/or shut down the crippled computer.
This is an example of a kernel call trace generated during a lockup:
feb 14 09:11:18 kernel: INFO: task Xorg:715 blocked for more than 120 seconds.
feb 14 09:11:18 kernel: Not tainted 6.1.0-3-amd64 #1 Debian 6.1.8-1
feb 14 09:11:18 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
feb 14 09:11:18 kernel: task:Xorg state:D stack:0 pid:715 ppid:678 flags:0x00400004
feb 14 09:11:18 kernel: Call Trace:
feb 14 09:11:18 kernel: <TASK>
feb 14 09:11:18 kernel: __schedule+0x351/0xa20
feb 14 09:11:18 kernel: schedule+0x5d/0xe0
feb 14 09:11:18 kernel: schedule_preempt_disabled+0x14/0x30
feb 14 09:11:18 kernel: __ww_mutex_lock.constprop.0+0x577/0x9e0
feb 14 09:11:18 kernel: ? _raw_spin_unlock+0x15/0x30
feb 14 09:11:18 kernel: drm_modeset_lock+0x8d/0xd0 [drm]
feb 14 09:11:18 kernel: drm_crtc_get_sequence_ioctl+0xe8/0x1a0 [drm]
feb 14 09:11:18 kernel: ? drm_wait_vblank_ioctl+0x770/0x770 [drm]
feb 14 09:11:18 kernel: drm_ioctl_kernel+0xc9/0x170 [drm]
feb 14 09:11:18 kernel: drm_ioctl+0x1e7/0x450 [drm]
feb 14 09:11:18 kernel: ? drm_wait_vblank_ioctl+0x770/0x770 [drm]
feb 14 09:11:18 kernel: nouveau_drm_ioctl+0x56/0xb0 [nouveau]
feb 14 09:11:18 kernel: __x64_sys_ioctl+0x90/0xd0
feb 14 09:11:18 kernel: do_syscall_64+0x5b/0xc0
feb 14 09:11:18 kernel: ? fpregs_assert_state_consistent+0x22/0x50
feb 14 09:11:18 kernel: ? exit_to_user_mode_prepare+0x171/0x1c0
feb 14 09:11:18 kernel: ? syscall_exit_to_user_mode+0x17/0x40
feb 14 09:11:18 kernel: ? do_syscall_64+0x67/0xc0
feb 14 09:11:18 kernel: ? syscall_exit_to_user_mode+0x17/0x40
feb 14 09:11:18 kernel: ? do_syscall_64+0x67/0xc0
feb 14 09:11:18 kernel: ? fpregs_assert_state_consistent+0x22/0x50
feb 14 09:11:18 kernel: ? exit_to_user_mode_prepare+0x171/0x1c0
feb 14 09:11:18 kernel: ? syscall_exit_to_user_mode+0x17/0x40
feb 14 09:11:18 kernel: ? do_syscall_64+0x67/0xc0
feb 14 09:11:18 kernel: ? do_syscall_64+0x67/0xc0
feb 14 09:11:18 kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
feb 14 09:11:18 kernel: RIP: 0033:0x7fe1f211d5f7
feb 14 09:11:18 kernel: RSP: 002b:00007ffc4a5898f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
feb 14 09:11:18 kernel: RAX: ffffffffffffffda RBX: 00007ffc4a589930 RCX: 00007fe1f211d5f7
feb 14 09:11:18 kernel: RDX: 00007ffc4a589930 RSI: 00000000c018643b RDI: 0000000000000010
feb 14 09:11:18 kernel: RBP: 00000000c018643b R08: 000055c05be81990 R09: 0000000000000000
feb 14 09:11:18 kernel: R10: 000055c05c5213c0 R11: 0000000000000246 R12: 00007ffc4a589a50
feb 14 09:11:18 kernel: R13: 0000000000000010 R14: 000055c05bd92500 R15: 00007ffc4a589990
feb 14 09:11:18 kernel: </TASK>
feb 14 09:11:18 kernel: INFO: task kworker/u4:3:8083 blocked for more than 120 seconds.
feb 14 09:11:18 kernel: Not tainted 6.1.0-3-amd64 #1 Debian 6.1.8-1
feb 14 09:11:18 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
feb 14 09:11:18 kernel: task:kworker/u4:3 state:D stack:0 pid:8083 ppid:2 flags:0x00004000
feb 14 09:11:18 kernel: Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
feb 14 09:11:18 kernel: Call Trace:
feb 14 09:11:18 kernel: <TASK>
feb 14 09:11:18 kernel: __schedule+0x351/0xa20
feb 14 09:11:18 kernel: schedule+0x5d/0xe0
feb 14 09:11:18 kernel: schedule_timeout+0x118/0x150
feb 14 09:11:18 kernel: dma_fence_default_wait+0x1a5/0x260
feb 14 09:11:18 kernel: ? __bpf_trace_dma_fence+0x10/0x10
feb 14 09:11:18 kernel: dma_fence_wait_timeout+0x108/0x130
feb 14 09:11:18 kernel: drm_atomic_helper_wait_for_fences+0x82/0xe0 [drm_kms_helper]
feb 14 09:11:18 kernel: nv50_disp_atomic_commit_tail+0x8d/0x8e0 [nouveau]
feb 14 09:11:18 kernel: ? _raw_spin_unlock+0x15/0x30
feb 14 09:11:18 kernel: ? finish_task_switch.isra.0+0x9b/0x300
feb 14 09:11:18 kernel: process_one_work+0x1c7/0x380
feb 14 09:11:18 kernel: worker_thread+0x4d/0x380
feb 14 09:11:18 kernel: ? rescuer_thread+0x3a0/0x3a0
feb 14 09:11:18 kernel: kthread+0xe9/0x110
feb 14 09:11:18 kernel: ? kthread_complete_and_exit+0x20/0x20
feb 14 09:11:18 kernel: ret_from_fork+0x22/0x30
feb 14 09:11:18 kernel: </TASK>
As you can see, the X server (Xorg) locks up because the nouveau kernel module locks up.
The issue is always reproducible. Every time I start playing a video with Minitube, the kernel immediately starts reporting errors trapped by the GPU, for example:
Feb 14 09:07:10 kernel: nouveau 0000:01:00.0: gr: TRAP_PROP - TP 0 - 00001000 [RT_LINEAR_MISMATCH] - Address 0000000000
feb 14 09:07:10 kernel: nouveau 0000:01:00.0: gr: TRAP_PROP - TP 0 - e0c: 00000000, e18: 00000000, e1c: 05500010, e20: 00001100, e24: 00020070
feb 14 09:07:10 kernel: nouveau 0000:01:00.0: gr: TRAP_PROP - TP 1 - 00001000 [RT_LINEAR_MISMATCH] - Address 0000000000
feb 14 09:07:10 kernel: nouveau 0000:01:00.0: gr: TRAP_PROP - TP 1 - e0c: 00000000, e18: 00000000, e1c: 05500000, e20: 00001100, e24: 00020070
feb 14 09:07:10 kernel: nouveau 0000:01:00.0: gr: 00200000 [] ch 8 [001f0f6000 Xorg[715]] subc 3 class 8297 mthd 1b0c data 0000f010
feb 14 09:07:10 kernel: nouveau 0000:01:00.0: fb: trapped write at 0020870000 on channel 8 [1f0f6000 Xorg[715]] engine 00 [PGRAPH] client 0b [PROP] subclient 00 [RT0] reason 00000002 [PAGE_NOT_PRESENT]
feb 14 09:07:31 kernel: nouveau 0000:01:00.0: firmware: direct-loading firmware nouveau/nv84_xuc103
feb 14 09:07:31 kernel: nouveau 0000:01:00.0: firmware: direct-loading firmware nouveau/nv84_xuc00f
feb 14 09:07:31 kernel: nouveau 0000:01:00.0: gr: TRAP_PROP - TP 0 - 00000040 [RT_FAULT] - Address 0021e70080
feb 14 09:07:31 kernel: nouveau 0000:01:00.0: gr: TRAP_PROP - TP 0 - e0c: 00000000, e18: 00000000, e1c: 06ac0020, e20: 00001100, e24: 00030000
feb 14 09:07:31 kernel: nouveau 0000:01:00.0: gr: TRAP_PROP - TP 1 - 00000040 [RT_FAULT] - Address 0021e700c0
feb 14 09:07:31 kernel: nouveau 0000:01:00.0: gr: TRAP_PROP - TP 1 - e0c: 00000000, e18: 00000000, e1c: 06ac0030, e20: 00001100, e24: 00030000
feb 14 09:07:31 kernel: nouveau 0000:01:00.0: gr: 00200000 [] ch 8 [001f0f6000 Xorg[715]] subc 3 class 8297 mthd 1b0c data 0000f010
feb 14 09:07:31 kernel: nouveau 0000:01:00.0: fb: trapped write at 0021e700c0 on channel 8 [1f0f6000 Xorg[715]] engine 00 [PGRAPH] client 0b [PROP] subclient 00 [RT0] reason 00000002 [PAGE_NOT_PRESENT]
These errors fill the system logs as soon as Minitube starts playing. As the log above shows, the GPU traps errors and repeatedly tries to recover (reloading the GPU firmware as well), but after a certain number of recoveries it hangs, and then the X server hangs, too.
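To compare runs (for example, before and after changing an mpv option), it may help to count the trapped errors by reason instead of reading the raw log. The following is a minimal Python sketch, assuming the log lines have the same shape as the excerpt above; the regular expressions and the `count_traps` helper are illustrative and not part of nouveau, Minitube, or mpv.

```python
import re
from collections import Counter

# Patterns derived from the log excerpt above (an assumption: other kernel
# versions may format these messages differently).
GR_RE = re.compile(r"gr: (TRAP_\w+) - TP \d+ - [0-9a-f]+ \[(\w+)\]")
FB_RE = re.compile(r"fb: trapped (\w+) at [0-9a-f]+ .* reason [0-9a-f]+ \[(\w+)\]")


def count_traps(lines):
    """Count nouveau trap messages, keyed by (source, reason)."""
    counts = Counter()
    for line in lines:
        gr = GR_RE.search(line)
        if gr:
            counts[(gr.group(1), gr.group(2))] += 1
            continue
        fb = FB_RE.search(line)
        if fb:
            counts[("fb " + fb.group(1), fb.group(2))] += 1
    return counts


# A few lines copied from the log above; register dump lines are ignored.
SAMPLE = [
    "Feb 14 09:07:10 kernel: nouveau 0000:01:00.0: gr: TRAP_PROP - TP 0 - 00001000 [RT_LINEAR_MISMATCH] - Address 0000000000",
    "feb 14 09:07:10 kernel: nouveau 0000:01:00.0: gr: TRAP_PROP - TP 0 - e0c: 00000000, e18: 00000000, e1c: 05500010, e20: 00001100, e24: 00020070",
    "feb 14 09:07:10 kernel: nouveau 0000:01:00.0: gr: TRAP_PROP - TP 1 - 00001000 [RT_LINEAR_MISMATCH] - Address 0000000000",
    "feb 14 09:07:10 kernel: nouveau 0000:01:00.0: fb: trapped write at 0020870000 on channel 8 [1f0f6000 Xorg[715]] engine 00 [PGRAPH] client 0b [PROP] subclient 00 [RT0] reason 00000002 [PAGE_NOT_PRESENT]",
    "feb 14 09:07:31 kernel: nouveau 0000:01:00.0: gr: TRAP_PROP - TP 0 - 00000040 [RT_FAULT] - Address 0021e70080",
]

for (source, reason), n in sorted(count_traps(SAMPLE).items()):
    print(f"{source:12} {reason:22} {n}")
```

The same function could be fed the kernel messages saved from `journalctl -k` to see whether a given change reduces or removes the traps.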
I reproduced the issue with both Linux kernel version 5.10.165 and version 6.1.8, so newer kernels are affected, too.
Minitube clearly triggers a malfunction in the nouveau kernel module (the open source kernel module provided by the Linux kernel for NVIDIA graphics cards).
Minitube uses mpv as its backend for playing audio/video. I tried modifying some of the parameters used to initialize mpv in lib/media/src/mpv/mediampv.cpp (the "vo" option, for example), but I had no luck: it still triggers the nouveau malfunctions.
I would like to identify and replicate the commands Minitube uses to drive the backend player: can you help me?
If we identify the offending commands sent to the video player backend, Minitube could be configured not to use them. Furthermore, the issue could be reported upstream to the Linux kernel developers with reference to the backend program.
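One way to narrow this down outside Minitube might be to run standalone mpv with different option combinations and check which ones produce the nouveau traps. The sketch below only builds and prints the command lines; it does not run them. It uses standard mpv options (`--no-config`, `--vo`, `--hwdec`, `--log-file`), but the chosen values, the `mpv_commands` helper, and the `sample.mp4` file name are assumptions for illustration. They are not the settings Minitube actually passes, which are set in lib/media/src/mpv/mediampv.cpp.

```python
import itertools
import shlex

# Candidate values to try. Whether any of them is related to the lockup is
# an assumption to be tested, not a known cause.
VIDEO_OUTPUTS = ["gpu", "xv", "x11"]
HWDEC_MODES = ["no", "auto"]


def mpv_commands(media, log_dir="."):
    """Yield one mpv command line per (vo, hwdec) combination."""
    for vo, hwdec in itertools.product(VIDEO_OUTPUTS, HWDEC_MODES):
        log_file = f"{log_dir}/mpv-{vo}-{hwdec}.log"
        yield [
            "mpv",
            "--no-config",             # ignore the user's mpv configuration
            f"--vo={vo}",              # video output driver under test
            f"--hwdec={hwdec}",        # hardware decoding on or off
            f"--log-file={log_file}",  # keep a per-run mpv log
            media,
        ]


# "sample.mp4" is a hypothetical local file used as a placeholder.
for command in mpv_commands("sample.mp4"):
    print(shlex.join(command))
```

Running each printed command in turn while watching the kernel log would show whether the traps depend on the video output driver, on hardware decoding, or on neither.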
computer-enthusiastic changed the title from "Minitube locks ups the GPU with nouveau kernel module on Debian GNU/Linux" to "Minitube locks up the GPU with nouveau kernel module on Debian GNU/Linux" on Feb 18, 2023
Hello,
I'm experiencing computer freezes during Minitube playback due to a GPU lockup with the nouveau kernel module on Debian GNU/Linux (using an NVIDIA graphics card).
I'm currently using Minitube version 3.9.3 on Debian Stable (11.6) for the AMD64 architecture, with the following graphics configuration:
Let me know if you need more information.
[1] https://it.wikipedia.org/wiki/Magic_Sys_Req