RuntimeError:unable to find a valid cuDNN algorithm to convolution #110

HWH-2019 · 2023-11-20T08:58:35Z

When I train with multiple GPUs on one machine, I get the following error:
RuntimeError: CUDA error: an illegal memory access was encountered

When I set CUDA_LAUNCH_BLOCKING=1, I get more information about the error:

RuntimeError: unable to find a valid cuDNN algorithm to convolution
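For anyone trying to reproduce this, here is a minimal sketch of turning on synchronous kernel launches from inside a Python entry point rather than from the shell. It assumes the variable is set before the first CUDA call in the process. The helper name `enable_sync_cuda_launches` is made up for illustration and is not part of PyTorch or this repository.

```python
import os


def enable_sync_cuda_launches(env=os.environ):
    """Set CUDA_LAUNCH_BLOCKING=1 in the given environment mapping.

    With this set, CUDA kernel launches run synchronously, so an error is
    reported at the call that triggered it instead of at a later,
    unrelated call. Training will be slower while it is enabled.
    """
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


# Call this at the very top of the script, before CUDA is initialized
# (safest is before `import torch`).
enable_sync_cuda_launches()
```

The shell form `CUDA_LAUNCH_BLOCKING=1 python your_script.py` has the same effect; `your_script.py` is a placeholder, not a file from this project.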

I found that others have run into this problem, but no good solution has been posted. I also noticed that the problem does not occur when training the MOTIFS model.

Is this a problem with the PSGFormer model code itself, or is it caused by the GPUs not having enough VRAM, as suggested elsewhere on the Internet?

And if VRAM is the cause, why didn't I get a "CUDA out of memory" error instead?
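One way to test the VRAM hypothesis is to log per-GPU memory right before the failing step. This is only a sketch: the function name `gpu_memory_report` is hypothetical, and the idea that cuDNN can fail to obtain workspace memory and surface that as the error above instead of an out-of-memory error is an unconfirmed assumption for this issue, not something established in this thread.

```python
def gpu_memory_report():
    """Return one dict per visible GPU with memory figures in MiB.

    Returns an empty list when PyTorch is not installed or no CUDA
    device is available, so it is safe to call anywhere.
    """
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []

    mib = 1024 ** 2
    report = []
    for idx in range(torch.cuda.device_count()):
        total = torch.cuda.get_device_properties(idx).total_memory
        report.append({
            "device": idx,
            "total_mib": total // mib,
            # Memory currently occupied by tensors on this device.
            "allocated_mib": torch.cuda.memory_allocated(idx) // mib,
            # Memory held by PyTorch's caching allocator (>= allocated).
            "reserved_mib": torch.cuda.memory_reserved(idx) // mib,
        })
    return report


for row in gpu_memory_report():
    print(row)
```

If the reserved figure is close to the total just before the crash, lowering the batch size or image resolution would be a cheap way to check whether memory is the cause.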

My runtime environment is as follows:

# system
GPU RTX3090
cuda 12.0
cudnn 11.8

# run-time
pytorch==1.7.1
torchvision==0.8.2
torchaudio==0.7.2
cudatoolkit=11.0
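The versions above can be cross-checked from inside Python, since the CUDA and cuDNN versions that the PyTorch build actually uses may differ from the ones installed system-wide. A minimal sketch follows; the helper name `describe_torch_build` is hypothetical, and it returns a placeholder when PyTorch is not installed.

```python
def describe_torch_build():
    """Collect the versions PyTorch itself reports, as a plain dict."""
    try:
        import torch
    except ImportError:
        return {"torch": None}

    cudnn_ok = torch.backends.cudnn.is_available()
    info = {
        "torch": torch.__version__,
        # CUDA version the PyTorch binary was built against (None for CPU builds).
        "built_with_cuda": torch.version.cuda,
        # cuDNN version as an integer, or None if cuDNN is unavailable.
        "cudnn": torch.backends.cudnn.version() if cudnn_ok else None,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["device_name"] = torch.cuda.get_device_name(0)
        info["compute_capability"] = torch.cuda.get_device_capability(0)
    return info


print(describe_torch_build())
```

Posting this output alongside the package list makes it easier for others to tell whether the GPU, the CUDA build, and cuDNN are a matching combination.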

Jingkang50 · 2023-12-12T09:02:20Z

Sorry, I cannot come up with a solution to this problem. Have you solved it?
