
"Device to shape host node should not be folded into myelin" failure of TensorRT 10.5 when running trtexec on GPU L4 #4210

Open
sean-xiang-applovin opened this issue Oct 18, 2024 · 1 comment
Labels
Export: torch.onnx (https://pytorch.org/docs/stable/onnx.html), internal-bug-tracked, triaged (Issue has been triaged by maintainers)

Comments

@sean-xiang-applovin

Description

I am trying to compile our model with TensorRT, and it fails. I have traced the problem to our embedding layer, which uses torch.nn.EmbeddingBag. To isolate it, I created a mini model containing just that layer and exported it to ONNX with the TorchScript exporter. Running trtexec on the exported ONNX graph produces this error:

[10/18/2024-18:35:44] [E] Error[2]: [myelinBuilderUtils.cpp::getMyelinSupportType::1087] Error Code 2: Internal Error (Assertion result != MyelinSupportType::kREQUIRES_MYELIN || n.getType() != NodeType::kDEVICE_TO_SHAPE_HOST failed. Device to shape host node should not be folded into myelin.)
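For reference, the mini model is roughly the following (a sketch; the layer sizes and names here are illustrative, the exact module is in the notebook inside demo.zip):

import torch

class MiniModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # EmbeddingBag reduces (here: sums) the embeddings of each bag of indices
        self.bag = torch.nn.EmbeddingBag(num_embeddings=1000, embedding_dim=64, mode="sum")

    def forward(self, indices, offsets):
        return self.bag(indices, offsets)

model = MiniModel().eval()
indices = torch.randint(0, 1000, (8,))  # flat list of indices
offsets = torch.tensor([0, 4])          # start offset of each bag

# TorchScript-based exporter (the default torch.onnx.export path)
torch.onnx.export(model, (indices, offsets), "demo_embedding_bag_ts.onnx", opset_version=17)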

If I instead export the model with the dynamo exporter, I see this error:

[10/18/2024-18:51:46] [E] [TRT] ModelImporter.cpp:942: ERROR: onnxOpCheckers.cpp:993 In function checkSequenceEmpty: [8] false
[10/18/2024-18:51:46] [E] [TRT] ModelImporter.cpp:942: ERROR: onnxOpCheckers.cpp:901 In function checkConcatFromSequence: [8] false
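The dynamo export is essentially a one-liner along these lines (again a sketch; on PyTorch 2.5 the same path can also be reached via torch.onnx.export with dynamo=True):

onnx_program = torch.onnx.dynamo_export(model, indices, offsets)
onnx_program.save("demo_embedding_bag_dynamo.onnx")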

For the TorchScript-exported ONNX graph, I have tried to use Polygraphy to bisect where it goes wrong. It seems there is something wrong with the Loop subgraph, but I don't know which node in the loop triggers the "Device to shape host node" error.
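In case it helps with reproducing the bisection, I used Polygraphy's debug reduce subtool, roughly like this (polygraphy_debug.onnx is the per-iteration artifact the debug tools write by default; exact flags may differ across Polygraphy versions):

polygraphy debug reduce demo_embedding_bag_ts.onnx -o reduced.onnx \
    --check trtexec --onnx=polygraphy_debug.onnx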

Could anyone help take a look, or share any tools I can use to dig deeper? I would expect torch.nn.EmbeddingBag to be supported smoothly, so I wonder how other people handle this layer. Thanks.

Environment

TensorRT Version: 10.5

NVIDIA GPU: L4

NVIDIA Driver Version: 560.35.03

CUDA Version: 12.6

CUDNN Version: 9.5

Operating System: Ubuntu 24.04

Python Version (if applicable): 3.11.9

Tensorflow Version (if applicable):

PyTorch Version (if applicable): 2.5.0

Baremetal or Container (if so, version):

Relevant Files

Repro artifacts: demo.zip

It contains:

  • a notebook showing how I set up the mini model, export it to ONNX, and run trtexec on it
  • two exported ONNX graphs, one from the TorchScript exporter and one from the dynamo exporter
  • an ONNX data file produced by the dynamo export

Steps To Reproduce

Commands or scripts:

Unzip demo.zip and run the embedding_bag_repo.ipynb notebook,

or run trtexec directly on the ONNX files included in the zip:

trtexec --onnx=demo_embedding_bag_dynamo.onnx
trtexec --onnx=demo_embedding_bag_ts.onnx

Have you tried the latest release?:

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):

@yuanyao-nv added the Export: torch.onnx, triaged, and internal-bug-tracked labels on Oct 19, 2024
@yuanyao-nv (Collaborator)

Thanks for reporting this.
The dynamo exporter is still in a beta state, but ideally the exported graph should not contain sequence ops if it is only concatenating/splitting tensors. Please raise an issue with the PyTorch team about this.
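One quick way to confirm whether sequence ops made it into the dynamo export is to scan the graph with the onnx Python package, e.g. (a sketch, assuming the dynamo-exported file from demo.zip):

import onnx

model = onnx.load("demo_embedding_bag_dynamo.onnx")
# Collect sequence-related ops (SequenceEmpty, SequenceInsert, ConcatFromSequence, ...)
seq_ops = sorted({n.op_type for n in model.graph.node if "Sequence" in n.op_type})
print(seq_ops)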
For the TorchScript-exported model, I have filed an internal bug to track this. Will keep you updated.
