
tts onnx gpu inference time problems #70

Open
1nlplearner opened this issue Dec 2, 2022 · 2 comments

Comments

@1nlplearner

hi, i met a problem with onnx inference on gpu:

  1. ONNX inference on GPU is much slower than ONNX inference on CPU, and only sometimes faster than PyTorch GPU inference (about a 2x acceleration).
  2. When I run inference on the same text twice or more, inference reaches about a 2x acceleration compared to PyTorch GPU inference (see the timing sketch after this list).

Any advice?
Thanks
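
A minimal sketch of this measurement, assuming a standalone exported ONNX TTS model; the file name `tts_model.onnx` and the dummy phoneme-ID input are placeholders, not the actual espnet_onnx API:

```python
import time

import numpy as np
import onnxruntime as ort

# Placeholder model path; espnet_onnx manages its own sessions internally,
# but the timing pattern is the same for any InferenceSession.
sess = ort.InferenceSession(
    "tts_model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

# Dummy phoneme-ID input; the real name, shape, and dtype depend on the
# exported model.
ids = np.random.randint(0, 70, size=(50,), dtype=np.int64)
feed = {sess.get_inputs()[0].name: ids}

for i in range(5):
    start = time.perf_counter()
    sess.run(None, feed)
    print(f"run {i}: {time.perf_counter() - start:.3f} s")
# Run 0 is typically much slower: the CUDA provider selects kernels and
# allocates its GPU memory arena lazily on the first call.
```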
@Masao-Someki
Collaborator

@1nlplearner
On the first execution, onnxruntime takes a longer time for inference (one-time kernel selection and memory allocation), so please skip the first execution when measuring.
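
In other words, treat the first call as a warm-up. A sketch of that pattern, reusing the placeholder `sess` and `feed` from the sketch above:

```python
# Warm up once to trigger the one-time initialization, then benchmark
# only the subsequent calls.
sess.run(None, feed)  # warm-up; output discarded

n = 10
start = time.perf_counter()
for _ in range(n):
    sess.run(None, feed)
print(f"steady-state latency: {(time.perf_counter() - start) / n:.3f} s")
```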

@1nlplearner
Author

> @1nlplearner On the first execution, onnxruntime takes a longer time for inference (one-time kernel selection and memory allocation), so please skip the first execution when measuring.

i think this is the reason: [Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.]
How can I solve it?
thanks
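
To see exactly which nodes fell back to the CPU provider, one option is to raise onnxruntime's log verbosity and enable profiling; a sketch with a placeholder model path (not espnet_onnx-specific):

```python
import onnxruntime as ort

# Verbose logging prints per-node placement decisions; profiling writes a
# JSON trace that records the execution provider used for each node.
so = ort.SessionOptions()
so.log_severity_level = 1   # 1 = info; 0 = verbose for full placement logs
so.enable_profiling = True

sess = ort.InferenceSession(
    "tts_model.onnx",       # placeholder path
    sess_options=so,
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
# ... sess.run(...) as usual ...
print("trace written to", sess.end_profiling())
```

Note that, as the warning itself says, ORT intentionally assigns shape-related ops to the CPU, so a handful of CPU-placed nodes is expected and usually harmless; a large slowdown is more likely the first-run initialization above, or repeated CPU-GPU copies if bigger subgraphs fall back.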
