Decoding speed and accuracy on the transformed onnx model #42
Hi @yangyi0818, Thank you for reporting the issue!
And about the second point, I would like to know the following information:
The latest Conformer-related issue is not yet fixed, and I'm trying to solve it!
Hi @Masao-Someki! Thank you for your kind reply!
About the first point:
What is your device? CPU or GPU?
Am I right that your model was constructed with Conformer encoder and Transformer decoder?
Did you use LM for the inference?
There are two Conformer blocks in ESPnet, the legacy and the latest versions. Which block did you use?
I see quantization is applied to your model. Did you execute your quantized model on GPU?
About the second point:
Thank you!
What is your torch version?
Hi @rookie0607
Regarding the slow speed, can you check how many cores are loaded when you run inference with ONNX? I suspect it could be related.
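One rough way to check core usage, sketched here with only the Python standard library. The function name `cpu_load_summary` is hypothetical, and the load average is a system-wide figure (not specific to the decoding process) that is only available on Unix-like systems, so treat it as a first hint rather than a precise measurement. Call it once while decoding is running and once while the machine is idle, then compare.

```python
import os


def cpu_load_summary():
    """Report logical core count and the 1-minute system load average.

    The load average is system-wide and Unix-only; on platforms where it
    is unavailable, the load fields are returned as None.
    """
    cores = os.cpu_count() or 1  # os.cpu_count() may return None
    try:
        load_1min = os.getloadavg()[0]
    except (AttributeError, OSError):
        load_1min = None
    per_core = None if load_1min is None else load_1min / cores
    return {
        "logical_cores": cores,
        "load_1min": load_1min,
        "load_per_core": per_core,
    }


if __name__ == "__main__":
    print(cpu_load_summary())
```

A `load_per_core` value well above 1 during decoding would suggest that more threads are runnable than there are cores.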
@joazoa
Currently, there is no script to limit the number of threads, but the thread counts can be set through the session options when the encoder session is created:
import onnxruntime as ort

# Restrict onnxruntime to a single thread, both across and within operators.
sess_options = ort.SessionOptions()
sess_options.inter_op_num_threads = 1
sess_options.intra_op_num_threads = 1
self.encoder = ort.InferenceSession(
    self.config.quantized_model_path,
    providers=providers,
    sess_options=sess_options,
)
@Masao-Someki Thank you!
Hi, thanks for sharing the espnet_onnx system!
I ran into two problems when I tried to run inference with your code. My acoustic model (AM) was trained by myself on our own dataset, and its architecture is a typical Conformer. I downloaded this code in June.
First, decoding is too slow. When decoding with torch, the RTF is around 2.32; with the converted ONNX model, it becomes around 20.
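For reference, the real-time factor (RTF) is the processing time divided by the audio duration, so an RTF of 20 means 20 seconds of computation per second of audio. A minimal sketch of how it might be measured follows; `decode_fn` is a hypothetical stand-in for whichever torch or ONNX decoding call is being timed, and the helper names are illustrative, not part of espnet_onnx.

```python
import time


def real_time_factor(processing_seconds, audio_seconds):
    """RTF = time spent decoding / duration of the decoded audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(decode_fn, samples, sample_rate):
    """Time one call to decode_fn(samples) and return its RTF.

    decode_fn is a hypothetical placeholder for the decoding call under
    test; samples is a sequence of audio samples at sample_rate Hz.
    """
    start = time.perf_counter()
    decode_fn(samples)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, len(samples) / sample_rate)
```

Timing both backends on the same utterances, with the same thread settings, keeps the comparison fair.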
Second, the CER is 7.8% with the torch version but 10.6% with ONNX. I think something is probably wrong.
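To make sure both numbers are computed the same way, the character error rate (CER) can be recomputed with one scorer for both backends. The sketch below uses the usual definition, total character-level edit distance divided by total reference characters. The function names are hypothetical, and removing spaces before scoring is an assumption that may differ from the scoring script actually used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert/delete/substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: summed edit distance / summed reference length.

    Assumption: spaces are removed before scoring.
    """
    errors = 0
    chars = 0
    for ref, hyp in zip(references, hypotheses):
        ref = ref.replace(" ", "")
        hyp = hyp.replace(" ", "")
        errors += edit_distance(ref, hyp)
        chars += len(ref)
    if chars == 0:
        raise ValueError("references contain no characters")
    return errors / chars
```

Running the same function over the torch and ONNX hypotheses, and diffing the utterances where they disagree, can show whether the gap comes from the model outputs or from differences in scoring.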
I'm giving some configs here:
export.py
And I get an onnx dir structured like:
asr/onnx/speech2text/
    config.yaml
    feats_stats.npz
    full/
    quantize/
The test wav is a filelist, structured as:
The decoding process is:
decode.py
Furthermore, I noticed that in the latest issue you mentioned there may be some problems with Conformer acoustic models for ASR. Has that been fixed?
Looking forward to your reply!