
Failed to convert Espnet2 model using external (Fbank+Pitch) features to ONNX format #92

Open
PhenixCFLi opened this issue Jun 1, 2023 · 11 comments


@PhenixCFLi

I am testing espnet_onnx by training an ESPnet2 model with the "librispeech_100" recipe.
The ONNX conversion works for a model trained with the default features (fbank), which is good.

But when I change the feature_type from default to "fbank_pitch", which pre-generates fbank+pitch features with the Kaldi extractor and uses them for the subsequent training and decoding, an error pops up when I use espnet_onnx to convert the trained model.

Could you suggest what the problem is and how to fix it?

The command I am using is:
python3 -m espnet_onnx.export --model_type asr --tag conformer_ext_feature --input asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_valid.acc.ave.zip

And here is the error:
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/espnet_onnx/export/__main__.py", line 91, in <module>
    m.export_from_zip(
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/espnet_onnx/export/asr/export_asr.py", line 191, in export_from_zip
    self.export(model, tag_name, quantize, optimize)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/espnet_onnx/export/asr/export_asr.py", line 91, in export
    self._export_encoder(enc_model, export_dir, verbose)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/espnet_onnx/export/asr/export_asr.py", line 246, in _export_encoder
    self._export_model(model, verbose, path)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/espnet_onnx/export/asr/export_asr.py", line 226, in _export_model
    torch.onnx.export(
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/onnx/__init__.py", line 350, in export
    return utils.export(
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/onnx/utils.py", line 163, in export
    _export(
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/onnx/utils.py", line 1074, in _export
    graph, params_dict, torch_out = _model_to_graph(
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/onnx/utils.py", line 727, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/onnx/utils.py", line 602, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/onnx/utils.py", line 517, in _trace_and_get_graph_from_model
    trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/jit/_trace.py", line 1175, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
    graph, out = torch._C._create_graph_by_tracing(
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/jit/_trace.py", line 118, in wrapper
    outs.append(self.inner(*trace_inputs))
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1118, in _slow_forward
    result = self.forward(*input, **kwargs)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/espnet_onnx/export/asr/models/encoders/conformer.py", line 104, in forward
    xs_pad, mask = self.embed(feats, mask)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1118, in _slow_forward
    result = self.forward(*input, **kwargs)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/espnet_onnx/export/asr/models/language_models/embed.py", line 74, in forward
    return self.model(x, mask)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1118, in _slow_forward
    result = self.forward(*input, **kwargs)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/espnet_onnx/export/asr/models/language_models/subsampling.py", line 48, in forward
    x = self.out(x.transpose(1, 2).contiguous().view(b, t, c * f))
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1118, in _slow_forward
    result = self.forward(*input, **kwargs)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1118, in _slow_forward
    result = self.forward(*input, **kwargs)
File "/home/cli/miniconda3/envs/espnet-curr-test/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (24x4864 and 5120x256)
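The final RuntimeError comes from F.linear: the traced activation and the trained weight disagree on their inner dimension. A minimal NumPy sketch of just that shape check (the 24x4864 and 5120x256 shapes are copied from the error above; this is an illustration, not the espnet_onnx code path):

```python
import numpy as np

x = np.zeros((24, 4864), dtype=np.float32)   # flattened conv output seen during tracing
w = np.zeros((5120, 256), dtype=np.float32)  # weight shape the trained Linear layer expects

try:
    y = x @ w  # same inner-dimension requirement that F.linear enforces
except ValueError:
    print("mat1 and mat2 shapes cannot be multiplied")
```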

@Masao-Someki
Collaborator

@PhenixCFLi
Thank you for reporting the issue! I will check the details of this problem.
Just for clarification, could you tell me the versions of the following modules:

  • onnx
  • espnet_onnx
  • espnet
  • pytorch

@PhenixCFLi
Author

PhenixCFLi commented Jun 5, 2023

Thank you for your response; here is the info:

  • onnx 1.13.1
  • onnxruntime 1.14.1
  • espnet-onnx 0.1.10
  • espnet 202211
  • torch 1.12.1+cu113

Please feel free to let me know if further information is required

@neso613

neso613 commented Jun 8, 2023

Hi @PhenixCFLi

Have you encountered this error:
ModuleNotFoundError: No module named 'espnet_model_zoo.downloader'; 'espnet_model_zoo' is not a package
Could you help me solve it?

@Masao-Someki
Collaborator

Hi @neso613,
Try reinstalling the latest espnet_model_zoo. If you still get the same error, please consider posting your issue on the espnet_model_zoo repository.

@PhenixCFLi
Author

PhenixCFLi commented Jun 9, 2023

I think it is not related to espnet_model_zoo.
The key issue is the error message below,

RuntimeError: mat1 and mat2 shapes cannot be multiplied (24x4864 and 5120x256)

which means the input to the embed.out linear layer below has the wrong size, because the input feature dimension is set to 80 by default.

    (embed): Conv2dSubsampling(
      (conv): Sequential(
        (0): Conv2d(1, 256, kernel_size=(3, 3), stride=(2, 2))
        (1): ReLU()
        (2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2))
        (3): ReLU()
      )
      (out): Sequential(
        (0): Linear(in_features=5120, out_features=256, bias=True)
        (1): RelPositionalEncoding(
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
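The numbers are consistent with a feature-dimension mismatch: Conv2dSubsampling runs two kernel-3, stride-2 convolutions along the feature axis and then flattens channels x reduced-feature-dim into the Linear input. A quick sketch of that arithmetic, assuming 80-dim default fbank versus 83-dim fbank+pitch (80 fbank + 3 pitch, which is my assumption about the Kaldi extractor output):

```python
def conv_out(f, kernel=3, stride=2):
    # Output length of one unpadded Conv2d along the feature axis.
    return (f - kernel) // stride + 1

def linear_in_features(feats_dim, channels=256):
    # Two subsampling convolutions, then flatten channels * features.
    return channels * conv_out(conv_out(feats_dim))

print(linear_in_features(80))  # 4864: what an 80-dim dummy input produces
print(linear_in_features(83))  # 5120: what the trained Linear layer expects
```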

After checking, I found that a model using external features (a model with a NULL frontend) is not officially / fully supported (at least in the version I am using), i.e.:

  1. There is no official way to configure the feature dimension externally.
  2. A frontend model must exist during decoding/conversion, which does not hold when the model uses external features.

Could you confirm whether my understanding is correct, or suggest something for the two points above?

Thank you.
Best rgds,
Phenix, 09JUN2023

@Masao-Someki
Collaborator

Thank you @PhenixCFLi, I think I understand your issue now.
Would you set feats_dim to your feature dimension?

from espnet_onnx.export import ASRModelExport
m = ASRModelExport()
m.set_export_config(
    max_seq_len=5000,
    feats_dim=85
)
m.export_from_pretrained(tag_name, quantize=False, optimize=False)

The feats_dim config will set the feature dimension for encoder input.

@PhenixCFLi
Author

Thanks @Masao-Someki
I have already done that part.
The thing that is blocking me is in "espnet_onnx/asr/model/encoders/encoder.py", line 32:

self.frontend = Frontend(self.config.frontend, providers, use_quantized)

It does not allow the frontend module to be absent, so there is no way to use pre-generated features without a frontend.
Do you know of any way to skip this, or of a frontend that simply passes through the pre-generated input features?

@Masao-Someki
Collaborator

Sorry for the late reply @PhenixCFLi, now I get it!
You are right; we need to skip that part when we have external features.
I think there is no way to skip it right now, so I will fix it this weekend.

@PhenixCFLi
Author

@Masao-Someki Thank you so much

@Yuanyuan-888

Hi, has this been fixed? I still get this kind of error.

If I set feats_dim via m.set_export_config, it shows the following error:

Traceback (most recent call last):
  File "/home/yzhang96/convert_zip.py", line 5, in <module>
    m.export_from_zip(
  File "/home/yzhang96/.conda/envs/myenv/lib/python3.9/site-packages/espnet_onnx/export/asr/export_asr.py", line 197, in export_from_zip
    self.export(model, tag_name, quantize, optimize)
  File "/home/yzhang96/.conda/envs/myenv/lib/python3.9/site-packages/espnet_onnx/export/asr/export_asr.py", line 88, in export
    encoder=enc_model.get_model_config(model.asr_model, export_dir)
  File "/home/yzhang96/.conda/envs/myenv/lib/python3.9/site-packages/espnet_onnx/export/asr/models/encoder_wrapper.py", line 76, in get_model_config
    frontend=get_frontend_config(
  File "/home/yzhang96/.conda/envs/myenv/lib/python3.9/site-packages/espnet_onnx/export/asr/get_config.py", line 94, in get_frontend_config
    raise ValueError("Currently only s3prl is supported.")
ValueError: Currently only s3prl is supported.

Otherwise it shows the previous matrix-shape mismatch error.

@Masao-Someki
Collaborator

@PhenixCFLi
Sorry for the late reply; I just created the PR to add export/inference without the frontend module.
After the PR has been merged, you can use your custom features like this:

import numpy as np

y = np.random.rand(100, 85)  # your feature
onnx_output = onnx_model(y)
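For completeness, a small sketch of preparing such a pre-extracted feature matrix (the frame count and 85-dim shape mirror the snippet above; using float32 as the input dtype is my assumption about what the exported ONNX model expects):

```python
import numpy as np

# Pre-generated features for one utterance: (frames, feat_dim).
# feat_dim must match the feats_dim passed to set_export_config.
feats = np.random.rand(100, 85).astype(np.float32)

assert feats.ndim == 2 and feats.shape[1] == 85
print(feats.shape)  # (100, 85)
```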
