SelfSupervised-Speech-models-repo

Resources collected for a presentation on self-supervised speech models.

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units (Facebook)
https://arxiv.org/abs/2106.07447
https://huggingface.co/docs/transformers/model_doc/hubert
https://ai.facebook.com/blog/hubert-self-supervised-representation-learning-for-speech-recognition-generation-and-compression/
https://blog.devgenius.io/hubert-explained-6ec7c2bf71fc
https://jonathanbgn.com/2021/10/30/hubert-visually-explained.html (* This is very good)
https://github.com/facebookresearch/av_hubert
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing (Microsoft)
https://arxiv.org/abs/2110.13900
https://huggingface.co/docs/transformers/model_doc/wavlm
https://github.com/microsoft/unilm/tree/master/wavlm
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations (Facebook)
https://arxiv.org/abs/2006.11477
https://huggingface.co/masoudmzb/wav2vec2-xlsr-multilingual-53-fa
https://github.com/Hamtech-ai/wav2vec2-fa
https://jonathanbgn.com/2021/09/30/illustrated-wav2vec-2.html (* This is very good)
https://aws.amazon.com/blogs/machine-learning/fine-tune-and-deploy-a-wav2vec2-model-for-speech-recognition-with-hugging-face-and-amazon-sagemaker/
https://arxiv.org/abs/2107.13530
https://arxiv.org/abs/2104.01027 (Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training)
https://huggingface.co/blog/fine-tune-xlsr-wav2vec2 (* This is very good)
https://huggingface.co/models?arxiv=arxiv:2104.01027
https://arxiv.org/abs/2101.06699 (Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for Low-resource Speech Recognition)
https://pytorch.org/tutorials/intermediate/speech_recognition_pipeline_tutorial.html
UniSpeechSAT: Self-Supervised Learning for Speech Recognition with Intermediate Layer Supervision (Microsoft)
https://arxiv.org/abs/2112.08778
https://github.com/microsoft/UniSpeech
Compare models:

https://superbbenchmark.org/leaderboard?subset=Public+Set (SUPERB benchmark)
Other:
https://arxiv.org/pdf/2110.05777.pdf (Large-scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification)
https://arxiv.org/pdf/2206.01685.pdf (Toward a realistic model of speech processing in the brain with self-supervised learning) (* This is important)
https://syncedreview.com/2019/02/22/yann-lecun-cake-analogy-2-0/
Implementation guides:
https://github.com/facebookresearch/fairseq/tree/main/examples/wav2vec
https://huggingface.co/docs/transformers/model_doc/wavlm#transformers.WavLMForXVector
https://colab.research.google.com/github/m3hrdadfi/notebooks/blob/main/Fine_Tune_XLSR_Wav2Vec2_on_Persian_ShEMO_ASR_with_%F0%9F%A4%97_Transformers_ipynb.ipynb

afshari-maryam/SelfSupervised-models-repo
