Matlab and Python libraries for an unsupervised method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method.

chorowski-lab/rVAD

 
 

rVAD

Description

Matlab and Python libraries for an unsupervised method for robust voice activity detection (rVAD) or speech activity detection (SAD), as presented in rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method.

The rVAD method consists of two passes of denoising followed by a VAD stage. It has been applied as a preprocessor for speech recognition, speaker identification, language identification, age and gender identification, human-robot interaction, and audio archive segmentation, among other tasks. More information is available on the rVAD webpage.

Source code for rVAD:

Source code in Matlab for rVAD (including rVAD-fast) is available under the rVAD2.0 folder. To use it, call the function vad.m. Some Matlab functions from the publicly available VoiceBox, along with modified versions of them, are included with kind permission of Mike Brookes.

Source code in Python for rVAD-fast is available under the rVADfast_py_2.0 folder.

Reference VAD for Aurora 2 database:

The frame-by-frame reference VAD was generated from the clean set of Aurora 2 using forced-alignment speech recognition and has been used as a 'ground truth' for evaluating VAD algorithms. Our study shows that forced-alignment ASR performs as well as a human expert labeler in generating VAD references, as detailed in Comparison of Forced-Alignment Speech Recognition and Humans for Generating Reference VAD. The generated reference VAD is available for both the training set and the test set.
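
As an illustration of how a frame-by-frame reference can serve as ground truth, the sketch below compares a VAD output against reference labels and reports frame-level error rates. It assumes both are already loaded as equal-length sequences of 0 (non-speech) and 1 (speech) at the same frame rate. The function name `frame_error_rates` and the label values are hypothetical and are not part of this repository, and the file format of the reference VAD is not covered here.

```python
def frame_error_rates(reference, predicted):
    """Compare per-frame VAD labels (0 = non-speech, 1 = speech).

    Hypothetical helper for illustration; not part of the rVAD code.
    Returns the overall frame error rate, the miss rate (speech frames
    labelled as non-speech) and the false-alarm rate (non-speech frames
    labelled as speech).
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    if len(reference) == 0:
        raise ValueError("label sequences must not be empty")

    misses = 0        # reference says speech, prediction says non-speech
    false_alarms = 0  # reference says non-speech, prediction says speech
    n_speech = 0      # number of speech frames in the reference

    for ref, hyp in zip(reference, predicted):
        ref, hyp = bool(ref), bool(hyp)
        if ref:
            n_speech += 1
            if not hyp:
                misses += 1
        elif hyp:
            false_alarms += 1

    n_total = len(reference)
    n_nonspeech = n_total - n_speech
    return {
        "frame_error_rate": (misses + false_alarms) / n_total,
        # Rates are reported as 0.0 when the reference has no frames of that class.
        "miss_rate": misses / n_speech if n_speech else 0.0,
        "false_alarm_rate": false_alarms / n_nonspeech if n_nonspeech else 0.0,
    }


# Toy labels for illustration only; not taken from Aurora 2.
reference = [0, 1, 1, 1, 0, 0, 1, 0]
predicted = [0, 1, 1, 0, 0, 1, 1, 0]
rates = frame_error_rates(reference, predicted)
```

Reporting misses and false alarms separately, rather than a single error figure, shows whether a detector tends to clip speech or to let noise through.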
