Student Project of the Machine Learning course by Herbert Jaeger at Groningen University. We identify speakers on the Japanese Vowels dataset using bidirectional LSTMs in combination with cubic polynomial resampling.

postnubilaphoebus/Jap_Speak_Recog

Jap_Speak_Recog

Project of the Machine Learning course at RUG
Dataset: https://archive.ics.uci.edu/ml/datasets/Japanese+Vowels
Google Doc documenting the workflow (not cleaned up): https://docs.google.com/document/d/1O5-qIBvy6kEe87fou5AGQ7VA5wKAOFOWSFszcSkFB5I/edit?usp=sharing

This project used: Python 3.8.5, Keras 2.4.3, scipy 1.3.1, scikit-learn 0.24.1

Abstract

In this work, we present the results of our Machine Learning project on the Japanese Vowels dataset. This real-life dataset contains spectral recordings of utterances of the Japanese vowels /ae/, recorded from nine male speakers. The task is to match each multidimensional time series with the correct speaker. We compared various preprocessing methods in conjunction with state-of-the-art classifiers and found that resampling the recordings to an equal length using cubic spline interpolation significantly improves classification performance across all classification models. The best performance was obtained by an ensemble of 11 separately trained Long Short-Term Memory networks, applied to recordings resampled via cubic spline interpolation to an equal length of 26 time steps, yielding a training accuracy of 99.82% and a testing accuracy of 98.86%.
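The resampling step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `resample_recording`, the normalized time axis, and the 12-channel random stand-in data are all assumptions made for the example. Only the target length of 26 time steps and the use of cubic spline interpolation come from the abstract.

```python
import numpy as np
from scipy.interpolate import CubicSpline

TARGET_LEN = 26  # equal length named in the abstract


def resample_recording(recording, target_len=TARGET_LEN):
    """Resample a (time_steps, channels) array to target_len time steps.

    Hypothetical helper: fits one cubic spline per channel over a
    normalized time axis and evaluates it on an evenly spaced grid.
    """
    recording = np.asarray(recording, dtype=float)
    n_steps = recording.shape[0]
    # Map original and target sample positions onto [0, 1] so that
    # recordings of different lengths share the same time axis.
    t_old = np.linspace(0.0, 1.0, n_steps)
    t_new = np.linspace(0.0, 1.0, target_len)
    # axis=0 interpolates along time, independently for each channel.
    spline = CubicSpline(t_old, recording, axis=0)
    return spline(t_new)


# Stand-in data (assumption): three recordings of different lengths,
# 12 channels each, filled with random values instead of real data.
rng = np.random.default_rng(0)
recordings = [rng.normal(size=(n, 12)) for n in (7, 19, 29)]

# After resampling, all recordings stack into one fixed-size batch.
batch = np.stack([resample_recording(r) for r in recordings])
print(batch.shape)  # (3, 26, 12)
```

Because every recording ends up with the same shape, the batch can be passed to a recurrent classifier without padding or masking. The spline passes through the original samples, so the first and last time steps of each recording are preserved.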
