Development of methods, algorithms of machine learning for Kazakh speech recognition

Date

2021

Publisher

Suleyman Demirel University

Abstract

Automatic speech recognition (ASR) is approaching its full potential, driven mainly by the significance of the field and the rapid growth of available speech data. However, comparable progress has not been observed for low-resource languages such as Kazakh, because there is not enough data and no automated tool for collecting it. All existing speech recognition models for the Kazakh language are based on datasets that were collected locally and manually, which leaves these datasets private and inaccessible to other researchers. Manual speech data collection requires a lot of time and effort, and the resulting data may still fail to meet quality requirements. Low-quality speech data can considerably degrade the performance of a speech recognition model. This research focuses on two important steps in building a reliable automatic speech recognition system for the Kazakh language: 1) construction of an automated speech data collection system, and 2) application of transfer learning to a recurrent neural network. The automatic data collection system is based on a website that segments the audio data and groups the audio files, with their corresponding transcriptions, by speaker. As a result, the system produced around 100 hours of speech data whose structure is suitable for training a neural network. The transfer learning technique is based on a Russian speech recognition model, whose weights are all transferred to the neural network built for the Kazakh language. Using transfer learning, a multilingual automatic speech recognition model was obtained that outperforms a simple LSTM-based model by 32% and a BiLSTM model by 24% (in terms of Label Error Rate).
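
The abstract reports results in terms of Label Error Rate but does not define the metric. As an illustration only, the sketch below assumes one common definition: the total edit distance between predicted and reference label sequences, divided by the total number of reference labels. The dissertation may normalize differently (for example, per utterance), and the function names `edit_distance` and `label_error_rate` are hypothetical, not taken from the work.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two label sequences (strings or lists)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def label_error_rate(references, hypotheses):
    """Assumed definition: total edits divided by total reference labels."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_labels = sum(len(r) for r in references)
    if total_labels == 0:
        raise ValueError("references contain no labels")
    total_edits = sum(edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    return total_edits / total_labels
```

With character labels, `label_error_rate(["сәлем"], ["салем"])` counts one substitution over five reference labels. Lower values are better, so a model that "outperforms" another on this metric has the lower rate.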

Keywords

speech recognition, automatic system, transfer learning technique, PhD dissertation

Citation

Kuanyshbay Darkhan / Development of methods, algorithms of machine learning for Kazakh speech recognition / 6D070400 – «Computer Systems and Software» / 2021