Discriminative Piecewise Linear Transformation Based on Deep Neural Networks for Noise Robust Automatic Speech Recognition
- KASHIWAGI Yosuke, The University of Tokyo
- SAITO Daisuke, The University of Tokyo
- MINEMATSU Nobuaki, The University of Tokyo
- HIROSE Keikichi, The University of Tokyo
Bibliographic Information
- Other Title: 雑音環境下音声認識のためのディープニューラルネットワークを用いた識別的区分線形変換
Abstract
In this paper, we propose the use of deep neural networks to extend statistical feature enhancement methods based on piecewise linear transformation. The proposed method estimates clean speech features from noisy speech features to achieve noise robustness in automatic speech recognition. First, we model the distribution of clean features as a Gaussian mixture model; then, using deep neural networks, we discriminatively estimate which distribution in the clean space an input noisy feature corresponds to. As a result, the model gains both discriminative ability and generalization ability. Experimental evaluations on the Aurora-2 dataset show that, compared with stereo-based piecewise linear compensation for environments (SPLICE), a conventional piecewise linear transformation approach, the proposed method reduces the word error rate (WER) by 53.72% relative under known noise conditions.
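The piecewise linear transformation that the abstract builds on can be sketched as a posterior-weighted sum of per-region affine transforms: x_hat = sum_k p(k | y) (A_k y + b_k), where y is a noisy feature vector and k indexes the mixture components. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the function names, array shapes, and the random logits standing in for a DNN output layer are all hypothetical. In the proposed method the posteriors p(k | y) would come from a trained deep neural network rather than from random values.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def piecewise_linear_enhance(y, posteriors, A, b):
    """Estimate clean features from noisy ones (hypothetical helper).

    y:          (T, D) noisy feature frames
    posteriors: (T, K) region posteriors p(k | y_t), rows sum to 1
    A:          (K, D, D) per-region linear transforms
    b:          (K, D) per-region bias vectors
    returns:    (T, D) enhanced feature frames
    """
    # Affine estimate of every frame under every region: shape (T, K, D).
    per_region = np.einsum("kij,tj->tki", A, y) + b[None, :, :]
    # Weight each region's estimate by its posterior and sum over regions.
    return np.einsum("tk,tkd->td", posteriors, per_region)


# Toy data; all values are made up for illustration.
rng = np.random.default_rng(0)
T, D, K = 5, 3, 4
y = rng.normal(size=(T, D))
logits = rng.normal(size=(T, K))  # assumption: stand-in for DNN outputs
post = softmax(logits)
A = np.stack([np.eye(D)] * K)     # identity transforms: bias-only correction
b = rng.normal(size=(K, D))

x_hat = piecewise_linear_enhance(y, post, A, b)
```

With identity matrices for every A_k, the transform reduces to adding a posterior-weighted bias to each frame, which makes the toy example easy to check by hand.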
Journal
-
- 電子情報通信学会論文誌D 情報・システム
-
電子情報通信学会論文誌D 情報・システム J99-D (3), 255-263, 2016-03-01
The Institute of Electronics, Information and Communication Engineers
Details
- CRID: 1390846637104433664
- ISSN: 18810225, 18804535
- Text Lang: ja
- Data Source: JaLC
- Abstract License Flag: Disallowed