Discriminative Piecewise Linear Transformation Based on Deep Neural Networks for Noise Robust Automatic Speech Recognition

Bibliographic Information

Other Title
  • 雑音環境下音声認識のためのディープニューラルネットワークを用いた識別的区分線形変換

Abstract

In this paper, we propose using deep neural networks to extend statistical feature enhancement methods based on piecewise linear transformation. The proposed method estimates clean speech features from noisy speech features to achieve noise robustness in automatic speech recognition. First, we model the distribution of clean features as a Gaussian mixture model; then, using deep neural networks, we discriminatively estimate which distribution in the clean space an input noisy feature corresponds to. As a result, the model gains both discriminative ability and generalization ability. Experimental evaluations on the Aurora-2 dataset demonstrate that, compared with stereo-based piecewise linear compensation for environments (SPLICE), one of the conventional piecewise linear transformation approaches, the proposed method reduces the word error rate (WER) by 53.72% relative in the known noise condition.
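The abstract does not give the transformation itself, so the following is only a minimal sketch of a generic posterior-weighted piecewise linear transformation of the SPLICE family, not the authors' implementation. It assumes the enhanced feature is a sum of per-component affine transforms of the noisy feature, weighted by component posteriors. The names `piecewise_linear_enhance`, `A`, `b`, and `logits` are hypothetical, and the random `logits` merely stand in for the output layer of a trained deep neural network.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def piecewise_linear_enhance(y, posteriors, A, b):
    """Posterior-weighted piecewise linear transformation (illustrative).

    y          : (T, D) noisy feature frames
    posteriors : (T, K) component posteriors per frame, rows sum to 1
    A          : (K, D, D) per-component linear transforms (assumed form)
    b          : (K, D) per-component bias vectors (assumed form)
    returns    : (T, D) estimated clean features
    """
    # Affine transform of every frame under every component: (T, K, D)
    per_component = np.einsum("kij,tj->tki", A, y) + b[None, :, :]
    # Weight each component's estimate by its posterior and sum over K
    return np.einsum("tk,tki->ti", posteriors, per_component)


# Toy dimensions: 5 frames, 3-dimensional features, 4 mixture components
rng = np.random.default_rng(0)
T, D, K = 5, 3, 4
y = rng.normal(size=(T, D))

# Stand-in for DNN outputs; a trained network would produce these logits
logits = rng.normal(size=(T, K))
posteriors = softmax(logits)

# Identity transforms and zero biases leave the input unchanged,
# which serves as a simple sanity check of the weighting
A = np.tile(np.eye(D), (K, 1, 1))
b = np.zeros((K, D))
x_hat = piecewise_linear_enhance(y, posteriors, A, b)
```

In this sketch the only difference between a generative and a discriminative variant is where `posteriors` comes from: a Gaussian mixture model evaluated on the noisy feature, or a network trained to predict the component directly.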
