Discriminative Piecewise Linear Transformation Based on Deep Neural Networks for Noise Robust Automatic Speech Recognition

KASHIWAGI Yosuke, SAITO Daisuke, MINEMATSU Nobuaki, HIROSE Keikichi

doi:10.14923/transinfj.2015pdp0009

Bibliographic Information

Other Title

雑音環境下音声認識のためのディープニューラルネットワークを用いた識別的区分線形変換

Abstract

In this paper, we propose using deep neural networks to extend statistical feature enhancement methods based on piecewise linear transformation. The proposed method estimates clean speech features from noisy speech features to achieve noise robustness in automatic speech recognition. We first model the distribution of clean features as a Gaussian mixture model and then use deep neural networks to discriminatively estimate which distribution in the clean space an input noisy feature corresponds to. As a result, the model gains both discriminative ability and generalization ability. Experimental evaluations on the Aurora-2 dataset demonstrate that, compared with stereo-based piecewise linear compensation for environments (SPLICE), one of the conventional piecewise linear transformation approaches, the proposed method reduces the word error rate (WER) by 53.72% relative under known noise conditions.
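As a rough illustration of the piecewise linear transformation idea mentioned in the abstract, the sketch below computes an enhanced feature as a posterior-weighted sum of per-component affine transforms, which is the general form used by SPLICE-style methods. It is not the authors' implementation: the function names, the toy matrices, and the use of a plain softmax over fixed scores in place of a trained deep neural network are all assumptions made for illustration, and the paper's exact formulation may differ.

```python
import math


def softmax(scores):
    """Turn raw scores into posterior-like weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def piecewise_linear_enhance(y, posteriors, A, b):
    """Hypothetical helper: x_hat = sum_k p(k | y) * (A[k] y + b[k]).

    y          : noisy feature vector (list of floats)
    posteriors : weight p(k | y) for each component k
    A, b       : per-component affine parameters (assumed already trained)
    """
    dim = len(y)
    x_hat = [0.0] * dim
    for p_k, A_k, b_k in zip(posteriors, A, b):
        for i in range(dim):
            affine_i = sum(A_k[i][j] * y[j] for j in range(dim)) + b_k[i]
            x_hat[i] += p_k * affine_i
    return x_hat


# Toy example with two components in a 2-dimensional feature space.
y = [1.0, 2.0]
A = [
    [[1.0, 0.0], [0.0, 1.0]],  # component 0: identity transform
    [[0.5, 0.0], [0.0, 0.5]],  # component 1: scale by 0.5
]
b = [
    [0.0, 0.0],
    [1.0, 1.0],
]

# Stand-in for the network output: equal scores give equal weights.
posteriors = softmax([0.0, 0.0])
x_hat = piecewise_linear_enhance(y, posteriors, A, b)
```

In this toy setup the two components contribute equally, so the result is the average of the two affine outputs. In a discriminative variant such as the one the abstract describes, the weights would instead come from a network trained to predict which clean-space component the noisy input corresponds to.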

Journal

電子情報通信学会論文誌D 情報・システム (IEICE Transactions on Information and Systems, Japanese Edition)

電子情報通信学会論文誌D 情報・システム J99-D (3), 255-263, 2016-03-01

The Institute of Electronics, Information and Communication Engineers

Details

CRID: 1390846637104433664

DOI: 10.14923/transinfj.2015pdp0009

ISSN: 18810225; 18804535

Text Lang: ja

Data Source

JaLC

Abstract License Flag: Disallowed
