Phoneme recognition in continuous speech using feature selection based on mutual information

Description

Publisher Summary: To construct a large-vocabulary continuous speech recognition system, it is very important to develop a highly reliable phoneme recognizer. Phoneme characteristics that are reliable enough for phoneme discrimination do not necessarily correspond to the acoustic features obtained by short-time analysis of a single frame. Therefore, the temporal pattern of the features over a suitable length should be considered, and the various kinds of contextual effects should be organized as compactly as possible in order to classify phonemes. This chapter discusses an optimal statistical method for recognizing phonemes in continuous speech. The novelty of the method is that it evaluates the effectiveness of the acoustic features at each acoustic level, using as its criterion the mutual information between the acoustic feature vectors and the phoneme labels assigned to the speech wave. The proposed method adopts the power and its variational pattern, and the LPC Mel-Cepstrum and its pattern of temporal change, as the acoustic features. Multi-level clustering is suitable for discriminating phonemes because it detects the most reliable features in each context and uses an effective combination of the various acoustic characteristics.
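The mutual-information criterion mentioned in the summary can be illustrated with a minimal sketch. The summary does not say how the chapter estimates mutual information, so the following assumes that each frame's feature vector has already been quantized to a discrete code (for example, a cluster index) and uses a simple count-based estimate of I(X; Y) in bits. The function name `mutual_information` and the toy data are hypothetical and are not taken from the chapter.

```python
import math
from collections import Counter


def mutual_information(codes, labels):
    """Count-based estimate of I(X; Y) in bits.

    codes  -- one discrete feature code per frame (assumed: quantized features)
    labels -- the phoneme label assigned to the same frame
    """
    if len(codes) != len(labels):
        raise ValueError("codes and labels must have the same length")
    n = len(codes)
    if n == 0:
        raise ValueError("need at least one observation")

    joint = Counter(zip(codes, labels))  # counts of (code, label) pairs
    code_counts = Counter(codes)         # marginal counts of codes
    label_counts = Counter(labels)       # marginal counts of labels

    mi = 0.0
    for (code, label), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        ratio = count * n / (code_counts[code] * label_counts[label])
        mi += (count / n) * math.log2(ratio)
    return mi


# Toy data: four frames labeled with two phonemes.
labels = ["a", "a", "i", "i"]
informative = [0, 0, 1, 1]    # code fully determines the label
uninformative = [0, 1, 0, 1]  # code is independent of the label

score_informative = mutual_information(informative, labels)
score_uninformative = mutual_information(uninformative, labels)
```

Under this criterion, a feature whose codes separate the phoneme labels scores higher than one whose codes are independent of them, which is the sense in which mutual information can rank candidate features. Real feature vectors are continuous, so a practical estimate would depend on the quantization or density model chosen.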
