Performance evaluation of noisy shouted speech detection based on acoustic model with rahmonic and mel-frequency cepstrum coefficients
- Fukumori, Takahiro (Ritsumeikan Univ.)
- Nakayama, Masato (Ritsumeikan Univ.)
- Nishiura, Takanobu (Ritsumeikan Univ.)
- Nanjo, Hiroaki (Kyoto Univ.)
Bibliographic Information
- Other Title
- Rahmonicとメルケプストラムを用いた音響モデルに基づく騒音環境下叫び声検出の性能評価
- ポスター講演 Rahmonicとメルケプストラムを用いた音響モデルに基づく騒音環境下叫び声検出の性能評価
- ポスター コウエン Rahmonic ト メルケプストラム オ モチイタ オンキョウ モデル ニ モトズク ソウオン カンキョウ カ サケビ コエ ケンシュツ ノ セイノウ ヒョウカ
Description
This paper describes a method that combines mel-frequency cepstrum coefficients (MFCCs) with rahmonic features to robustly detect shouted speech in noisy environments. MFCCs collectively make up the mel-frequency cepstrum, and a rahmonic represents a subharmonic of the fundamental frequency in the cepstrum domain. In our previous method, a Gaussian mixture model (GMM) was constructed from the proposed features, which were extracted from training data containing many normal and shouted speech samples. In this paper, evaluation experiments on noisy shouted speech detection were conducted using not only GMMs but also hidden Markov models (HMMs) and deep neural networks (DNNs). The results show that MFCCs and rahmonics were effective for representing an utterance mechanism that includes both the vocal tract and the vocal cords. In addition, the DNN achieved higher performance in noisy environments than the GMM and the HMM.
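As a rough illustration of what a rahmonic feature might look like, the sketch below computes the real cepstrum of one speech frame, locates the peak at the fundamental period, and then reads the cepstral peak near twice that quefrency. This record does not give the paper's exact feature definition, so the function names, the 80 to 500 Hz pitch search range, and the choice of the second peak as "the rahmonic" are all assumptions made for illustration only.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    windowed = frame * np.hanning(len(frame))
    spectrum = np.fft.rfft(windowed)
    log_mag = np.log(np.abs(spectrum) + 1e-10)  # small floor avoids log(0)
    return np.fft.irfft(log_mag, n=len(frame))

def pitch_and_rahmonic(frame, sample_rate, f0_min=80.0, f0_max=500.0, tol=2):
    """Return (q0, amp0, q1, amp1) for one frame.

    q0/amp0: quefrency (in samples) and amplitude of the pitch peak.
    q1/amp1: quefrency and amplitude of the peak near 2 * q0.
    The search range and tolerance are illustrative assumptions.
    """
    cep = real_cepstrum(frame)
    q_lo = int(sample_rate / f0_max)
    q_hi = min(int(sample_rate / f0_min), len(cep) // 2)
    q0 = q_lo + int(np.argmax(cep[q_lo:q_hi + 1]))
    # Look for the next cepstral peak around twice the pitch quefrency.
    lo = max(2 * q0 - tol, 0)
    hi = min(2 * q0 + tol, len(cep) // 2)
    q1 = lo + int(np.argmax(cep[lo:hi + 1]))
    return q0, float(cep[q0]), q1, float(cep[q1])

# Usage on a synthetic voiced frame: an impulse train with an 80-sample
# period, i.e. a 200 Hz fundamental at a 16 kHz sampling rate.
sample_rate = 16000
frame = np.zeros(1024)
frame[::80] = 1.0
q0, amp0, q1, amp1 = pitch_and_rahmonic(frame, sample_rate)
print("pitch quefrency:", q0, "second peak quefrency:", q1)
```

In a detector of the kind the abstract describes, values such as `amp0` and `amp1` would presumably be appended to the MFCC vector of each frame before training a GMM, HMM, or DNN; how the paper actually combines them is not stated in this record.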
Journal
- IEICE technical report : 信学技報
IEICE technical report : 信学技報 116 (477), 283-286, 2017-03
電子情報通信学会 (The Institute of Electronics, Information and Communication Engineers)
Details
- CRID: 1050282810831348096
- NII Article ID: 120006382541, 40021159057, 40021161193, 40021160127
- NII Book ID: AA1123312X
- ISSN: 09135685
- HANDLE: 2433/228957
- Text Lang: ja
- Article Type: journal article
- Data Source: IRDB, NDL, CiNii Articles