Improving speech emotion dimensions estimation using a three-layer model of human perception

  • Elbarougy Reda
Japan Advanced Institute of Science and Technology (JAIST); Department of Mathematics, Faculty of Science, Damietta University
  • Akagi Masato
    Japan Advanced Institute of Science and Technology (JAIST)


Abstract

Most previous studies using the dimensional approach have focused mainly on the direct relationship between acoustic features and emotion dimensions (valence, activation, and dominance). However, the acoustic features that correlate with the valence dimension are few, and their correlations are weak; as a result, valence has been particularly difficult to predict. The purpose of this research is to construct a speech emotion recognition system that can precisely estimate the values of emotion dimensions, especially valence. This paper proposes a three-layer model to improve the estimation of emotion dimensions from acoustic features. The proposed model consists of three layers: emotion dimensions in the top layer, semantic primitives in the middle layer, and acoustic features in the bottom layer. First, a top-down acoustic feature selection method based on this model was applied to select the most relevant acoustic features for each emotion dimension. Then, a bottom-up method was used to estimate the values of emotion dimensions from acoustic features: a fuzzy inference system (FIS) first estimates the degree of each semantic primitive from the acoustic features, and a second FIS then estimates the values of the emotion dimensions from the estimated degrees of the semantic primitives. The experimental results reveal that the emotion recognition system based on the proposed three-layer model outperforms the conventional system.
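The bottom-up estimation path described above can be sketched as a two-stage cascade: acoustic features are first mapped to semantic primitives, which are then mapped to emotion dimensions. The paper uses a fuzzy inference system (FIS) for each stage; in the minimal sketch below, a simple linear map stands in for each FIS, and all weights and feature values are illustrative assumptions, not values from the paper.

```python
# Sketch of the three-layer cascade (bottom-up estimation path).
# A linear map is used here as a stand-in for each FIS stage,
# purely to illustrate the data flow between the three layers.

def linear_map(weights, bias, inputs):
    """Stand-in for one FIS: weighted sum of inputs plus a bias term."""
    return [
        b + sum(w * x for w, x in zip(row, inputs))
        for row, b in zip(weights, bias)
    ]

def estimate_dimensions(acoustic_features, w1, b1, w2, b2):
    """Two-stage bottom-up estimation:
    acoustic features -> semantic primitives -> emotion dimensions."""
    primitives = linear_map(w1, b1, acoustic_features)   # middle layer
    dimensions = linear_map(w2, b2, primitives)          # top layer
    return dimensions  # e.g. [valence, activation, dominance]

# Toy example (hypothetical numbers): 4 acoustic features ->
# 3 semantic primitives -> 3 emotion dimensions.
features = [0.2, -0.1, 0.5, 0.3]
w1 = [[0.5, 0.1, 0.0, 0.2], [0.0, 0.3, 0.4, 0.1], [0.2, 0.0, 0.1, 0.5]]
b1 = [0.0, 0.1, -0.1]
w2 = [[0.6, 0.2, 0.1], [0.1, 0.7, 0.0], [0.3, 0.1, 0.4]]
b2 = [0.0, 0.0, 0.0]
print(estimate_dimensions(features, w1, b1, w2, b2))
```

Replacing each `linear_map` with a trained FIS (or any nonlinear regressor) recovers the structure of the proposed system: the middle layer decouples the hard feature-to-valence mapping into two easier mappings.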
