言語・韻律情報及び対話履歴を用いたLSTMベースのターンテイキング推定

劉 超然, 石井 カルロス, 石黒 浩

doi:10.1527/tjsai.c-i65

Bibliographic information

Alternative title

LSTM-based Turn-taking Estimation Model using Lexical/Prosodic Contents and Dialog History

Abstract

Natural conversation involves rapid exchanges of turns. Taking turns at an appropriate timing is a requisite capability for a dialog system that acts as a conversation partner. We propose a Recurrent Neural Network (RNN) based model that takes the current utterance and the dialog history as its input, classifies utterances into turn-taking related classes, and estimates the turn-taking timing. The dialog history is represented by a sequence of speaker-specified joint embeddings of lexical and prosodic contents. To this end, we trained a neural network to embed the lexical and the prosodic contents into a joint embedding space. To learn a meaningful embedding space, the prosodic feature sequence of each utterance is mapped into a fixed-dimensional space using an RNN and combined with the lexical embedding of the utterance. These joint embeddings are then shifted to different parts of the embedding space according to the speaker. Finally, the speaker-specified joint embeddings are used as the input of the proposed model. We tested this model on a spontaneous conversation dataset and confirmed that it outperformed conventional models that use lexical/prosodic features and dialog history without speaker information.
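
The pipeline described in the abstract can be illustrated with a small forward-pass sketch. This is not the authors' implementation: the layer sizes, the additive per-speaker offset used for the "shift", the three output classes, and every name (`LSTM`, `joint_embedding`, `classify_turn`, `speaker_offsets`) are assumptions made for illustration. The weights are random and untrained, and only the classification step is shown; timing estimation is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
PROSODY_DIM, LEXICAL_DIM = 3, 8   # per-frame prosodic features, utterance lexical embedding
PROSODY_HID, JOINT_DIM = 4, 6     # prosody encoder state, joint embedding
HISTORY_HID, NUM_SPEAKERS, NUM_CLASSES = 5, 2, 3


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class LSTM:
    """Minimal single-layer LSTM that returns its final hidden state."""

    def __init__(self, input_dim, hidden_dim):
        # One stacked matrix holds the input, forget, output and candidate gates.
        self.W = rng.normal(0.0, 0.1, (4 * hidden_dim, input_dim + hidden_dim))
        self.b = np.zeros(4 * hidden_dim)
        self.hidden_dim = hidden_dim

    def __call__(self, sequence):
        h = np.zeros(self.hidden_dim)
        c = np.zeros(self.hidden_dim)
        for x in sequence:
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, o, g = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
        return h


prosody_encoder = LSTM(PROSODY_DIM, PROSODY_HID)
history_lstm = LSTM(JOINT_DIM, HISTORY_HID)
W_joint = rng.normal(0.0, 0.1, (JOINT_DIM, LEXICAL_DIM + PROSODY_HID))
# Assumption: the speaker-dependent "shift" is a learned additive offset.
speaker_offsets = rng.normal(0.0, 1.0, (NUM_SPEAKERS, JOINT_DIM))
W_out = rng.normal(0.0, 0.1, (NUM_CLASSES, HISTORY_HID))


def joint_embedding(prosody_frames, lexical_vec, speaker_id):
    # Variable-length prosodic sequence -> fixed-dimensional vector.
    prosody_vec = prosody_encoder(prosody_frames)
    # Combine with the lexical embedding in a joint space.
    joint = np.tanh(W_joint @ np.concatenate([lexical_vec, prosody_vec]))
    # Move the embedding to a speaker-dependent region of the space.
    return joint + speaker_offsets[speaker_id]


def classify_turn(dialog):
    # dialog: list of (prosody_frames, lexical_vec, speaker_id), oldest first;
    # the last item is the current utterance.
    embeddings = [joint_embedding(*utterance) for utterance in dialog]
    h = history_lstm(embeddings)
    logits = W_out @ h
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()


# Three utterances with different lengths, alternating between two speakers.
dialog = [
    (rng.normal(size=(12, PROSODY_DIM)), rng.normal(size=LEXICAL_DIM), 0),
    (rng.normal(size=(7, PROSODY_DIM)), rng.normal(size=LEXICAL_DIM), 1),
    (rng.normal(size=(20, PROSODY_DIM)), rng.normal(size=LEXICAL_DIM), 0),
]
probs = classify_turn(dialog)
print(probs)  # probabilities over the hypothetical turn-taking classes
```

Keeping the speaker identity inside the embedding, rather than as a separate input, means the history model sees a single vector per utterance. This is one plausible reading of "speaker-specified joint embedding"; the paper itself should be consulted for the actual formulation.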

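Published in

Transactions of the Japanese Society for Artificial Intelligence (人工知能学会論文誌)

Transactions of the Japanese Society for Artificial Intelligence 34 (2), C-I65_1-9, 2019-03-01

The Japanese Society for Artificial Intelligence (一般社団法人人工知能学会)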

Details

CRID: 1390845713055337984

NII Article ID: 130007606513

DOI: 10.1527/tjsai.c-i65

ISSN: 13468030; 13460714

Web Site: https://www.jstage.jst.go.jp/article/tjsai/34/2/34_C-I65/_pdf

Text language code: ja

Data source type

JaLC
Crossref
CiNii Articles

Abstract license flag: Disallowed