Speech-Section Extraction Using Lip Movement and Voice Information in Japanese
- Nakamura Etsuro (Graduate School of Engineering Science, Akita University)
- Kageyama Yoichi (Graduate School of Engineering Science, Akita University)
- Hirose Satoshi (Japan Business Systems, Inc.)
Abstract
In recent years, several Japanese companies have attempted to improve the efficiency of their meetings, which has been a significant challenge. For instance, voice recognition technology is used to streamline the creation of meeting minutes. In an automatic minutes-creating system, identifying the speaker and adding speaker information to the text would substantially improve the overall efficiency of the process. A few companies and research groups have therefore proposed speaker estimation methods; however, these methods pose challenges, such as requiring advance preparation, special equipment, and multiple microphones. These problems can be solved by using speech sections that are extracted from lip movements and voice information. When a person speaks, voice and lip movements occur simultaneously, so the speaker's speech section can be extracted from videos by using both cues. When the speech section is extracted from voice information alone, the voiceprint information of each meeting participant is required for speaker identification. When lip movements are used, the speech section and the speaker position can be extracted without voiceprint information. Therefore, in this study, we propose a speech-section extraction method that uses image and voice information in Japanese for speaker identification. The proposed method consists of three processes: i) the extraction of speech frames using lip movements, ii) the extraction of speech frames using voices, and iii) the classification of speech sections using these extraction results. We used video data to evaluate the functionality of the method and compared it with conventional methods based on state-of-the-art techniques. The average F-measure of the proposed method was higher than that of the conventional methods. The evaluation results showed that the proposed method achieves state-of-the-art performance with a simpler process than the conventional methods.
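The three processes and the F-measure evaluation can be illustrated with a minimal Python sketch. It assumes that the lip-based and voice-based steps each yield one boolean flag per video frame, and that a frame counts as speech only when both cues agree. The abstract does not state the actual classification rule, so the function names and the agreement rule below are hypothetical, not the authors' implementation.

```python
from typing import List, Tuple


def combine_frames(lip: List[bool], voice: List[bool]) -> List[bool]:
    """Mark a frame as speech only when both cues agree (an assumed rule)."""
    if len(lip) != len(voice):
        raise ValueError("lip and voice flags must cover the same frames")
    return [l and v for l, v in zip(lip, voice)]


def frames_to_sections(flags: List[bool]) -> List[Tuple[int, int]]:
    """Group consecutive speech frames into (start, end) pairs, end inclusive."""
    sections: List[Tuple[int, int]] = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i                      # a new section opens here
        elif not flag and start is not None:
            sections.append((start, i - 1))  # the section closed on the previous frame
            start = None
    if start is not None:                  # section still open at the last frame
        sections.append((start, len(flags) - 1))
    return sections


def f_measure(predicted: List[bool], truth: List[bool]) -> float:
    """Frame-level F-measure: harmonic mean of precision and recall."""
    tp = sum(p and t for p, t in zip(predicted, truth))
    fp = sum(p and not t for p, t in zip(predicted, truth))
    fn = sum(t and not p for p, t in zip(predicted, truth))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy per-frame flags (made-up values, for illustration only).
lip_flags = [False, True, True, True, False, True, True, False]
voice_flags = [False, False, True, True, True, True, True, False]
ground_truth = [False, True, True, True, False, True, True, False]

speech_flags = combine_frames(lip_flags, voice_flags)
speech_sections = frames_to_sections(speech_flags)
score = f_measure(speech_flags, ground_truth)
```

Requiring agreement between the two cues rejects frames where the lips move silently or where a voice is heard without visible lip movement, at the cost of missing frames where either detector fails.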
Published in
- Journal of Advanced Computational Intelligence and Intelligent Informatics 27 (1), 54-63, 2023-01-20
- Publisher: Fuji Technology Press Ltd. (富士技術出版株式会社)
Details
- CRID: 1390857777804746368
- NII Bibliographic ID: AA12042502
- ISSN: 18838014, 13430130
- NDL Bibliographic ID: 032607454
- Text language code: en
- Data source type: JaLC, NDL, Crossref
- Abstract license flag: Use not permitted