-
- MUKAI Natsumi
- Advanced Course in Electronic and Mechanical Engineering, Ishikawa National College of Technology
-
- KITAGUCHI Sunao
- Department of Electronics and Information Engineering, Ishikawa National College of Technology
-
- ARAI Takayuki
- Sophia University
Bibliographic Information
- Other Title
-
- ミュージカル映画中の音声区間検出 (Voice activity detection in musical films)
- Myūjikaru eigachū no onsei kukan kenshutsu
Description
The process of detecting the portions of a film that contain utterances, which is essential for captioning, is at present generally carried out manually by translators. Automatic voice activity detection (VAD) in films therefore requires methods that are robust to irrelevant sounds such as background music. This paper proposes a new feature for automatic VAD. The proposed method uses the spectral gradient in the high-frequency band (4-6 kHz) and the standard deviation of the modulation-filtered cepstrum. For the evaluation experiments, we used a portion (about 23 minutes) of an English-language musical film. The proposed method achieved a 22.6% reduction in total error rate compared with the conventional method based on short-time energy.
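The two features named in the abstract can be illustrated with a minimal NumPy sketch. The abstract gives no frame length, hop size, cepstral order, or modulation pass band, so every such value below (`frame_len`, `hop`, `n_ceps`, `mod_lo`, `mod_hi`) is an assumption, as are all function names; this is not the authors' implementation. The spectral gradient is taken here as the least-squares slope of the log-magnitude spectrum over 4-6 kHz, and the modulation filter is a simple FFT-mask band-pass applied along time to each cepstral coefficient.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def log_spectrum(frames):
    """Log-magnitude spectrum of each Hann-windowed frame."""
    win = np.hanning(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * win, axis=1))
    return np.log(mag + 1e-10)  # small offset avoids log(0)

def high_band_gradient(log_spec, sr, frame_len, lo=4000.0, hi=6000.0):
    """Least-squares slope (per kHz) of the log spectrum within [lo, hi] Hz."""
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    band = (freqs >= lo) & (freqs <= hi)
    f = freqs[band] / 1000.0          # frequency axis in kHz
    f_c = f - f.mean()                # centered regressor
    y = log_spec[:, band]
    y_c = y - y.mean(axis=1, keepdims=True)
    return (y_c @ f_c) / (f_c @ f_c)

def modulation_filtered_cepstrum_std(log_spec, frame_rate, n_ceps=13,
                                     mod_lo=1.0, mod_hi=16.0):
    """Std. dev. across cepstral coefficients after band-pass filtering
    each coefficient's trajectory over time (assumed band: 1-16 Hz)."""
    # Real cepstrum: inverse FFT of the log-magnitude spectrum; drop c0.
    ceps = np.fft.irfft(log_spec, axis=1)[:, 1:n_ceps + 1]
    n_frames = ceps.shape[0]
    traj = np.fft.rfft(ceps, axis=0)
    mod_freqs = np.fft.rfftfreq(n_frames, d=1.0 / frame_rate)
    mask = (mod_freqs >= mod_lo) & (mod_freqs <= mod_hi)
    filtered = np.fft.irfft(traj * mask[:, None], n=n_frames, axis=0)
    return filtered.std(axis=1)

if __name__ == "__main__":
    sr, frame_len, hop = 16000, 512, 160   # assumed analysis settings
    x = np.random.default_rng(0).standard_normal(sr)  # 1 s of noise as a stand-in
    spec = log_spectrum(frame_signal(x, frame_len, hop))
    grad = high_band_gradient(spec, sr, frame_len)
    cstd = modulation_filtered_cepstrum_std(spec, sr / hop)
    print(grad.shape, cstd.shape)
```

The sampling rate must be at least 12 kHz for the 4-6 kHz band to exist below the Nyquist frequency. A detector would then threshold or classify the two per-frame features; how the paper combines them is not stated in the abstract.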
Journal
-
- National Institute of Technology, Ishikawa College Bulletin
-
National Institute of Technology, Ishikawa College Bulletin 39 (0), 51-56, 2007
National Institute of Technology, Ishikawa College
Details
-
- CRID
- 1390282679592824832
-
- NII Article ID
- 110006407980
-
- NII Book ID
- AN00014363
-
- ISSN
- 2424-2152
- 0286-6110
-
- NDL BIB ID
- 8898754
-
- Text Lang
- ja
-
- Data Source
-
- JaLC
- NDL
- CiNii Articles
-
- Abstract License Flag
- Disallowed