Visual Recognition of Spoken Words Using Optical Flow

NAKAMURA Ryota, AKAMATSU Shigeru

doi:10.11371/wiieej.08-05.0_49

Bibliographic Information

Other Title

動画像のOptical Flowを用いた発声単語認識システム (literally, "Spoken word recognition system using the optical flow of video images")
ドウガゾウノ Optical Flow オモチイタハッセイタンゴニンシキシステム

Description

This paper describes an automatic vision-based spoken word recognition system that uses, instead of an audio signal, a visual motion signal obtained from video of the region around the mouth during speech. Motion information for each pixel in the input image sequence was obtained by computing optical flow, and feature values representing the spatial configuration of the pixel-wise velocities were extracted for each frame. The start and end times of each spoken word were determined from these velocity feature values, and a high-dimensional feature vector was constructed to represent the time variation of the velocity distribution over the utterance. As a preliminary evaluation of the proposed feature for spoken word recognition, a discrimination test on five spoken words, including A-RI-GA-TO-U and KO-N-NI-CHI-WA, was conducted, and fairly promising results were achieved.
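
The segmentation and feature-vector steps in the description can be illustrated with a minimal sketch. This record does not give the actual rule the authors used, so everything here is an assumption: the per-frame "motion energy" (for example, mean optical-flow magnitude over the mouth region), the fixed threshold, the linear resampling to a fixed number of time steps, and all function names (`find_utterance`, `resample`, `word_vector`) are hypothetical. The optical-flow computation itself is assumed to have been done beforehand.

```python
from typing import List, Optional, Tuple


def find_utterance(energy: List[float], threshold: float,
                   min_frames: int = 2) -> Optional[Tuple[int, int]]:
    """Return (start, end) frame indices, inclusive, spanning the first to the
    last frame whose motion energy exceeds the threshold.

    Assumption: a simple fixed threshold; the paper's criterion is not stated.
    """
    active = [i for i, e in enumerate(energy) if e > threshold]
    if len(active) < min_frames:
        return None
    return active[0], active[-1]


def resample(features: List[List[float]], n_steps: int) -> List[float]:
    """Linearly interpolate per-frame feature rows onto n_steps evenly spaced
    time points and flatten them into a single vector."""
    if not features:
        raise ValueError("features must contain at least one frame")
    last = len(features) - 1
    out: List[float] = []
    for k in range(n_steps):
        pos = last * k / (n_steps - 1) if n_steps > 1 else 0.0
        lo = int(pos)
        hi = min(lo + 1, last)
        w = pos - lo
        out.extend((1 - w) * a + w * b
                   for a, b in zip(features[lo], features[hi]))
    return out


def word_vector(frame_features: List[List[float]], energy: List[float],
                threshold: float, n_steps: int) -> Optional[List[float]]:
    """Cut the utterance out of the sequence and build a fixed-length vector
    describing how the per-frame velocity features change over time."""
    span = find_utterance(energy, threshold)
    if span is None:
        return None
    start, end = span
    return resample(frame_features[start:end + 1], n_steps)


if __name__ == "__main__":
    # Toy data: 7 frames, 2 hypothetical velocity features per frame.
    energy = [0.0, 0.1, 0.9, 1.2, 0.8, 0.1, 0.0]
    frames = [[float(i), 2.0 * i] for i in range(7)]
    print(find_utterance(energy, 0.5))
    print(word_vector(frames, energy, 0.5, 5))
```

Resampling to a fixed number of time steps is one simple way to make utterances of different durations comparable; it is a stand-in chosen for this sketch and not necessarily what the authors did.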

Journal

Reports of the Technical Conference of the Institute of Image Electronics Engineers of Japan

Reports of the Technical Conference of the Institute of Image Electronics Engineers of Japan 08-05 (0), 49-55, 2009

The Institute of Image Electronics Engineers of Japan

Details

CRID: 1390282680574565760

NII Article ID: 130005442089

NII Book ID: AN00348041

DOI: 10.11371/wiieej.08-05.0_49

ISSN: 27589218; 02853957

NDL BIB ID: 10211494

Web Site: http://id.ndl.go.jp/bib/10211494; https://ndlsearch.ndl.go.jp/books/R000000004-I10211494

Article Type: journal article

Data Source

JaLC
NDL Search
CiNii Articles
KAKEN

Abstract License Flag: Disallowed
