Bibliographic Information

Other Title
  • HMM-Based Audio-visual Speech Synthesis: Pixel-based Approach
  • Speech Synthesis/Conversion and Its Applications

This paper describes a technique for text-to-audio-visual speech synthesis based on hidden Markov models (HMMs), in which lip image sequences are modeled with a pixel-based approach. To reduce the dimensionality of the visual speech feature space, we obtain a set of orthogonal vectors (eigenlips) by principal component analysis (PCA) and use a subset of the PCA coefficients, together with their dynamic features, as visual speech parameters. Auditory and visual speech parameters are modeled by separate HMMs, and lip movements are synchronized with the auditory speech by using its phoneme boundaries when synthesizing the lip image sequences. We confirmed that the generated auditory speech and lip image sequences are realistic and naturally synchronized.
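The eigenlips step the abstract describes — projecting flattened lip images onto a small set of orthogonal PCA basis vectors and appending dynamic (delta) features — can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function names and the use of `np.gradient` for deltas are assumptions for the sketch.

```python
import numpy as np

def eigenlips(images, k):
    """Hypothetical sketch: PCA over flattened lip images.
    Returns the mean image, the top-k orthogonal basis vectors
    ("eigenlips"), and the per-frame PCA coefficients."""
    X = images.reshape(len(images), -1).astype(float)
    mu = X.mean(axis=0)
    Xc = X - mu
    # SVD of the centered data matrix; rows of Vt are the
    # orthonormal principal components (eigenlips).
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]          # (k, n_pixels) eigenlips
    coeffs = Xc @ components.T   # (n_frames, k) visual speech parameters
    return mu, components, coeffs

def add_deltas(coeffs):
    """Append simple frame-to-frame dynamic features, as is common
    for HMM speech parameter vectors (delta computation assumed)."""
    delta = np.gradient(coeffs, axis=0)
    return np.concatenate([coeffs, delta], axis=1)
```

A synthesized coefficient trajectory can then be mapped back to images via `mu + coeffs @ components`, which is how a pixel-based approach reconstructs lip frames from the low-dimensional parameters.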


Citations (3)

