Spectral Modification for Voice Gender Conversion using Temporal Decomposition
Abstract
In most state-of-the-art voice gender conversion systems, the converted speech still sounds unnatural, mainly because the converted spectra are not smooth enough across frames and the spectral modification is ineffective. In this paper, we present a new method for voice gender conversion using a speech analysis technique called temporal decomposition (TD). TD is used to model spectral evolution effectively. Instead of modifying speech spectra frame by frame, we only need to modify event targets and event functions, and the smoothness of the converted speech is ensured by the shape of the event functions. To overcome the ineffective spectral modification, we explore Gaussian mixture model (GMM) parameter sets as the input to TD to flexibly model the spectral envelope, and develop a new method of modifying GMM parameters in accordance with formant scaling factors. For transforming fundamental frequencies, our system is based on STRAIGHT, a very high-quality vocoder. Experimental results show that the quality of the speech converted by the proposed method is significantly improved.
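The abstract's central idea, modifying event targets instead of individual frames, can be illustrated with the usual TD model, in which each frame's spectral parameter vector is approximated as a weighted sum of event targets, with the event functions supplying the weights. The sketch below is only an illustration under stated assumptions and is not the paper's implementation. The triangular event functions, the uniform scaling used as the "modification", and all function and variable names are hypothetical. The GMM-based formant scaling and the STRAIGHT processing described in the abstract are not reproduced.

```python
def triangular_event_functions(event_frames, num_frames):
    """Assumed event-function shape: each function is 1 at its own event
    frame and falls linearly to 0 at the neighbouring event frames, so
    adjacent functions overlap and sum to 1 at every frame."""
    functions = []
    last = len(event_frames) - 1
    for k, centre in enumerate(event_frames):
        left = event_frames[k - 1] if k > 0 else None
        right = event_frames[k + 1] if k < last else None
        phi = []
        for n in range(num_frames):
            if n == centre:
                value = 1.0
            elif n < centre:
                # Rising edge; the first event holds at 1 before its centre.
                value = 1.0 if left is None else max(0.0, (n - left) / (centre - left))
            else:
                # Falling edge; the last event holds at 1 after its centre.
                value = 1.0 if right is None else max(0.0, (right - n) / (right - centre))
            phi.append(value)
        functions.append(phi)
    return functions


def reconstruct(targets, functions):
    """TD synthesis: frame[n][i] = sum over k of targets[k][i] * functions[k][n]."""
    num_frames = len(functions[0])
    dim = len(targets[0])
    return [
        [sum(targets[k][i] * functions[k][n] for k in range(len(targets)))
         for i in range(dim)]
        for n in range(num_frames)
    ]


def scale_targets(targets, factor):
    """Placeholder modification (an assumption): scale every target uniformly."""
    return [[factor * value for value in target] for target in targets]


# Three events over 21 frames, with toy 2-dimensional spectral parameters.
event_frames = [0, 10, 20]
targets = [[1.0, 2.0], [3.0, 1.0], [2.0, 4.0]]
functions = triangular_event_functions(event_frames, num_frames=21)

original = reconstruct(targets, functions)
converted = reconstruct(scale_targets(targets, 1.2), functions)
```

Only the three target vectors are changed here, yet all 21 frames of `converted` follow from them through the same event functions. This is why the frame-to-frame smoothness is determined by the shape of the event functions and not by the modification itself.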
identifier:https://dspace.jaist.ac.jp/dspace/handle/10119/4888
Published in
- Journal of Signal Processing 11 (4), 333-336, 2007-07
- Research Institute of Signal Processing Japan (信号処理学会)
Details
- CRID: 1050282812513890816
- NII Article ID: 120000861659
- NII Bibliographic ID: AA11147833
- ISSN: 13426230
- NDL Bibliographic ID: 8918499
- Text language code: en
- Material type: journal article
- Data source type: IRDB, NDL, CiNii Articles