Real-time Voice Adaptation with Abstract Normalization and Sound-indexed Based Search

Description

This paper proposes a two-step system for real-time voice adaptation in the field of speech processing. The first step records and pre-processes speech to form a voice profile. The second step takes real-time voice input and adapts it to a target voice. Because the structure of individual voices varies, this paper suggests a method for converting them into a comparable format. The new method, called abstract normalization, cuts the voice data into smaller sounds. From these sounds, an abstracted, simplified version of the data is generated using a level of abstraction together with parameter fitting. The normalized data is used to generate a sound-index, which consists of a sequence hash that represents the current object in a simpler fashion. The indices are used to compare different sounds or voices for adaptation. This effectively transforms the speech-related challenges into a search problem rather than a biometric one. To assess the approach, voice profile data are compared against each other as a way to verify the sound-index. Lastly, a real-time voice input using alternating levels of abstraction is run against a voice profile created with Norwegian words. The degree of adaptation success is measured as a percentage, and the experimental results show that, although accuracy is not yet excellent, the concept was validated.
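The pipeline described in the abstract (cut the data into sounds, abstract them, hash them into an index, compare indices) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the paper's method: the function names `segment`, `abstract`, `sound_index` and `match_percentage` are hypothetical, fixed-size chunking stands in for however the paper cuts sounds, averaging slices stands in for the paper's parameter fitting, and truncated SHA-1 digests stand in for its sequence hash.

```python
import hashlib
import math


def segment(samples, size):
    """Cut samples into consecutive chunks of `size`; a trailing partial chunk is dropped."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def abstract(chunk, level):
    """Summarize a chunk as `level` coarse integers (assumed stand-in for parameter fitting).

    Each of `level` equal slices is averaged, scaled by 10 and rounded, so small
    differences between recordings collapse onto the same value.
    """
    if level < 1 or level > len(chunk):
        raise ValueError("level must be between 1 and the chunk length")
    step = len(chunk) // level
    return tuple(round(10 * sum(chunk[k * step:(k + 1) * step]) / step) for k in range(level))


def sound_index(samples, size=8, level=2):
    """Build a sequence of short hashes, one per abstracted chunk (hypothetical index format)."""
    return [
        hashlib.sha1(repr(abstract(chunk, level)).encode()).hexdigest()[:8]
        for chunk in segment(samples, size)
    ]


def match_percentage(index_a, index_b):
    """Percentage of positions at which two indices hold the same hash."""
    n = min(len(index_a), len(index_b))
    if n == 0:
        return 0.0
    hits = sum(a == b for a, b in zip(index_a, index_b))
    return 100.0 * hits / n


# Synthetic signals used only to exercise the sketch; they are not data from the paper.
tone = [math.sin(2 * math.pi * i / 8) for i in range(64)]
shifted = [s + 0.001 for s in tone]                           # nearly identical signal
slower = [math.sin(2 * math.pi * i / 16) for i in range(64)]  # different signal

print(match_percentage(sound_index(tone), sound_index(shifted)))
print(match_percentage(sound_index(tone), sound_index(slower)))
```

In this sketch, a lower `level` gives a coarser abstraction: more inputs map to the same hash, which makes matching more tolerant but less discriminating. Comparing indices is then an exact-match lookup on hashes, which is the sense in which the abstract frames adaptation as a search problem.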

Details

  • CRID
    1390572174784298240
  • NII Article ID
    120005754420
  • NII Bibliographic ID
    AA12746425
  • DOI
    10.15002/00012917
  • HANDLE
    10114/12250
  • ISSN
    24321192
  • Text language code
    en
  • Resource type
    departmental bulletin paper
  • Data source type
    • JaLC
    • IRDB
    • CiNii Articles
  • Abstract license flag
    Use permitted
