Real-time Voice Adaptation with Abstract Normalization and Sound-indexed Based Search

MIDTLYNG Mads Alexander

doi:10.15002/00012917

This paper proposes a two-step system for real-time voice adaptation in the field of speech processing. The first step records and pre-processes speech to form a voice profile. The second step takes real-time voice input and adapts it to a target voice. Because the structure of individual voices varies, this paper suggests a method for converting them into a comparable format. The new method, called abstract normalization, cuts the voice data into smaller sounds. From these sounds, an abstracted, simplified version of the data is generated using a level of abstraction together with parameter fitting. The normalized data is used to generate a sound-index, a sequence hash that represents the current object in a simpler form. The indices are used to compare different sounds and voices for adaptation. This effectively transforms the speech-related challenges into a search problem rather than a biometric one. To assess the approach, voice profile data are compared against each other as a way of verifying the sound-index. Lastly, real-time voice input using alternating levels of abstraction is run against a voice profile created with Norwegian words. The degree of adaptation success is measured as a percentage, and the experimental results show that, while accuracy is not yet excellent, the concept is validated.
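
The pipeline described above (cut into sounds, normalize at a level of abstraction, hash into a sound-index, then search a profile) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the algorithm's details, so the fixed-size cutting, the peak scaling with uniform quantization, the use of SHA-1 as the sequence hash, and all function and parameter names (`cut_into_sounds`, `abstract_normalize`, `sound_index`, `build_profile`, `match_rate`, `size`, `levels`) are assumptions made for illustration. The parameter fitting mentioned in the abstract is not reproduced here.

```python
import hashlib
import math


def cut_into_sounds(samples, size):
    """Cut a sample sequence into consecutive fixed-size chunks ("sounds").

    Assumption: fixed-size cutting; a trailing partial chunk is dropped.
    """
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def abstract_normalize(sound, levels):
    """Scale a chunk by its own peak and quantize it onto a coarse integer grid.

    `levels` stands in for the level of abstraction: a smaller value
    gives a coarser, simpler representation (an assumed interpretation).
    """
    peak = max(abs(x) for x in sound) or 1.0  # avoid dividing by zero on silence
    half = levels // 2
    return tuple(round(x / peak * half) for x in sound)


def sound_index(sound, levels):
    """Hash the normalized sequence into a short hexadecimal key."""
    normalized = abstract_normalize(sound, levels)
    return hashlib.sha1(repr(normalized).encode("utf-8")).hexdigest()[:12]


def build_profile(samples, size, levels):
    """Map each sound-index to the chunk positions where it occurs."""
    profile = {}
    for pos, sound in enumerate(cut_into_sounds(samples, size)):
        profile.setdefault(sound_index(sound, levels), []).append(pos)
    return profile


def match_rate(samples, profile, size, levels):
    """Percentage of input chunks whose sound-index is found in the profile."""
    sounds = cut_into_sounds(samples, size)
    hits = sum(1 for s in sounds if sound_index(s, levels) in profile)
    return 100.0 * hits / len(sounds) if sounds else 0.0


if __name__ == "__main__":
    # Synthetic stand-in for recorded speech: a pure tone, 64 samples long.
    tone = [math.sin(2 * math.pi * i / 16) for i in range(64)]
    profile = build_profile(tone, size=16, levels=8)

    # The same tone at twice the amplitude, as if spoken louder.
    louder = [2.0 * x for x in tone]
    print(match_rate(louder, profile, size=16, levels=8))
```

Because each chunk is scaled by its own peak before quantization, the louder copy of the tone produces the same indices as the profile, so the comparison reduces to dictionary lookups. That is the sense in which the problem becomes a search problem in this sketch.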
