Supervised Determined Source Separation with Multichannel Variational Autoencoder

  • Hirokazu Kameoka
    Nippon Telegraph and Telephone Corporation, Kanagawa, 243-0198, Japan
  • Li Li
    University of Tsukuba, Ibaraki, 305-8577, Japan
  • Shota Inoue
    University of Tsukuba, Ibaraki, 305-8577, Japan
  • Shoji Makino
    University of Tsukuba, Ibaraki, 305-8577, Japan

Abstract

This letter proposes a multichannel source separation technique, the multichannel variational autoencoder (MVAE) method, which uses a conditional VAE (CVAE) to model and estimate the power spectrograms of the sources in a mixture. By training the CVAE on the spectrograms of training examples with source-class labels, the trained decoder distribution can serve as a universal generative model capable of generating spectrograms conditioned on a specified class index. Treating the latent space variables and the class index as the unknown parameters of this generative model yields a convergence-guaranteed algorithm for supervised determined source separation, which iteratively estimates the power spectrograms of the underlying sources as well as the separation matrices. In experimental evaluations, the MVAE method produced better separation performance than a baseline method.
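The alternation between source power spectrograms and separation matrices can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not name the update rule, so the sketch uses the iterative projection update that is common in determined separation under a local Gaussian source model; the array `V` merely stands in for the power spectrograms that the trained CVAE decoder would supply; and the function names `neg_log_likelihood` and `ip_update` are hypothetical.

```python
import numpy as np


def neg_log_likelihood(W, X, V):
    """Objective (up to a constant) under an assumed local Gaussian model.

    W: (F, J, J) separation matrices, one per frequency bin.
    X: (F, N, J) complex mixture STFT (N frames, J channels).
    V: (F, N, J) positive source power spectrograms (placeholder for
       what a trained decoder would generate).
    """
    Y = np.einsum("fjk,fnk->fnj", W, X)        # separated signals y = W x
    n_frames = X.shape[1]
    _, logabsdet = np.linalg.slogdet(W)        # log|det W(f)| per bin
    return np.sum(np.log(V) + np.abs(Y) ** 2 / V) - 2 * n_frames * np.sum(logabsdet)


def ip_update(W, X, V):
    """One sweep of the (assumed) iterative projection update over all sources."""
    n_freq, n_frames, n_src = X.shape
    W = W.copy()
    for j in range(n_src):
        # Variance-weighted spatial covariance of the mixture for source j.
        Vj = np.einsum("fn,fnk,fnl->fkl", 1.0 / V[:, :, j], X, X.conj()) / n_frames
        # Solve (W Vj) w = e_j for the j-th demixing filter in every bin.
        e = np.zeros((n_freq, n_src, 1), dtype=complex)
        e[:, j, 0] = 1.0
        w = np.linalg.solve(W @ Vj, e)[:, :, 0]
        # Normalise so that w^H Vj w = 1.
        scale = np.sqrt(np.real(np.einsum("fk,fkl,fl->f", w.conj(), Vj, w)))
        w = w / scale[:, None]
        W[:, j, :] = w.conj()                  # rows of W hold w_j^H
    return W
```

In a full method of the kind the abstract describes, each sweep of `ip_update` would alternate with a step that refines `V` by adjusting the latent variables and class index of the generative model. With `V` held fixed, as here, each sweep should not increase the objective, which is the property a convergence guarantee of this type rests on.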
