Non-negative matrix factorization using i-vector-based utterance similarity and its application to speaker clustering

福地 佑介, 俵 直弘, 小川 哲司, 小林 哲則

We propose a speaker clustering method that integrates a highly accurate speaker representation with a clustering algorithm. Conventional speaker clustering methods lose accuracy as the amount of data, that is, the number of utterances, increases. Accurate clustering under such conditions requires a speaker model that is robust to acoustic variability and an efficient method for clustering utterances with that model. The proposed method therefore represents speakers with i-vectors, which have achieved high accuracy in speaker verification, and clusters utterances with an efficient method based on non-negative matrix factorization. Speaker clustering experiments on CSJ data show that the proposed method outperforms the conventional methods and remains robust as the amount of utterance data changes.
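The abstract does not give the details of the factorization, so the following is only a minimal sketch of the general idea named in the title: build a non-negative utterance similarity matrix from per-utterance vectors and factorize it to obtain cluster labels. Everything specific here is an assumption made for illustration, not the authors' method: the function names `cosine_similarity_matrix` and `nmf_cluster` are hypothetical, cosine similarity clipped at zero stands in for the i-vector-based similarity, Lee-Seung multiplicative updates stand in for the paper's "efficient" NMF, and the number of clusters is assumed to be known.

```python
import numpy as np


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between rows, clipped to be non-negative.

    Assumption: each row stands in for the i-vector of one utterance.
    """
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return np.clip(unit @ unit.T, 0.0, None)


def nmf_cluster(similarity, n_clusters, n_iter=500, seed=0, eps=1e-9):
    """Approximate similarity ~= W @ H and label each utterance.

    Uses Lee-Seung multiplicative updates for the Frobenius objective
    (an assumed, generic NMF solver, not the one from the paper).
    """
    rng = np.random.default_rng(seed)
    n = similarity.shape[0]
    W = rng.random((n, n_clusters)) + eps
    H = rng.random((n_clusters, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ similarity) / (W.T @ W @ H + eps)
        W *= (similarity @ H.T) / (W @ H @ H.T + eps)
    # Remove the scale ambiguity between W and H before comparing components:
    # move each column sum of W into the matching row of H.
    H_scaled = H * W.sum(axis=0)[:, None]
    # Each utterance (column) gets the component with the largest coefficient.
    return H_scaled.argmax(axis=0)


if __name__ == "__main__":
    # Synthetic stand-in data: two "speakers", 15 utterances each.
    rng = np.random.default_rng(1)
    centers = np.eye(20)[:2]
    vectors = np.vstack([
        centers[0] + 0.05 * rng.standard_normal((15, 20)),
        centers[1] + 0.05 * rng.standard_normal((15, 20)),
    ])
    labels = nmf_cluster(cosine_similarity_matrix(vectors), n_clusters=2)
    print(labels)
```

The label indices themselves are arbitrary; only the grouping of utterances is meaningful. Real i-vectors would come from a separately trained extractor, which this sketch does not cover.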
