Single-channel speech separation by using a sparse decomposition with periodic structure

Youji Iiguni, Hiroyuki Okumura, Makoto Nakashizuka

doi:10.1109/ispacs.2009.4806696

In this paper, we propose a single-channel speech separation method based on sparse decomposition with a periodic signal model. The method approximates a mixture of speech signals as a sum of periodic signals with time-varying amplitudes, and the decomposition is performed under a sparsity penalty. Owing to this penalty, each segment of the mixture is decomposed into periodic signals, each of which is a component of an individual speaker. To separate the speakers, the set of periodic signals is clustered with the K-means algorithm, and each cluster is then assigned to its corresponding speaker using codebooks that contain the speakers' spectral features. In experiments, the method is compared with MaxVQ, which performs separation in the frequency-spectrum domain. The results, measured by signal-to-distortion ratio (SDR), show that our method outperforms MaxVQ at a lower computational cost for the assignment of speech components.
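The cluster-to-speaker assignment step can be illustrated with a minimal sketch. The abstract does not state which spectral features or distance measure are used, so the squared Euclidean distance, the three-dimensional toy features, and the names `min_codebook_distance` and `assign_clusters` are all assumptions made for illustration, not the authors' implementation.

```python
def min_codebook_distance(feature, codebook):
    """Smallest squared Euclidean distance from a feature vector to any codeword.

    Assumption: squared Euclidean distance; the abstract does not name a metric.
    """
    return min(
        sum((f - c) ** 2 for f, c in zip(feature, codeword))
        for codeword in codebook
    )


def assign_clusters(cluster_features, codebooks):
    """Map each cluster index to the speaker whose codebook lies closest.

    cluster_features: one spectral feature vector per cluster (hypothetical).
    codebooks: dict of speaker name -> list of codeword vectors (hypothetical).
    """
    assignment = {}
    for idx, feature in enumerate(cluster_features):
        assignment[idx] = min(
            codebooks,
            key=lambda spk: min_codebook_distance(feature, codebooks[spk]),
        )
    return assignment


# Toy data, invented purely for illustration.
codebooks = {
    "speaker_a": [[1.0, 0.2, 0.1], [0.9, 0.3, 0.0]],
    "speaker_b": [[0.1, 0.8, 0.9], [0.0, 1.0, 0.7]],
}
cluster_features = [[0.95, 0.25, 0.05], [0.05, 0.9, 0.8]]

result = assign_clusters(cluster_features, codebooks)
print(result)
```

In this sketch each cluster is assigned independently, so two clusters may map to the same speaker. Only one codebook lookup per cluster is needed, rather than one per time-frequency bin, which is one plausible reading of the lower assignment cost mentioned in the abstract.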
