A probabilistic speaker clustering for DOA-based diarization

Description

We present a probabilistic model for speaker clustering and diarization. Speaker diarization determines “who spoke when” in a recorded conversation among an unknown number of people. We formulate this problem as clustering sequential auditory features generated by an unknown number of latent mixture components (speakers). The model automatically estimates the number of speakers and the time-varying speaker proportions. Experiments with synthetic and real sound recordings confirm that the proposed model successfully infers the number and features of the speakers and achieves better speaker diarization results than conventional models.
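
To make the idea of clustering with an unknown number of speakers concrete, here is a minimal sketch. It is not the probabilistic model described above: it uses DP-means, a simple hard-assignment stand-in that opens a new cluster whenever a frame lies too far from every existing one, and it ignores the time-varying speaker proportions. The function names, the 20-degree threshold, and the assumption that each frame is a single direction-of-arrival (DOA) angle in degrees are all illustrative assumptions.

```python
import math


def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def circular_mean(angles):
    """Mean direction of a list of angles in degrees, mapped to [0, 360)."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c)) % 360.0


def dp_means_doa(doas, threshold=20.0, n_iter=10):
    """Cluster per-frame DOA angles without fixing the number of clusters.

    Hypothetical helper: `threshold` (degrees) controls how far a frame may
    be from every current center before a new cluster is opened.
    Returns (labels, centers); labels[t] is the cluster index of frame t.
    """
    centers = [doas[0]]
    labels = [0] * len(doas)
    for _ in range(n_iter):
        # Assignment step: nearest center, or a new cluster if too far away.
        for t, x in enumerate(doas):
            dists = [angular_diff(x, c) for c in centers]
            k = min(range(len(centers)), key=dists.__getitem__)
            if dists[k] > threshold:
                centers.append(x)
                k = len(centers) - 1
            labels[t] = k
        # Update step: move each non-empty cluster to its circular mean.
        for k in range(len(centers)):
            members = [x for x, lab in zip(doas, labels) if lab == k]
            if members:
                centers[k] = circular_mean(members)
    return labels, centers


if __name__ == "__main__":
    # Toy frame sequence with three talkers near 30, 120 and 240 degrees.
    frames = [30, 32, 29, 120, 118, 31, 122, 240, 238, 121]
    labels, centers = dp_means_doa(frames)
    print(labels)            # per-frame cluster index ("who spoke when")
    print(len(set(labels)))  # estimated number of speakers
```

The per-frame label sequence is the diarization output, and the number of distinct labels is the estimated speaker count. A fully probabilistic treatment would replace the hard threshold with posterior inference over cluster assignments.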
