A probabilistic speaker clustering for DOA-based diarization

Shoko Araki, Katsuhiko Ishiguro, Takeshi Yamada, Tomohiro Nakatani

doi:10.1109/aspaa.2009.5346517

Description

We present a probabilistic model for speaker clustering and diarization. Speaker diarization determines “who spoke when” in a recorded conversation among an unknown number of people. We formulate this problem as clustering sequential auditory features generated by an unknown number of latent mixture components (speakers). We employ a probabilistic model that automatically estimates the number of speakers and their time-varying proportions. Experiments with synthetic and real sound recordings confirm that the proposed model successfully infers the number and features of the speakers and achieves better diarization results than conventional models.

Journal

2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 241-244, 2009-10-01

IEEE

Details

CRID

1871709542568860032

DOI

10.1109/aspaa.2009.5346517

Data Source

OpenAIRE
