Blind source separation by multilayer neural network classifiers for spectrogram analysis

  • SHIRAISHI Toshihiko
    Graduate School of Environment and Information Sciences, Yokohama National University
  • DOURA Tomoki
    Graduate School of Environment and Information Sciences, Yokohama National University

Abstract

This paper describes a novel method for blind source separation using multilayer neural networks when an audio signal has been recorded in a room with reverberation or with moving signal sources. In conventional practice, speech-recognition specialists can identify the signal of a specific speaker in a recording of many speakers by analyzing a spectrogram of the recording; the spectrogram is a visual representation of the time series of frequency spectra of a target signal. To apply multilayer neural networks to a similar classification task, the proposed method first prepares a spectrogram of the mixed signal using the short-time Fourier transform and treats it as a visual object. The spectrogram is then divided into small time-frequency segments, and each segment is classified by the multilayer neural networks into the class of the corresponding signal source. Finally, an inverse short-time Fourier transform is applied to extract the separated signals. The paper also evaluates the separation performance of this classification algorithm. By transforming the blind source separation problem into a classification problem, the method can use multilayer neural network classifiers, which require no information about the mixing environment, no statistical characteristics of the target signals, and no multiple microphones. Simulated tests indicate that the proposed method achieves good separation performance under conditions with reverberation or moving signal sources. The proposed method may be adapted for separating signals from unknown convolutive mixtures and time-varying systems.
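The pipeline described in the abstract (STFT → segment-wise classification → source masks → inverse STFT) can be sketched as below. This is only an illustration of the structure, not the paper's implementation: the paper's network architecture, training procedure, and segment sizes are not given here, so the `classify_segment` callable, the segment dimensions `seg_f`/`seg_t`, and the toy frequency-threshold "classifier" in the demonstration are all assumptions standing in for the trained multilayer neural networks.

```python
# Sketch of the classification-based separation pipeline, assuming a
# pluggable per-segment classifier in place of the trained networks.
import numpy as np
from scipy.signal import stft, istft

def separate(mixture, fs, classify_segment, n_sources,
             nperseg=256, seg_f=8, seg_t=4):
    """Separate a mixed signal by classifying spectrogram segments."""
    # 1. Spectrogram of the mixed signal via the short-time Fourier transform.
    f, t, Z = stft(mixture, fs=fs, nperseg=nperseg)
    masks = [np.zeros(Z.shape) for _ in range(n_sources)]
    # 2. Divide the spectrogram into small time-frequency segments and
    #    assign each segment to one source class.
    for fi in range(0, Z.shape[0], seg_f):
        for ti in range(0, Z.shape[1], seg_t):
            patch = np.abs(Z[fi:fi + seg_f, ti:ti + seg_t])
            k = classify_segment(patch, fi)  # class index of the source
            masks[k][fi:fi + seg_f, ti:ti + seg_t] = 1.0
    # 3. Mask the spectrogram and apply the inverse STFT per source.
    return [istft(Z * m, fs=fs, nperseg=nperseg)[1] for m in masks]

# Toy demonstration: two tones stand in for two sources, and a simple
# frequency-threshold rule stands in for the trained classifier.
fs = 8000
tt = np.arange(fs) / fs
s1 = np.sin(2 * np.pi * 200 * tt)    # "source 0": low tone
s2 = np.sin(2 * np.pi * 2000 * tt)   # "source 1": high tone
mix = s1 + s2

# Placeholder classifier: frequency bins below ~1 kHz -> class 0, else 1.
clf = lambda patch, fi: 0 if fi < 32 else 1
est = separate(mix, fs, clf, n_sources=2)
```

Because classification produces a binary time-frequency mask per source, the separation needs only the single-channel mixture itself, which is why the approach requires neither multiple microphones nor a model of the mixing environment.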
