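Near-/Far-Field Source Separation via Time-Frequency Mask Estimation Using Near-Field Source Extraction Based on Spherical Harmonic Expansion (球面調和関数展開に基づく近接音抽出を用いた時間-周波数マスク推定による近接／遠方音分離)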

西口 草太

doi:10.15002/00022730

Description

We propose combining physical-model-based and deep-learning (DL)-based source separation to separate near- and far-field sources. The DL-based separation method uses acoustic features derived from spherical harmonic analysis. Deep learning is a state-of-the-art technique for source separation. In this approach, a bidirectional long short-term memory (BLSTM) network predicts a time-frequency (T-F) mask. To predict a T-F mask accurately, the acoustic features must have high mutual information with the oracle T-F mask. In this study, low-frequency-band near- and far-field sources are estimated by spherical harmonic analysis and used as acoustic features. A DNN then predicts a T-F mask that separates the sources across all frequency bands. Our experimental results show that the proposed method improved the signal-to-distortion ratio (SDR) by 8-10 dB compared to the harmonic-analysis-based method. In addition, the proposed method improved the perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) scores compared to the conventional DL-based T-F mask estimation method.
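As a rough illustration of what a T-F mask does, the sketch below builds an oracle mask from known near- and far-field spectrograms and applies it to their mixture. It is not the method of the paper: the abstract does not say which mask type is used, so the ideal ratio mask here is an assumption, the spectrograms are random placeholders rather than real recordings, and the function names `ideal_ratio_mask` and `apply_mask` are hypothetical. In the proposed method a BLSTM would predict the mask instead of computing it from the ground truth.

```python
import numpy as np


def ideal_ratio_mask(near_mag, far_mag, eps=1e-8):
    """Oracle ratio mask for the near-field source (assumed mask type).

    Each T-F bin gets the share of magnitude belonging to the near-field
    source, so values lie in [0, 1].
    """
    return near_mag / (near_mag + far_mag + eps)


def apply_mask(mixture_stft, mask):
    """Scale each T-F bin of the complex mixture spectrogram by the mask."""
    return mask * mixture_stft


# Placeholder complex spectrograms (frequency bins x time frames).
rng = np.random.default_rng(0)
n_freq, n_frames = 257, 100
shape = (n_freq, n_frames)
near = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
far = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
mixture = near + far

mask = ideal_ratio_mask(np.abs(near), np.abs(far))
near_est = apply_mask(mixture, mask)        # near-field estimate
far_est = apply_mask(mixture, 1.0 - mask)   # far-field estimate
```

Because the two masks sum to one in every bin, the two estimates add back up to the mixture; the quality of the separation depends entirely on how well the mask matches the true source balance in each bin.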

Published in

法政大学大学院紀要. 情報科学研究科編, vol. 15, pp. 1-6, 2020-03-24

法政大学大学院情報科学研究科

Details

CRID: 1390290699808131072

NII Article ID: 120006897135

NII Book ID: AA12746425

DOI: 10.15002/00022730

HANDLE: 10114/00022730

ISSN: 2432-1192

Text language code: ja

Data source type

JaLC
IRDB
CiNii Articles

Abstract license flag: Allowed
