Description
We propose combining physical-model-based and deep-learning (DL)-based methods for near- and far-field source separation. The DL-based separation uses acoustic features derived from spherical harmonic analysis. Deep learning is a state-of-the-art technique for source separation. In this approach, a bidirectional long short-term memory (BLSTM) network predicts a time-frequency (T-F) mask. To predict the T-F mask accurately, the acoustic features must have high mutual information with the oracle T-F mask. In this study, the low-frequency-band near- and far-field sources are estimated by spherical harmonic analysis and used as acoustic features. A DNN then predicts a T-F mask that separates the sources across all frequency bands. Our experimental results show that the proposed method improved the signal-to-distortion ratio (SDR) by 8-10 dB compared to the harmonic-analysis-based method. In addition, the proposed method improved PESQ and STOI compared to the conventional DL-based T-F mask estimation method.
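The role of a T-F mask can be illustrated with a minimal sketch. The abstract does not state which mask type the authors use, so the ideal ratio mask below is an assumption, and the function names `ideal_ratio_mask` and `apply_mask` are hypothetical. In the paper the mask is predicted by a BLSTM; here an oracle mask is computed directly from known toy magnitudes to show what such a network would be trained to approximate.

```python
import numpy as np

def ideal_ratio_mask(target_mag, interferer_mag, eps=1e-8):
    """Oracle T-F mask in [0, 1] (assumed mask type, not confirmed by the abstract).

    Each bin holds the fraction of magnitude that belongs to the target source.
    """
    return target_mag / (target_mag + interferer_mag + eps)

def apply_mask(mixture_stft, mask):
    """Element-wise masking of a complex mixture spectrogram."""
    return mask * mixture_stft

# Toy magnitudes: 3 frequency bins x 2 time frames (illustrative values only).
near = np.array([[1.0, 0.0], [2.0, 1.0], [0.0, 3.0]])
far = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 0.0]])

# Assume the two sources add in phase, so the mixture magnitude is the sum.
mixture = (near + far).astype(complex)

mask = ideal_ratio_mask(near, far)
near_estimate = apply_mask(mixture, mask)
```

With real signals the mixture would come from a short-time Fourier transform and the sources would not add in phase, so a ratio mask recovers the target only approximately.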
Published in
法政大学大学院紀要. 情報科学研究科編, vol. 15, pp. 1-6, 2020-03-24
法政大学大学院情報科学研究科
Details
- CRID: 1390290699808131072
- NII Article ID: 120006897135
- NII Bibliographic ID: AA12746425
- HANDLE: 10114/00022730
- ISSN: 24321192
- Text language code: ja
- Data source type: JaLC, IRDB, CiNii Articles
- Abstract license flag: Allowed