Blind and Neural Network-Guided Convolutional Beamformer for Joint Denoising, Dereverberation, and Source Separation
- Tomohiro Nakatani
- NTT Corporation, Japan
- Keisuke Kinoshita
- NTT Corporation, Japan
- Rintaro Ikeshita
- NTT Corporation, Japan
- Shoko Araki
- NTT Corporation, Japan
- Hiroshi Sawada
- NTT Corporation, Japan
Bibliographic Information
- Publication date
- 2021-06-06
- Rights information
- https://doi.org/10.15223/policy-029
- https://doi.org/10.15223/policy-037
- DOI
- 10.1109/icassp39728.2021.9414264
- 10.48550/arxiv.2108.01836
- Publisher
- IEEE
Description
This paper proposes an approach for optimizing a Convolutional BeamFormer (CBF) that can jointly perform denoising (DN), dereverberation (DR), and source separation (SS). First, we develop a blind CBF optimization algorithm that requires no prior information on the sources or the room acoustics, by extending a conventional joint DR and SS method. To make the optimization computationally tractable, we incorporate two techniques into the approach: the Source-Wise Factorization (SW-Fact) of a CBF and Independent Vector Extraction (IVE). To further improve the performance, we develop a method that integrates neural network (NN)-based estimation of source power spectra with the CBF optimization via an inverse-Gamma prior. Experiments using noisy reverberant mixtures show that the proposed method, in both the blind and the NN-guided scenarios, greatly outperforms the conventional state-of-the-art NN-supported mask-based CBF in terms of improvement in automatic speech recognition performance and reduction of signal distortion.
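As a rough illustration of what a convolutional beamformer computes, the sketch below applies a multi-tap, multi-channel filter to STFT frames at a single frequency bin: the output at frame t combines the current frame with frames delayed by at least a prediction delay. This assumes the common STFT-domain formulation of a CBF; the abstract itself gives no formula, and it does not cover how the filter is optimized (SW-Fact, IVE, or the inverse-Gamma prior). The function name `apply_cbf` and the parameters `w0`, `w_past`, and `delay` are hypothetical names chosen for this example.

```python
import numpy as np

def apply_cbf(x, w0, w_past, delay):
    """Apply a convolutional beamformer at one frequency bin (illustrative sketch).

    x      : (T, M) array of STFT frames, T frames by M microphones
    w0     : (M,) filter applied to the current frame
    w_past : (L, M) filters applied to frames delayed by delay, delay+1, ...
    delay  : prediction delay in frames (assumed >= 1)
    Returns y : (T,) with y[t] = w0^H x[t] + sum_k w_past[k]^H x[t - delay - k]
    """
    x = np.asarray(x, dtype=complex)
    w0 = np.asarray(w0, dtype=complex)
    w_past = np.asarray(w_past, dtype=complex)
    T = x.shape[0]

    # Instantaneous (spatial) part: w0^H x[t] for every frame t.
    y = x @ w0.conj()

    # Convolutional part: add contributions from delayed frames.
    for k in range(w_past.shape[0]):
        tau = delay + k
        if tau >= T:
            break
        y[tau:] += x[:T - tau] @ w_past[k].conj()
    return y
```

For example, with two microphones, all-ones input, `w0 = [1, 0]`, a single past tap `w_past = [[1, 0]]`, and `delay = 2`, the first two output frames contain only the current-frame term and later frames also include the delayed term.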
Accepted by IEEE ICASSP 2021
Published in
- ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 6129-6133, 2021-06-06
IEEE
Keywords
- Signal Processing (eess.SP)
- FOS: Computer and information sciences
- Sound (cs.SD)
- Audio and Speech Processing (eess.AS)
- FOS: Electrical engineering, electronic engineering, information engineering
- Electrical Engineering and Systems Science - Signal Processing
- Computer Science - Sound
- Electrical Engineering and Systems Science - Audio and Speech Processing
Details
- CRID
- 1360022497271473536
- Data source type
- Crossref
- OpenAIRE

