DDS: A new device-degraded speech dataset for speech enhancement

Li, Haoyu, Yamagishi, Junichi

doi:10.48550/arxiv.2109.07931

Open access

Bibliographic information

Publication date: 2022-09-18

DOI

10.21437/interspeech.2022-441
10.48550/arxiv.2109.07931

Publisher: ISCA

Description

A large and growing amount of speech content in real-life scenarios is recorded on consumer-grade devices in uncontrolled environments, resulting in degraded speech quality. Transforming such low-quality, device-degraded speech into high-quality speech is a goal of speech enhancement (SE). This paper introduces a new speech dataset, DDS, to facilitate research on SE. DDS provides aligned parallel recordings of high-quality speech (recorded in professional studios) and a number of versions of low-quality speech, producing approximately 2,000 hours of speech data. The DDS dataset covers 27 realistic recording conditions by combining diverse acoustic environments and microphone devices, and each version of a condition consists of multiple recordings from six microphone positions to simulate different noise and reverberation levels. We also test several SE baseline systems on the DDS dataset and show the impact of recording diversity on performance.

Submitted to Interspeech 2022

Published in

Interspeech 2022

Interspeech 2022, pp. 2913-2917, 2022-09-18

ISCA

Details

CRID

1872553967739853312
DOI

10.21437/interspeech.2022-441

10.48550/arxiv.2109.07931
Data source type
- OpenAIRE
