Multiple index combination for Japanese spoken term detection with optimum index selection based on OOV-region classifier
Description
In this paper, a novel index combination method for spoken term detection is proposed. In our method, outputs from four different recognizers (word, syllable, word-syllable, and fragment recognizers) are combined into one confusion network. A novel index-selection method is then applied to the combined index to limit the resulting growth in index size. Two selection methods are proposed: (1) arc selection and (2) unit selection, both of which are based on an OOV-region classifier score. Experiments on 39 hours of Japanese lecture recordings showed that index selection reduced the index size of the best confusion network by 22% while maintaining its high accuracy. Compared with the best phoneme-based index from a single recognizer, the proposed method achieved relative error reductions of 25.0% and 14.8% for in-vocabulary (IV) and out-of-vocabulary (OOV) queries, respectively, without increasing the index size.
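The abstract does not describe how arc selection works in detail, so the following is only a minimal sketch of one plausible reading: each confusion-network slot carries an OOV-region classifier score, and arcs of the unit type less suited to that region are pruned. The `Arc` and `Slot` classes, the two unit labels, the 0.5 threshold, the fallback rule, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Arc:
    unit: str         # hypothetical unit label: "word" or "subword"
    label: str        # recognized token
    posterior: float  # arc posterior probability


@dataclass
class Slot:
    arcs: List[Arc]
    oov_score: float  # assumed OOV-region classifier output in [0, 1]


def select_arcs(network: List[Slot], threshold: float = 0.5) -> List[Slot]:
    """Keep subword arcs in likely-OOV slots and word arcs elsewhere.

    The threshold and the fallback rule are illustrative assumptions.
    """
    selected = []
    for slot in network:
        keep_unit = "subword" if slot.oov_score >= threshold else "word"
        kept = [arc for arc in slot.arcs if arc.unit == keep_unit]
        # If the preferred unit type is absent, keep the slot unchanged
        # so that no region of the audio is dropped from the index.
        selected.append(Slot(kept or list(slot.arcs), slot.oov_score))
    return selected


def index_size(network: List[Slot]) -> int:
    """Measure index size as the total number of arcs (an assumption)."""
    return sum(len(slot.arcs) for slot in network)


# Toy confusion network with made-up labels and scores.
network = [
    Slot([Arc("word", "speech", 0.7), Arc("subword", "s u p i: ch i", 0.3)], 0.1),
    Slot([Arc("word", "term", 0.4), Arc("subword", "t a: m u", 0.6)], 0.9),
    Slot([Arc("word", "detection", 0.8)], 0.8),
]

pruned = select_arcs(network)
reduction = 1 - index_size(pruned) / index_size(network)
print(f"arcs before: {index_size(network)}, after: {index_size(pruned)}")
print(f"relative index-size reduction: {reduction:.0%}")
```

The reduction printed for this toy network has no relation to the 22% figure reported in the abstract; it only shows how such a ratio could be computed once a size measure is chosen.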
Published in
- 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 8540-8544, 2013-05
IEEE
Details
- CRID: 1360567185175934720
- Resource type: journal article
- Data source type: Crossref, KAKEN, OpenAIRE