Quantifying Appropriateness of Summarization Data for Curriculum Learning
-
- Kano Ryuji
- FUJIFILM Business Innovation Corp.; School of Engineering, Tokyo Institute of Technology
-
- Taniguchi Tomoki
- FUJIFILM Business Innovation Corp.
-
- Ohkuma Tomoko
- FUJIFILM Business Innovation Corp.
Bibliographic Information
- Other Title
-
- 要約データの適切性定量化を利用したカリキュラムラーニング
Description
<p>Previous research on summarization models has treated titles as summaries of their source texts. However, many studies have reported that such training data are noisy. We propose an effective curriculum learning method for training summarization models on noisy data. Curriculum learning improves performance by sorting training data according to difficulty or noisiness, and it is effective for training models on noisy data. However, previous research has not applied curriculum learning to summarization tasks. One aim of this research is to validate the effectiveness of curriculum learning for summarization tasks. In translation tasks, previous research quantified noise using two models trained on noisy and clean corpora. Because such corpora do not exist in the summarization field, this method is difficult to apply to summarization tasks. Another aim of this research is to propose a model that can quantify noise using a single noisy corpus. The proposed model, the Appropriateness Estimator, is trained to distinguish correct source-summary pairs from randomly assigned pairs. Through this training, the model learns to compute the appropriateness of source-summary pairs. We conduct experiments on three summarization models and verify that both curriculum learning and our method improve performance.</p>
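The data construction and sorting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy word-overlap scorer (a stand-in for a trained Appropriateness Estimator), and the choice to place the most appropriate pairs first are all assumptions made for illustration.

```python
import random


def make_estimator_examples(pairs, seed=0):
    """Build binary examples for a pair classifier from one noisy corpus.

    Label 1: an original (source, summary) pair.
    Label 0: the same source with a summary taken from a different pair.
    """
    if len(pairs) < 2:
        raise ValueError("need at least two pairs to build negatives")
    rng = random.Random(seed)
    n = len(pairs)
    examples = []
    for i, (source, summary) in enumerate(pairs):
        examples.append((source, summary, 1))
        # Draw an index j != i uniformly from the other n - 1 pairs.
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1
        examples.append((source, pairs[j][1], 0))
    return examples


def toy_overlap_score(source, summary):
    """Hypothetical stand-in scorer: share of summary tokens found in the source.

    A trained estimator would replace this; it is used here only so the
    sketch runs without a model.
    """
    source_tokens = set(source.lower().split())
    summary_tokens = summary.lower().split()
    if not summary_tokens:
        return 0.0
    hits = sum(token in source_tokens for token in summary_tokens)
    return hits / len(summary_tokens)


def order_by_appropriateness(pairs, score_fn, noisy_first=False):
    """Sort pairs by score; by assumption, the most appropriate come first."""
    return sorted(pairs, key=lambda p: score_fn(p[0], p[1]),
                  reverse=not noisy_first)


if __name__ == "__main__":
    corpus = [
        ("the cat sat on the mat", "cat sat"),
        ("stocks fell sharply today", "click here now"),
        ("rain expected in tokyo tomorrow", "rain in tokyo"),
    ]
    for example in make_estimator_examples(corpus):
        print(example)
    for pair in order_by_appropriateness(corpus, toy_overlap_score):
        print(pair)
```

In this sketch the classifier's training set needs only the single noisy corpus, since negatives are produced by reassigning summaries at random. Whether cleaner or noisier pairs should be presented first is a design choice the abstract does not specify.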
Journal
-
- Journal of Natural Language Processing
-
Journal of Natural Language Processing 29 (1), 144-165, 2022
The Association for Natural Language Processing
Details
-
- CRID
- 1390291767636827904
-
- ISSN
- 21858314
- 13407619
-
- Text Lang
- ja
-
- Data Source
-
- JaLC
- Crossref
- OpenAIRE
-
- Abstract License Flag
- Disallowed