DanSto - Japanese Dataset of Short Stories for Evaluating Context Understanding

RAFAL Rzepka, KACPER Dudzic, ARISA Abe, KENJI Araki

doi:10.11517/jsaisigtwo.2023.agi-026_32

<p>This paper introduces a dataset comprising more than 8,000 manually crafted short stories in the Japanese language. The primary objective of this dataset is to address the dearth of comparable data in Japanese. Additionally, the dataset provides alternative endings for the narratives, collected through crowdsourcing, which remain plausible but are marginally less probable than the original ones. This approach contributes to the creation of a testing benchmark that poses heightened challenges for contemporary large language models, particularly when contrasted with analogous benchmarks in English, where conclusions are typically dichotomized into correct and incorrect endings. The dataset is further expanded through automated manipulation of subjects and objects, and the study evaluates the performance of popular models across three key tasks: a) predicting story endings, b) substituting antonyms, and c) swapping nouns. Preliminary experiments show that zero-shot GPT-4 capabilities are relatively high, especially in recognizing sentences with swapped nouns (94% accuracy), while open-source Japanese LLMs struggle with processing the proposed stories.</p>

