GIRL: Reward Function Learning Framework Independent of Text Generator Samples for Reinforcement Learning in Text Generation Tasks
- TOYAMA Joji, The University of Tokyo
- SUZUKI Masahiro, The University of Tokyo
- OCHIAI Keiichi, The University of Tokyo
- MATSUO Yutaka, The University of Tokyo
Bibliographic Information
- Other Title: 文書生成タスクに対する強化学習応用における文書生成器のサンプルに非依存な報酬関数学習フレームワークの提案
Abstract
Reinforcement learning is known to be an effective method for text generation tasks. Previous studies have attempted to learn from data using samples from the text generator. This paper addresses a problem caused by that dependence on the generator's samples: the progress of the generator's training cannot be visualized quantitatively. We propose a framework called Generator-independent Reward Learning (GIRL), which uses no generator samples while learning the reward function. We confirmed that a method based on this framework can quantitatively visualize the learning of the text generator and outperform baseline methods.
Journal
- 電子情報通信学会論文誌D 情報・システム
電子情報通信学会論文誌D 情報・システム J107-D (5), 348-358, 2024-05-01
The Institute of Electronics, Information and Communication Engineers
Details
- CRID: 1390581412163991424
- ISSN: 18810225, 18804535
- Text Lang: ja
- Data Source: JaLC
- Abstract License Flag: Disallowed