GIRL: Reward Function Learning Framework Independent of Text Generator Samples for Reinforcement Learning in Text Generation Tasks

TOYAMA Joji, SUZUKI Masahiro, OCHIAI Keiichi, MATSUO Yutaka

doi:10.14923/transinfj.2023dep0009

Bibliographic Information

Other Title

文書生成タスクに対する強化学習応用における文書生成器のサンプルに非依存な報酬関数学習フレームワークの提案

Abstract

Reinforcement learning is known to be an effective method for text generation tasks. Previous research has attempted to learn the reward function from data using samples from the text generator. This paper addresses a problem caused by that dependence on the generator's samples: the progress of the generator's training cannot be visualized quantitatively. We propose a framework called Generator-independent Reward Learning (GIRL), which does not use any generator samples while learning the reward function. We confirmed that our method based on this framework can quantitatively visualize the learning of the text generator and surpasses the performance of baseline methods.

Journal

電子情報通信学会論文誌D 情報・システム (IEICE Transactions on Information and Systems, Japanese Edition) J107-D (5), 348-358, 2024-05-01

The Institute of Electronics, Information and Communication Engineers

Details

CRID: 1390581412163991424

DOI: 10.14923/transinfj.2023dep0009

ISSN: 1881-0225; 1880-4535

Text Lang: ja

Data Source

JaLC

Abstract License Flag: Disallowed
