Fast Reinforcement Learning of Dialogue Policies Using Stable Function Approximation

Description

We propose a method to speed up reinforcement learning of policies for spoken dialogue systems. The method combines a coarse-grained abstract representation of states and actions with learning restricted to frequently visited states. The values of unsampled states are approximated by linear interpolation of the values of known states. Experiments show that the proposed method effectively optimizes dialogue strategies for frequently visited dialogue states.
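
The interpolation idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how states are represented, so the sketch assumes each abstract state maps to a single numeric coordinate, and the names `frequently_visited`, `interpolate_value`, and `min_visits` are hypothetical. States visited at least `min_visits` times keep their learned value; any other state is estimated from its two nearest known neighbours.

```python
from bisect import bisect_left


def frequently_visited(values, visits, min_visits):
    """Keep only states whose visit count reaches min_visits (assumed threshold)."""
    return {s: v for s, v in values.items() if visits.get(s, 0) >= min_visits}


def interpolate_value(known, x):
    """Estimate the value at coordinate x from known {coordinate: value} entries.

    Uses linear interpolation between the two nearest known coordinates and
    falls back to the nearest known value outside the sampled range.
    """
    if not known:
        raise ValueError("need at least one sampled state")
    if x in known:
        return known[x]
    xs = sorted(known)
    i = bisect_left(xs, x)
    if i == 0:            # left of every sampled state
        return known[xs[0]]
    if i == len(xs):      # right of every sampled state
        return known[xs[-1]]
    x0, x1 = xs[i - 1], xs[i]
    w = (x - x0) / (x1 - x0)  # interpolation weight in (0, 1)
    return (1 - w) * known[x0] + w * known[x1]


# Hypothetical learned values and visit counts for four abstract states.
values = {0.0: 1.0, 1.0: 9.0, 2.0: 3.0, 4.0: 2.0}
visits = {0.0: 50, 1.0: 2, 2.0: 40, 4.0: 30}

known = frequently_visited(values, visits, min_visits=10)
estimate = interpolate_value(known, 1.0)  # state 1.0 is rarely visited
```

In this toy setup the rarely visited state at coordinate 1.0 ignores its own noisy estimate and takes a value halfway between its neighbours at 0.0 and 2.0. A real dialogue system would need a multi-dimensional state representation and a matching interpolation scheme, which the abstract does not describe.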
