The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions
- Sepp Hochreiter
- Institut für Informatik, Technische Universität München, München, D-80290, Germany
Description
Recurrent nets are in principle capable of storing past inputs to produce the currently desired output. Because of this property, recurrent nets are used in time series prediction and process control. Practical applications involve temporal dependencies spanning many time steps, e.g. between relevant inputs and desired outputs. In this case, however, gradient-based learning methods take too much time. The extremely increased learning time arises because the error vanishes as it gets propagated back. In this article the decaying error flow is theoretically analyzed. Then methods trying to overcome vanishing gradients are briefly discussed. Finally, experiments comparing conventional algorithms and alternative methods are presented. With advanced methods, long time lag problems can be solved in reasonable time.
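The abstract summarizes the vanishing-gradient argument: the backpropagated error is multiplied at every time step by the recurrent weight matrix and the activation derivatives, so its norm shrinks roughly geometrically with the length of the time lag. The following minimal NumPy sketch (not code from the paper; the network size, weight scale, and input scale are illustrative assumptions) makes this decay visible for a plain tanh RNN.

```python
# Illustrative sketch of error decay during backpropagation through time.
# All sizes and scales below are assumptions chosen only for demonstration.
import numpy as np

rng = np.random.default_rng(0)
n = 20                                   # hidden units (assumed)
T = 50                                   # sequence length (assumed)
W = rng.normal(scale=0.4 / np.sqrt(n), size=(n, n))  # recurrent weights with small spectral norm

# Forward pass: record pre-activations so we can reuse them in the backward pass.
xs = rng.normal(size=(T, n)) * 0.1       # small random inputs (assumed)
h = np.zeros(n)
pre_activations = []
for x in xs:
    a = W @ h + x
    pre_activations.append(a)
    h = np.tanh(a)

# Backward pass: inject an error at the last step and propagate it back in time.
delta = rng.normal(size=n)
norms = []
for a in reversed(pre_activations):
    delta = W.T @ (delta * (1.0 - np.tanh(a) ** 2))  # chain rule through one time step
    norms.append(np.linalg.norm(delta))

# The norm shrinks roughly geometrically with the number of backward steps.
print("error norm after 1, 10, 50 backward steps:", norms[0], norms[9], norms[-1])
```

With recurrent weights whose spectral norm is below one and activation derivatives at most one, each backward step contracts the error vector, which is why gradient-based learning struggles with dependencies spanning many time steps.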
Journal
- International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 06 (02), 107-116, 1998-04
- World Scientific Pub Co Pte Ltd
Details
- CRID: 1360855569653404544
- ISSN: 1793-6411, 0218-4885
- Data Source: Crossref