Bakker, B. (2002). Reinforcement Learning with Long Short-Term Memory.
In T. G. Dietterich, S. Becker, and Z. Ghahramani (Eds.), Advances in Neural Information Processing Systems 14
(pp. 1475-1482). Cambridge, MA: MIT Press.
Available formats: Postscript, Zipped Postscript, PDF.
Abstract:
This paper presents reinforcement learning with a Long Short-Term Memory
recurrent neural network: RL-LSTM. Model-free RL-LSTM using Advantage(λ)
learning and directed exploration can solve non-Markovian tasks with long-term
dependencies between relevant events. This is demonstrated in a T-maze task,
as well as in a difficult variation of the pole balancing task.
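To make the idea of a non-Markovian task with a long-term dependency concrete, here is a minimal sketch of a T-maze-style environment. It is an illustration only, not the setup used in the paper: the class name TMaze, the observation strings, the corridor length, and the reward values (+1.0 and -1.0) are all assumptions chosen for readability.

```python
import random


class TMaze:
    """Toy T-maze (hypothetical layout and rewards, for illustration only).

    A cue visible only at the start position tells which arm of the
    T-junction holds the goal. The agent must remember it while walking
    down the corridor.
    """

    def __init__(self, corridor_length=5, seed=None):
        self.n = corridor_length          # assumed to be >= 1
        self.rng = random.Random(seed)
        self.pos = 0
        self.goal = "north"

    def reset(self):
        """Start a new episode and return the first observation (the cue)."""
        self.pos = 0
        self.goal = self.rng.choice(["north", "south"])
        return self._observe()

    def _observe(self):
        if self.pos == 0:
            return "cue-" + self.goal     # the only informative observation
        if self.pos == self.n:
            return "junction"             # identical for both goal positions
        return "corridor"

    def step(self, action):
        """Apply an action; return (observation, reward, done)."""
        if self.pos < self.n:
            # In the corridor only "east" moves the agent; other actions do nothing.
            if action == "east":
                self.pos += 1
            return self._observe(), 0.0, False
        # At the junction the choice of arm ends the episode.
        if action in ("north", "south"):
            reward = 1.0 if action == self.goal else -1.0   # assumed values
            return "end", reward, True
        return self._observe(), 0.0, False


if __name__ == "__main__":
    env = TMaze(corridor_length=3, seed=0)
    cue = env.reset()
    remembered = cue.split("-")[1]        # an agent with memory keeps the cue
    obs, reward, done = cue, 0.0, False
    while not done:
        action = remembered if obs == "junction" else "east"
        obs, reward, done = env.step(action)
    print(cue, reward)
```

In this sketch the observation at the junction is the same whichever arm holds the goal, so a policy that depends only on the current observation cannot do better than chance there. The longer the corridor, the longer the cue has to be carried in memory, which is the kind of dependency a recurrent network such as an LSTM is meant to capture.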