Paweł Cichosz
Institute of Electronics Fundamentals
Warsaw University of Technology
Nowowiejska 15/19, 00-665 Warsaw, Poland
cichosz@ipe.pw.edu.pl
http://www.ipe.pw.edu.pl/~cichosz
Combining reinforcement learning algorithms with function approximators
in order to generalize over the state space has recently received particular
interest and is widely believed to be one of the crucial issues for scaling
reinforcement learning to practically interesting domains. This paper
examines the combination of the TTD procedure, a computationally efficient
approximate implementation of TD(λ) methods, with CMAC, a function
approximator especially suitable for reinforcement learning due to its
computational efficiency and on-line learning capability. Most previous
studies have investigated the combination of CMAC either with TD(0)-based
algorithms, which usually learn much more slowly than with λ > 0, or with
the traditional implementation of TD(λ) based on eligibility traces,
which incurs high computational costs. This study, by combining CMAC with
TTD, attempts to reconcile fast learning with computational efficiency and
generalization capabilities. The experimental results presented show that
the Q-learning algorithm implemented using the TTD procedure and CMAC
performs successfully in two tasks with continuous state spaces.