DYNAMIC AND INCREMENTAL EXPLORATION STRATEGY IN FUSION ADAPTIVE RESONANCE THEORY FOR ONLINE REINFORCEMENT LEARNING
Abstract: One of the fundamental challenges in reinforcement learning is to strike a proper balance between exploration and exploitation so as to maximize the cumulative reward in the long run. Most exploration protocols bound the overall values toward a convergent level of performance. If new knowledge is inserted or the environment suddenly changes, the issue becomes more intricate, as exploration must be reconciled with the pre-existing knowledge. This paper presents a multi-channel adaptive resonance theory (ART) neural network model called fusion ART, which serves as a fuzzy approximator for reinforcement learning with inherent features that regulate the exploration strategy. This intrinsic regulation is driven by the state of the knowledge the agent has learnt so far. The model offers stable yet incremental reinforcement learning in which prior rules can serve as bootstrap knowledge to guide the agent in selecting the right action. Experiments on obstacle-avoidance and navigation tasks demonstrate that, when the agent learns from scratch, the inherent exploration mechanism in fusion ART is comparable to the basic ε-greedy policy. Moreover, the model is shown to incorporate prior knowledge while striking a balance between exploration and exploitation.
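For reference, the ε-greedy baseline policy against which the inherent exploration mechanism is compared can be sketched as follows. This is a minimal generic illustration, not the fusion ART mechanism described in the paper; the function name and the representation of action values as a list are assumptions made for this sketch.

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """Select an action index from a list of action values.

    With probability epsilon, explore by picking a uniformly random
    action; otherwise exploit by picking the action with the highest
    estimated value.
    """
    if random.random() < epsilon:
        return random.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit
```

With epsilon fixed, the policy trades off exploration and exploitation at a constant rate; the paper's point is that fusion ART instead modulates this trade-off based on the knowledge learnt so far.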
Author: Budhitama Subagdja
Journal Code: jptkomputergg160009