Dynamic on-line clustering and state extraction: An approach to symbolic learning
Although recurrent neural nets have been moderately successful in learning to
emulate finite-state machines (FSMs), the continuous internal state dynamics
of a neural net are not well matched to the discrete behavior of an FSM. We
describe an architecture, called DOLCE, that allows discrete states to evolve
in a net as learning progresses. DOLCE consists of a standard recurrent
neural net trained by gradient descent and an adaptive clustering technique
that quantizes the state space. We describe two implementations of DOLCE.
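As a rough illustration of the architecture, the sketch below runs a plain recurrent cell and snaps its hidden state to the nearest cluster centroid after each step. The names and the fixed nearest-centroid rule are illustrative assumptions; DOLCE's clustering is adaptive and evolves as learning progresses rather than using fixed centroids.

```python
import numpy as np

def rnn_step(W_hh, W_xh, h, x):
    # One step of a plain recurrent cell (illustrative; not DOLCE's exact net).
    return np.tanh(W_hh @ h + W_xh @ x)

def quantize(h, centroids):
    # Snap the continuous hidden state to the nearest cluster centroid.
    distances = np.linalg.norm(centroids - h, axis=1)
    return centroids[np.argmin(distances)]

def run(W_hh, W_xh, centroids, inputs, h0):
    # Forward pass with state quantization after every step, so the
    # quantized states can play the role of FSM states.
    h = h0
    for x in inputs:
        h = quantize(rnn_step(W_hh, W_xh, h, x), centroids)
    return h
```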
The first model, called DOLCE-u, uses an adaptive clustering scheme in an
unsupervised mode to determine both the number of clusters and the
partitioning of the state space as learning progresses. The second model,
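One simple way to grow a set of clusters on-line, loosely in the spirit of DOLCE-u's unsupervised mode, is a leader-clustering heuristic: start a new cluster whenever a state falls farther than a threshold from every existing centroid. The threshold rule below is an assumption for illustration, not the paper's actual criterion.

```python
import numpy as np

def online_cluster(states, threshold):
    # Grow centroids on-line; the number of clusters is determined by the
    # data rather than fixed in advance (a leader-clustering heuristic).
    centroids, counts = [], []
    for s in states:
        if centroids:
            d = [np.linalg.norm(c - s) for c in centroids]
            i = int(np.argmin(d))
            if d[i] <= threshold:
                counts[i] += 1
                # Update the running mean of the matched cluster.
                centroids[i] += (s - centroids[i]) / counts[i]
                continue
        centroids.append(s.astype(float))
        counts.append(1)
    return centroids
```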
called DOLCE-s, uses a Gaussian mixture model in a supervised learning
framework to infer the states of an FSM. DOLCE-s is based on the assumption
that a finite set of discrete internal states is required for the task, and
that the actual network state belongs to this set but has been corrupted by
noise due to inaccuracy in the weights. DOLCE-s learns to recover the
discrete state with maximum a posteriori probability from the noisy state.
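A minimal sketch of that MAP recovery step, assuming the mixture parameters (component weights, means, and a shared isotropic noise variance) have already been fit; this parameterization is an assumption, not the paper's exact model.

```python
import numpy as np

def map_state(h, means, weights, sigma):
    # Posterior p(k | h) is proportional to weights[k] * N(h; means[k], sigma^2 I).
    # Work in log space; normalization does not affect the argmax.
    sq_dist = np.sum((means - h) ** 2, axis=1)
    log_post = np.log(weights) - sq_dist / (2.0 * sigma ** 2)
    # Return the mean of the most probable component as the recovered
    # discrete state.
    return means[np.argmax(log_post)]
```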
Simulations show that both implementations of DOLCE lead to a significant
improvement in generalization performance over earlier neural net approaches
to FSM induction. The idea of adaptive quantization is applicable not just to
FSM induction but to other domains in which one has reason to believe that
the internal network state should be discrete.