Dynamic on-line clustering and state extraction: An approach to symbolic learning

Although recurrent neural nets have been moderately successful in learning to emulate finite-state machines (FSMs), the continuous internal state dynamics of a neural net are not well matched to the discrete behavior of an FSM. We describe an architecture, called DOLCE, that allows discrete states to evolve in a net as learning progresses. DOLCE consists of a standard recurrent neural net trained by gradient descent and an adaptive clustering technique that quantizes the state space. We describe two implementations of DOLCE. The first model, called DOLCE-u, uses an adaptive clustering scheme in an unsupervised mode to determine both the number of clusters and the partitioning of the state space as learning progresses. The second model, called DOLCE-s, uses a Gaussian mixture model in a supervised learning framework to infer the states of an FSM. DOLCE-s is based on the assumption that a finite set of discrete internal states is required for the task, and that the actual network state belongs to this set but has been corrupted by noise due to inaccuracy in the weights. DOLCE-s learns to recover the discrete state with maximum a posteriori probability from the noisy state. Simulations show that both implementations of DOLCE lead to a significant improvement in generalization performance over earlier neural net approaches to FSM induction. The idea of adaptive quantization is applicable not just to FSM induction but to other domains in which one has reason to believe that the internal network state should be discrete.
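The maximum a posteriori recovery step in DOLCE-s can be illustrated with a minimal sketch. Assuming isotropic Gaussian noise around each cluster center, the MAP discrete state maximizes the log mixture prior minus the scaled squared distance to the noisy continuous state. The function name, cluster centers, priors, and noise scale below are illustrative choices, not parameters from the paper.

```python
# Hedged sketch of MAP state recovery under an isotropic Gaussian
# mixture: snap a noisy continuous network state to the most probable
# discrete cluster center. Illustrative values only.
import numpy as np

def map_state(h, means, priors, sigma):
    """Return the index and center of the most probable discrete state.

    h      : noisy continuous state vector, shape (D,)
    means  : (K, D) cluster centers (the candidate discrete states)
    priors : (K,) mixture weights
    sigma  : assumed isotropic noise standard deviation
    """
    # Log posterior up to a constant: log prior - ||h - mu_k||^2 / (2 sigma^2)
    sq_dist = np.sum((means - h) ** 2, axis=1)
    log_post = np.log(priors) - sq_dist / (2.0 * sigma ** 2)
    k = int(np.argmax(log_post))
    return k, means[k]

# Toy usage: two candidate discrete states in a 2-D state space.
means = np.array([[0.0, 0.0], [1.0, 1.0]])
priors = np.array([0.5, 0.5])
k, center = map_state(np.array([0.9, 0.8]), means, priors, sigma=0.3)
# The noisy state snaps to the nearer cluster center (index 1).
```

In DOLCE-s this quantized state would be fed back as the recurrent state, so the net's dynamics become effectively discrete as training progresses.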
