State-denoised recurrent neural nets
The human brain can be described as a collection of processing pathways---some
perceptual, some motor, and some performing abstract cognitive operations. General
human intelligence arises by virtue of the fact that pathways can be flexibly
configured and interconnected in accordance with task demands. A popular
computational perspective on human consciousness (Dehaene, Lau, & Kouider,
2017) treats consciousness as a type of blackboard enabling communication
among pathways. As a step toward AI systems with human flexibility,
this project proposes a method to ensure that modular pathways can
communicate with one another by requiring that they speak a common language,
i.e., the output from one pathway is expressed in a form that other
pathways have previously learned to process.
Consider two pathways in cascade; call them A and B. When A
produces an output similar to the inputs in B's past training
history, B is likely to produce the desired behavior. But when A's
output is noisy, that noise propagates to B, with possible ill consequences.
We explore the idea that information flow can be enhanced by introducing
a regularizer that guides A's output toward a form that B can interpret.
Although A and B might be separately trained modules, they could
equally well be the lower and upper halves of a deep net, or time steps
t and t+1 of an unfolded recurrent net used to recognize or generate
sequences. In our initial experiments, we have focused on the latter, noting
that a poorly trained RNN is susceptible to noise, whether in the input
or in the hidden state, because noise can be amplified as it propagates
across the sequence.
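The sketch below is a small NumPy illustration of this point, not code from the project: a tiny perturbation to the hidden state of a randomly initialized tanh RNN tends to grow rather than decay as the sequence unfolds. The network size, weight scale, and perturbation magnitude are arbitrary choices made for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hidden, n_steps = 50, 30

# Random recurrent weights with spectral radius well above 1, i.e., a
# "poorly trained" regime in which perturbations are not damped out.
W = rng.normal(scale=2.0 / np.sqrt(n_hidden), size=(n_hidden, n_hidden))
x = 0.5 * rng.normal(size=(n_steps, n_hidden))   # arbitrary input sequence

def run(h0):
    """Run a vanilla tanh RNN from initial state h0 and record the states."""
    h, states = h0, []
    for t in range(n_steps):
        h = np.tanh(W @ h + x[t])
        states.append(h.copy())
    return np.array(states)

h_clean = np.zeros(n_hidden)
h_noisy = h_clean + 0.01 * rng.normal(size=n_hidden)  # tiny perturbation

gap = np.linalg.norm(run(h_clean) - run(h_noisy), axis=1)
print(gap[0], gap[-1])  # the gap between trajectories typically grows
```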
To suppress noise, we introduce attractor dynamics that operate between
steps of the sequence to regularize the hidden state. The attractor
dynamics are trained on a task-orthogonal objective that iteratively denoises
hidden states, analogous to a denoising autoencoder (Vincent, Larochelle,
Bengio, & Manzagol, 2008).
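As a concrete but hypothetical rendering of this idea, the PyTorch sketch below interposes a small attractor network between recurrent updates and trains it with a denoising objective. The class names, the fixed number of attractor iterations, and the noise level are illustrative assumptions, not the project's actual implementation.

```python
import torch
import torch.nn as nn

class AttractorCell(nn.Module):
    """Iterative map whose fixed points act as attractors that pull a
    noisy hidden state toward a 'familiar' clean state."""
    def __init__(self, n_hidden, n_iters=5):
        super().__init__()
        self.W = nn.Linear(n_hidden, n_hidden)
        self.n_iters = n_iters

    def forward(self, h):
        a = torch.zeros_like(h)
        for _ in range(self.n_iters):
            a = torch.tanh(self.W(a) + h)   # h acts as a constant input bias
        return a

class SDRNN(nn.Module):
    def __init__(self, n_input, n_hidden, n_classes):
        super().__init__()
        self.cell = nn.GRUCell(n_input, n_hidden)
        self.attractor = AttractorCell(n_hidden)
        self.readout = nn.Linear(n_hidden, n_classes)

    def forward(self, x):                    # x: (seq_len, batch, n_input)
        h = x.new_zeros(x.size(1), self.cell.hidden_size)
        for x_t in x:
            h = self.cell(x_t, h)            # ordinary recurrent update
            h = self.attractor(h)            # denoise the state between steps
        return self.readout(h)

def denoising_loss(attractor, clean_h, noise_std=0.25):
    """Task-orthogonal objective: map noise-corrupted copies of hidden
    states back to the clean states, as in a denoising autoencoder."""
    noisy = clean_h + noise_std * torch.randn_like(clean_h)
    return ((attractor(noisy) - clean_h) ** 2).mean()
```

In such a setup, the attractor weights would be trained only on denoising_loss applied to hidden states collected during the task forward pass, while the GRU and readout weights are trained on the usual classification loss, keeping the denoising objective orthogonal to the task objective.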
In initial experiments on four sequence classification tasks, we have shown
that this state-denoised recurrent net (SDRNN), which projects the
hidden state onto 'familiar' attractors, obtains better out-of-sample
performance than both a vanilla RNN (either tanh- or GRU-based) and the
same architecture trained without the denoising objective.
Students
Denis Kazakov (Computer Science, Boulder)