A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching

Jacob Whitehill and Javier Movellan
Machine Perception Laboratory
University of California, San Diego
{ jake, movellan }@mplab.ucsd.edu

For over half a century, computer scientists and psychologists have strived to build machines that teach humans automatically, sometimes dubbed "intelligent tutoring systems" (ITS). The earliest such systems focused on "flashcard"-style vocabulary learning, while more modern ITS can tutor students in diverse subjects such as high school geometry, physics, algebra, and computer programming. Compared to human tutors, most contemporary ITS still use a rather impoverished set of low-bandwidth sensors consisting of mouse clicks, keyboard strokes, and (more recently) touch events. In contrast, human teachers utilize not only students' explicit answers to practice problems and test questions, but also auditory and visual information about the students' emotional, or "affective", states to make decisions. It is possible that, if automated teaching systems were "affect-sensitive" and could reliably detect and respond to their students' emotions, then they could teach even more effectively. 

Affect-sensitive teaching systems have emerged as a hot topic within the ITS community over the last five years. However, the benefits of affect-sensitivity to teaching and ITS are not well understood, and harnessing affective state information to achieve superior learning gains has so far proved an elusive goal (D'Mello, et al. 2010). To date, the existing affect-sensitive ITS have been built using hand-crafted sets of rules that map students' detected emotional states into actions (Woolf, et al. 2009; D'Mello, et al. 2010). The efficacy of these rule-based approaches is unclear, however, and as the bandwidth and number of "affective sensors" (e.g., web cameras, heart rate monitors) grow, it will become increasingly difficult to construct such rule sets.

Instead of rule-based approaches to affect-sensitive teaching, a principled computational framework for decision-making such as stochastic optimal control theory may be useful. Optimal control theory provides mathematical infrastructure to define affect-sensitive teaching as an optimization problem as well as computational tools to solve the optimization problem. The Partially Observable Markov Decision Process (POMDP), in particular, is a useful framework for integrating noisy sensor observations from the student, including keyboard presses, touch events, and emotion data captured through a webcam, into the decision-making process in order to minimize some cost. However, given the well-known intractability issues of computing optimal POMDP policies exactly, more research is needed on how to find approximately optimal teaching policies that work well in practice.
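As a rough illustration of how a POMDP-based teacher can fold a noisy observation into its estimate of the student's state, the following minimal sketch implements one step of a discrete Bayes filter. The two-state student (word not learned / learned) and all probabilities in TEACH and ANSWER are hypothetical values chosen for illustration; they are not taken from the system described in this abstract.

```python
def belief_update(belief, transition, likelihood, obs):
    """One step of a discrete Bayes filter.

    belief[s]          : P(state = s) before the teaching action
    transition[s][s2]  : P(next state = s2 | state = s) under the chosen action
    likelihood[s2][o]  : P(observation = o | next state = s2)
    obs                : index of the observation actually received
    """
    n = len(belief)
    # Prediction: push the belief through the action's transition model.
    predicted = [sum(belief[s] * transition[s][s2] for s in range(n))
                 for s2 in range(n)]
    # Correction: weight each state by how well it explains the observation.
    unnormalized = [predicted[s2] * likelihood[s2][obs] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalized]


# Hypothetical two-state student: 0 = word not yet learned, 1 = word learned.
# Teaching moves an unlearned word to learned with probability 0.3 (assumed).
TEACH = [[0.7, 0.3],
         [0.0, 1.0]]
# Observation 0 = wrong answer, 1 = correct answer (assumed guess/slip rates).
ANSWER = [[0.75, 0.25],
          [0.10, 0.90]]

prior = [0.5, 0.5]
after_correct = belief_update(prior, TEACH, ANSWER, obs=1)
after_wrong = belief_update(prior, TEACH, ANSWER, obs=0)
```

A correct answer raises the teacher's belief that the word has been learned, and a wrong answer lowers it. Under the same scheme, an affective sensor reading could be treated as one more observation channel with its own likelihood table, although that is only one possible design.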

In this talk we present a prototype ITS based on POMDPs that teaches students foreign language vocabulary by image association, in the manner of Rosetta Stone and Duolingo. The system's controller was developed by modeling the student as a Bayesian learner and then employing a policy gradient approach to optimize the teacher's control policy in simulation. In contrast to previously used forward-search methods (Rafferty, et al. 2011), this approach shifts the computational burden of planning offline, thus allowing for deeper search and possibly better policies. In an experiment on 90 human subjects in which the dependent variable was time-to-mastery, the optimized control policy outperformed two baseline controllers. In addition, we propose and demonstrate in simulation a simple architecture for how affective sensor inputs on the student's "engagement" can be integrated into the decision-making process so as to increase learning efficiency. This result represents, to our knowledge, the first computational account of how affect sensitivity can benefit teaching.
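To make the idea of optimizing a teaching policy offline by policy gradient more concrete, here is a toy sketch using the REINFORCE estimator with a mean-cost baseline. It is not the controller described above: the simulated student, the logistic "stop teaching" policy, and the constants P_LEARN, STOP_PENALTY, and MAX_STEPS are all assumptions made purely for illustration. The toy also gives the teacher no observations at all, so the belief is updated from the assumed learning rate alone.

```python
import math
import random

P_LEARN = 0.3        # assumed chance that one teaching step succeeds
STOP_PENALTY = 10.0  # assumed cost of ending before the word is learned
MAX_STEPS = 30       # assumed cap on lesson length


def stop_probability(theta, belief):
    """Logistic policy: probability of ending the lesson given P(learned)."""
    z = theta[0] + theta[1] * belief
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def run_episode(theta, rng):
    """Simulate one lesson; return (total cost, gradient of the log-probability
    of the actions taken with respect to theta)."""
    learned, belief, cost = False, 0.0, 0.0
    grad = [0.0, 0.0]
    for _ in range(MAX_STEPS):
        p_stop = stop_probability(theta, belief)
        stop = rng.random() < p_stop
        # For a logistic policy, d log pi / d theta = (action - p_stop) * features.
        coeff = (1.0 if stop else 0.0) - p_stop
        grad[0] += coeff
        grad[1] += coeff * belief
        if stop:
            break
        cost += 1.0  # each teaching step costs one unit of time
        learned = learned or rng.random() < P_LEARN
        belief += (1.0 - belief) * P_LEARN  # teacher's belief, no observation
    if not learned:
        cost += STOP_PENALTY
    return cost, grad


def train(iterations=300, batch=50, lr=0.05, seed=0):
    """REINFORCE with a mean-cost baseline, run entirely in simulation."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(iterations):
        episodes = [run_episode(theta, rng) for _ in range(batch)]
        baseline = sum(c for c, _ in episodes) / batch
        for i in range(2):
            g = sum((c - baseline) * grad[i] for c, grad in episodes) / batch
            theta[i] -= lr * g  # gradient descent: we minimize expected cost
    return theta
```

Calling train() returns the two policy parameters. All of the simulation cost is paid before the policy is deployed, so at teaching time the controller only needs to evaluate stop_probability, which is the sense in which planning is shifted offline.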


S.K. D’Mello, R.W. Picard, and A.C. Graesser. Towards an affect-sensitive AutoTutor. IEEE Intelligent Systems, Special issue on Intelligent Educational Systems, 22(4), 2007.

Anna Rafferty, Emma Brunskill, Thomas Griffiths, and Patrick Shafto. Faster teaching by POMDP planning. In Artificial Intelligence in Education, 2011.

Beverly Woolf, Winslow Burleson, Ivon Arroyo, Toby Dragon, David Cooper, and Rosalind Picard. Affect-aware tutors: recognising and responding to student affect. International Journal of Learning Technology, 4(3):129–164, 2009.