Machine Learning, Causal Model Search, and Educational Data
Carnegie Mellon University
Many of the questions in educational research are causal. In observational studies involving cognitive tutors or online courses, we often want to know not just whether certain kinds of student behaviors (hint requests, worked examples seen, etc.) are associated with learning outcomes, but whether they cause them. In experimental studies, where treatment is randomized and we have good statistical evidence on whether treatment has an overall effect on learning, we often want to know about the mechanisms by which treatment might influence learning. In these cases and others, machine learning can help, and has already begun to do so.
In the last two decades, enormous progress has been made in formalizing statistical causal models (Pearl, 2000) and in applying machine learning to search for causal models (Spirtes, Glymour and Scheines, 2000). This technology has begun to be used on educational data, but its potential has barely been tapped. In this workshop, I would like to present an overview of the work on machine learning for causal models of educational data carried out over the last decade, and to make the case for the great potential that I think this technology holds going forward.
Qualitative causal structure over a given set of
variables can be represented by a directed graph (a causal graph), and
a quantitative causal model can be specified by parameterizing the
conditional distributions of each variable on its immediate parents in
the causal graph. The problem, in a nutshell, is that the number of
causal graphs grows
Appointment: Department of Philosophy, Secondary Appointments: Machine
2See for example, Spirtes, et al, 2000; Silva, Scheines, Glymour, and Spirtes, 2006;
5Ramsey, et al, 2010.
6Chu and Glymour, 2008.
research,9 and many
other disciplines. In education research, Scheines, Leinhardt, Cho, and
Smith (2005) analyzed log data from an online course and found evidence
that printing requests inhibited voluntary interactive comprehension
checks, which in turn positively influenced learning outcomes. In a
follow-up study in which they intervened to break the link between printing
and comprehension checks, the results were confirmed and learning improved.
In a 2007 paper, Laski and Siegler used causal model search to examine
the mechanisms by which students learn numerical magnitude. In 2008,
Shih, Koedinger, and Scheines found that the time students spent in
reading or reacting to “bottom-out hints” in a geometry
tutor indicated whether the student was treating the hint as a way to
avoid thinking (gaming) or as a worked example. Frequent gaming led to
poor learning outcomes while frequent use of hints as worked examples
led to good learning outcomes. In 2010, Shih, Koedinger, and Scheines
used machine learning to construct Hidden Markov models of student
strategies in hint use. In a study on a computerized fractions tutor
for elementary students, Rau and Scheines (2012) used causal models to
examine the mechanisms by which multiple representations of a fraction
If educational researchers have causal questions and data, then machine learning for causal structure can likely be scientifically useful. Software is freely available in a number of forms,11 and research in the methodology has exploded over the last decade.
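The earlier point about causal graphs, and about how quickly the space of candidate graphs grows, can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the three-variable graph is a hypothetical toy that echoes the online-course example (printing, comprehension checks, learning outcome), the names `parents`, `is_acyclic`, and `count_dags` are invented here, and the count uses Robinson's recurrence for labeled directed acyclic graphs, which is a standard combinatorial result and not a method taken from the studies cited above.

```python
from functools import lru_cache
from math import comb

# Hypothetical toy causal graph (an assumption for illustration only):
# printing -> comprehension_checks -> learning_outcome.
# Each variable maps to the list of its immediate parents.
parents = {
    "printing": [],
    "comprehension_checks": ["printing"],
    "learning_outcome": ["comprehension_checks"],
}


def is_acyclic(parent_map):
    """Return True if the parent map describes a directed acyclic graph."""
    remaining = {v: set(ps) for v, ps in parent_map.items()}
    while remaining:
        # Variables all of whose parents have already been removed.
        roots = [v for v, ps in remaining.items() if not ps]
        if not roots:
            return False  # every remaining variable has a parent: a cycle
        for r in roots:
            del remaining[r]
        for ps in remaining.values():
            ps.difference_update(roots)
    return True


@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labeled DAGs on n variables, via Robinson's recurrence."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    print("toy graph is acyclic:", is_acyclic(parents))
    for n in range(1, 8):
        print(n, "variables:", count_dags(n), "possible DAGs")
```

Even at five variables there are 29,281 distinct acyclic graphs, which is why exhaustive enumeration is impractical for realistic variable sets and automated search becomes attractive.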
Arnold, A., Beck, J., and Scheines, R. (2006). "Feature Discovery in the Context of Educational Data Mining: An Inductive Approach." Proceedings of the AAAI 2006 Workshop on Educational Data Mining, Boston, MA.
Chu, T., & Glymour, C. (2008). Search for Additive Nonlinear Time Series Causal Models. Journal of Machine Learning Research,
8Jackson and Scheines, 2005.
9Scheines, Leinhardt, Cho, and Smith, 2005.
10Other researchers have begun to use causal modeling on educational data. For example, Joe Beck (at WPI), who has pioneered machine learning for education, has explored causal modeling and search (Dai and Beck, 2011).
11See, for example, Tetrad: www.phil.cmu.edu/projects/tetrad/ .
Jackson, A., and Scheines, R. (2005). “Single
Laski, E. V., & Siegler, R. S. (2007). Is 27 a big number? Correlational and causal connections among numerical categorization, number line estimation, and numerical magnitude comparison. Child Development, 78,
Pearl, J. (2000). Causality: Models, Reasoning, and Inference, Cambridge University Press.
Ramsey, J., Hanson, S., Hanson, C., Halchenko, Y., Poldrack, R., Glymour, C. (2010). Six problems for causal inference from fMRI. NeuroImage, 49,
Rau, M., and Scheines, R. (2012, forthcoming). Searching for Variables and Models to Investigate Mediators of Learning from Multiple Representations, in Proceedings of the 5th International Conference on Educational Data Mining (EDM 2012).
Scheines, R. (2002). “Estimating Latent Causal Influences: TETRAD III Variables Selection and Bayesian Parameter Estimation: Lead and IQ,” Handbook of Data Mining and Knowledge Discovery, Pat Hayes, editor, Oxford University Press, 944-952.
Scheines, R., Leinhardt, G., Smith, J., and Cho, K. (2005). "Replacing Lecture with Web-Based Course Materials," Journal of Educational Computing Research, 32, 1,
Shih, B., Koedinger, K. R., and Scheines, R. (2010). "Discovery of Learning Tactics using Hidden Markov Model Clustering," in Proceedings of the 3rd International Conference on Educational Data Mining.
Shih, B., Koedinger, K., & Scheines, R. (2008). A Response Time Model for Bottom-Out Hints as Worked Examples. Proceedings of the First Educational Data Mining Conference. (Best Paper Award).
Shipley, W. (2000). Cause and Correlation in Biology: A User's Guide to Path Analysis, Structural Equations, and Causal Inference, Cambridge.
Silva, R., Scheines, R., Glymour, C., and Spirtes, P. (2006). “Learning the Structure of Linear Latent Structure Models,” Journal of Machine Learning Research,
Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, Prediction, and Search, 2nd edition, MIT Press, Boston.