Michael C. Mozer

Selected Publications

These documents are protected by various copyright laws, but in each case I am allowed to distribute copies to individuals for personal, research use. Your click on any of the links below constitutes your request to me for a personal copy of the linked article, and my delivery of a personal copy. Any other use is prohibited.

2025

Didolkar, A., Zadaianchuk, A., Goyal, A., Mozer, M. C., Bengio, Y., Martius, G., & Seitzer, M. (2025). Zero-shot object-centric learning. International Conference on Learning Representations (ICLR). Also arXiv.org:2408.09162 [cs.CV]
Lepori, M. A., Mozer, M. C., & Ghandeharioun, A. (2025). Racing thoughts: Explaining large language model contextualization errors. NAACL Conference. Also arXiv.org:2410.02102 [cs.CL]
Schoepf, S., Mozer, M. C., Mitchell, N., Brintrup, A., Kaissis, G., Kairouz, P., and Triantafillou, E. (2025). Redirection for erasing memory (REM): Towards a universal unlearning method for corrupted data. arXiv.org:2505.17730 [cs.LG]
Siddiqui, S. A., Weller, A., Krueger, D., Dziugaite, G. K., Mozer, M. C., & Triantafillou, E. (2025). From dormant to deleted: Tamper-resistant unlearning through weight-space regularization. arXiv.org:2505.22310 [cs.LG]

2024

2023

2022

2021

Didolkar, A., Goyal, A., Ke, N. R., Blundell, C., Beaudoin, P., Heess, N., Mozer, M. C., & Bengio, Y. (2021). Neural production systems. In Advances in Neural Information Processing Systems 34. Also arXiv:2103.01937 [cs.AI]
Goyal, A., Lamb, A., Gampa, P., Beaudoin, P., Levine, S., Blundell, C., Bengio, Y., & Mozer, M. C. (2021). Object files and schemata: Factorizing declarative and procedural knowledge in dynamical systems. International Conference on Learning Representations. Also arXiv:2006.16225 [cs.LG]
Iuzzolino, M. L., Mozer, M. C., & Bengio, S. (2021). Improving anytime prediction with parallel cascaded networks and a temporal-difference loss. In Advances in Neural Information Processing Systems 34. Also arXiv:2102.09808 [cs.LG] [code repository]
Jiang, Z., Zhang, C., Talwar, K., & Mozer, M. C. (2021). Characterizing structural regularities of labeled data in overparameterized models. Proceedings of the 38th International Conference on Machine Learning, PMLR 139:5034-5044. Also arXiv:2002.03206 [cs.LG]. [code, images, results]
Karandikar, A., Cain, N., Tran, D., Lakshminarayanan, B., Shlens, J., Mozer, M. C., & Roelofs, B. (2021). Soft calibration objectives for neural networks. In Advances in Neural Information Processing Systems 34. Also arXiv.org:2108.00106 [cs.LG]
Kim, B., Reif, E., Wattenberg, M., Bengio, S., & Mozer, M. C. (2021). Neural networks trained on natural scenes exhibit Gestalt closure. Computational Brain and Behavior, 4(3), 251-263. https://doi.org/10.1007/s42113-021-00100-7
Kim, D. Y. J., Scott, T. R., Mallick, D., & Mozer, M. C. (2021). Using semantics of textbook highlights to predict student comprehension and knowledge retention. In S. Sosnovsky, P. Brusilovsky, R. G. Baraniuk, & A. S. Lan (Eds.), Proceedings of the Third International Workshop on Intelligent Textbooks (iTextbooks) (pp. 108-120). Springer.
Lamb, A., Goyal, A., Stowik, A., Mozer, M. C., Beaudoin, P., & Bengio, Y. (2021). Neural function modules with sparse arguments: A dynamic approach to integrating information across layers. AISTATS 2021. Also arXiv:2010.08012 [cs.LG]
Li, Z., Mozer, M. C., & Whitehill, J. (2021). Compositional embeddings for multi-label one-shot learning. IEEE Winter Conference on Applications of Computer Vision. Also arXiv:2002.04193 [cs.LG]
Liu, D., Lamb, A., Kawaguchi, K., Goyal, A., Sun, C., Mozer, M. C., & Bengio, Y. (2021). Discrete-valued neural communication in structured architectures enhances generalization. In Advances in Neural Information Processing Systems 34. Also arXiv.org:2107.02367 [cs.LG]
Mozer, F. S., Bale, S. D., Bonnell, J. W., Drake, J. F., Hanson, E. L. M., & Mozer, M. C. (2021). On the origin of switchbacks observed in the solar wind. Journal of Astrophysics, 919:60, 1-10.
Ren, M., Iuzzolino, M. L., Mozer, M. C., & Zemel, R. S. (2021). Wandering within a world: Online contextualized few-shot learning. International Conference on Learning Representations. Also arXiv:2007.04546 [cs.LG]
Ren, M., Scott, T. R., Iuzzolino, M. L., Mozer, M. C., & Zemel, R. S. (2021). Online unsupervised learning of visual representations and categories. arXiv.org:2109.05675 [cs.LG]
Roads, B. D., & Mozer, M. C. (2021). Predicting the ease of human category learning using radial basis function networks. Neural Computation, 33, 376-397.
Scherrer, N., Bilaniuk, O., Annadani, Y., Goyal, A., Schwab, P., Schoelkopf, B., Mozer, M. C., Bengio, Y., & Ke, N. R. (2021). Learning neural causal models with active interventions. NeurIPS Workshop on Causal Inference and Machine Learning (WHY-21). Also arXiv.org:2109.02429 [stat.ML]
Scott, T. R., Gallagher, A. C., & Mozer, M. C. (2021). Von Mises-Fisher loss: An exploration of embedding geometries for supervised learning. Proceedings of the IEEE/CVF International Conference on Computer Vision. Also arXiv:2103.15718 [cs.LG]
Teterwak, P., Zhang, C., Krishnan, D., & Mozer, M. C. (2021). Understanding invariance via feedforward inversion of discriminatively trained classifiers. Proceedings of the 38th International Conference on Machine Learning, PMLR 139:10225-10235. Also arXiv:2103.07470 [cs.LG] [code: robust model] [code: non-robust model]

2020

Attarian, M., Roads, B. D., & Mozer, M. C. (2020). Transforming neural network representations to predict human judgments of similarity. Workshop on Shared Visual Representations in Human and Machine Intelligence (SVRHM 2020). Also arXiv:2010.06512 [cs.NE]
Beckage, N., Colunga, E., & Mozer, M. C. (2020). Quantifying the role of vocabulary knowledge in predicting future word learning. IEEE Transactions on Cognitive and Developmental Systems, 12, 148-159. DOI:10.1109/TCDS.2019.2928023
Davidson, G., and Mozer, M. C. (2020). Sequential mastery of multiple tasks: Networks naturally learn to learn and forget to forget. IEEE Conference on Computer Vision and Pattern Recognition, 9282-9293. Also arXiv:1905.10837 [cs.LG]
Kim, D. Y. J., Winchell, A., Waters, A. E., Grimaldi, P. J., Baraniuk, R., & Mozer, M. C. (2020). Inferring student comprehension from highlighting patterns in digital textbooks: An exploration in an authentic learning platform. In S. Sosnovsky, P. Brusilovsky, R. G. Baraniuk, & A. S. Lan (Eds.), Second Workshop on Intelligent Textbooks, Springer.
Mittal, S., Lamb, A., Goyal, A., Voleti, V., Shanahan, M., Lajoie, G., Mozer, M. C., & Bengio, Y. (2020). Learning to combine top-down and bottom-up signals in recurrent neural networks with attention over modules. International Conference on Machine Learning.
Winchell, A., Lan, A., and Mozer, M. C. (2020). Textbook highlights as an early predictor of student learning. Cognitive Science: A Multidisciplinary Journal. 44: e12901. doi:10.1111/cogs.12901
Zhang, C., Bengio, S., Hardt, M., Mozer, M. C., and Singer, Y. (2020). Identity crisis: Memorization and generalization under extreme overparameterization. In International Conference on Learning Representations (ICLR 2020). Also arXiv:1902.04698 [stat.ML]

2019

Iuzzolino, M., Singer, Y., and Mozer, M. C. (2019). Convolutional bipartite attractor networks. arXiv:1906.03504 [cs.LG]
Lamb, A., Binas, J., Goyal, A., Subramanian, S., Mitliagkas, I., Kazakov, D., Bengio, Y., & Mozer, M. C. (2019). State-reification networks: Improving generalization by modeling the distribution of hidden representations. Proceedings of the 36th International Conference on Machine Learning, 97, 3622-3631.
Mozer, M. C., Wiseheart, M., and Novikoff, T. (2019). Artificial intelligence to support human instruction. Proceedings of the National Academy of Sciences, 116 (10), 3953-3955. doi:10.1073/pnas.1900370116
Ridgeway, K., & Mozer, M. C. (2019). Open-ended content-style recombination via leakage filtering. arXiv.org:1810.00110v1 [cs.LG]
Roads, B. D., & Mozer, M. C. (2019). Obtaining psychological embeddings through joint kernel and metric learning. Behavior Research Methods, 51, 2180-2193. doi:10.3758/s13428-019-01285-3.
Scott, T. R., Ridgeway, K., and Mozer, M. C. (2019). Stochastic prototype embeddings. arXiv:1909.11702 [stat.ML]
Sense, F., Jastrzembski, T., Mozer, M. C., Krusmark, M., and van Rijn, H. (2019). Perspectives on computational models of learning and forgetting. Proceedings of the Seventeenth International Conference on Cognitive Modeling (53-58). State College, PA: Applied Cognitive Science Lab.

2018

Ke, N. R., Goyal, A., Bilaniuk, O., Binas, J., Mozer, M. C., Pal, C., & Bengio, Y. (2018). Sparse attentive backtracking: Temporal credit assignment through reminding. In S. Bengio et al. (Eds.), Advances in Neural Information Processing Systems 31 (pp. 7651-7662). Curran Associates.
Khajah, M. M., Mozer, M. C., Kelly, S., & Milne, B. (2018). Boosting engagement with educational software using near wins. In C. Rosé et al. (Eds.), Nineteenth International Conference on Artificial Intelligence in Education (pp. 171-175). Springer. [short version]
Lindsey, R., Daluski, A., Chopra, S., Lachapelle, A., Mozer, M., Sicular, S., Hanel, D., Gardner, M., Gupta, A., Hotchkiss, R., & Potter, H. (2018). A deep neural network improves fracture detection by clinicians. Proceedings of the National Academy of Sciences, 115, 11591-11596. DOI: 10.1073/pnas.1806905115.
Montero, S., Arora, A., Kelly, S., Milne, B., & Mozer, M. C. (2018). Does deep knowledge tracing model interactions among skills? In K. E. Boyer & M. Yudelson (Eds.), Proceedings of the 11th International Conference on Educational Data Mining (pp. 462-466). EDM Society Press.
Mozer, M. C., Kazakov, D., & Lindsey, R. V. (2018). State denoised recurrent neural networks. arXiv:1805.08394 [cs.NE]
Ridgeway, K., & Mozer, M. C. (2018). Learning deep disentangled representations with the F-statistic loss. In S. Bengio et al. (Eds.), Advances in Neural Information Processing Systems 31 (pp. 185-194). Curran Associates. Also as arXiv:1802.05312v2 [cs.LG]
Scott, T. R., Ridgeway, K., & Mozer, M. C. (2018). Adapted deep embeddings: A synthesis of methods for k-shot inductive transfer learning. In S. Bengio et al. (Eds.), Advances in Neural Information Processing Systems 31 (pp. 76-85). Curran Associates. Also arXiv:1805.08402 [cs.LG]
Vatterott, D. B., Mozer, M. C., & Vecera, S. P. (2018). Rejecting salient distractors: Generalization from experience. Attention, Perception, and Psychophysics, 80, 485-499. DOI:10.3758/s13414-017-1465-8
Winchell, A., Mozer, M. C., Lan, A., Grimaldi, P., & Pashler, H. (2018). Can textbook annotations serve as an early predictor of student learning? In K. E. Boyer & M. Yudelson (Eds.), Proceedings of the 11th International Conference on Educational Data Mining (pp. 431-437). EDM Society Press.

2017

Kneusel, R. T., & Mozer, M. C. (2017). Improving human-machine cooperative visual search with soft highlighting. ACM Transactions on Applied Perception, 15, 3:1-3:21. Also arXiv:1612.08117 [cs.HC]
Mozer, M. C., & Lindsey, R. V. (2017). Predicting and improving memory retention: Psychological theory matters in the big data era. In M. Jones (Ed.), Big Data in Cognitive Science (pp. 34-64). New York: Routledge.
Mozer, M. C., Kazakov, D., & Lindsey, R. V. (2017). Discrete-event continuous-time recurrent networks. arXiv:1710.04110 [cs.NE].
Ridgeway, K., Mozer, M. C., & Bowles, A. (2017). Forgetting of foreign language skills: A corpus-based analysis of online tutoring software. Cognitive Science: A Multidisciplinary Journal, 41(4), 924-949. DOI: 10.1111/cogs.12385
Roads, B. D., & Mozer, M. C. (2017). Improving human-machine cooperative classification via cognitive theories of similarity. Cognitive Science: A Multidisciplinary Journal, 41, 1394-1411. DOI: 10.1111/cogs.12400. [slides from NIPS 2016 Interactive Machine Learning workshop]
Snell, J., Ridgeway, K., Liao, R., Roads, B. D., Mozer, M. C., & Zemel, R. S. (2017). Learning to generate images with perceptual similarity metrics. IEEE International Conference on Image Processing. arXiv:1511.06409v3 [cs.LG]

2016

Khajah, M., Lindsey, R. V., & Mozer, M. C. (2016). How deep is knowledge tracing? In T. Barnes, M. Chi, & M. Feng (Eds.), Proceedings of the Ninth International Conference on Educational Data Mining (pp. 94-101). Educational Data Mining Society Press. *Awarded Best Paper at EDM2016*
[our code for extended BKT used in the paper]
[our implementation of DKT]
Khajah, M., Roads, B. D., Lindsey, R. V., Liu, Y.-E., & Mozer, M. C. (2016). Designing engaging games using Bayesian optimization. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (pp. 5571-5582). New York: ACM.
Mozer, M. C., Lindsey, R. V., & Kazakov, D. (2016). Neural Hawkes process memories. Paper in preparation. [Slides from NIPS 2016 Symposium on Recurrent Neural Networks]
Roads, B. D., Mozer, M. C., & Busey, T. A. (2016). Using highlighting to train attentional expertise. PLoS ONE 11(1): e0146266. doi:10.1371/journal.pone.0146266
Wilson, K. H., Xiong, X., Khajah, M., Lindsey, R. V., Zhao, S., Karklin, Y., Van Inwegen, E. G., Han, B., Ekanadham, C., Beck, J. E., Heffernan, N., & Mozer, M. C. (2016). Estimating student proficiency: Deep learning is not the panacea. In R. G. Baraniuk, J. Ngiam, C. Studer, P. Grimaldi, & A. S. Lan (Eds.), Proceedings of the 2016 NIPS Workshop on Machine Learning for Education. [slides from NIPS 2016 Machine Learning for Education workshop]

2015

Beckage, N., Mozer, M. C., & Colunga, E. (2015). Predicting a child's trajectory of lexical acquisition. In D. C. Noelle et al. (Eds.), Proceedings of the 37th Annual Conference of the Cognitive Science Society (pp. 196-201). Austin, TX: Cognitive Science Society.

2014

Kang, S. H. K., Lindsey, R. V., Mozer, M. C., & Pashler, H. (2014). Retrieval practice over the long term: Should spacing be expanding or equal-interval? Psychonomic Bulletin & Review, 21, 1544-50.
Khajah, M., Huang, Y., Gonzales-Brenes, J. P., Mozer, M. C., & Brusilovsky, P. (2014). Integrating knowledge tracing and item response theory: A tale of two frameworks. In M. Kravcik, O. C. Santos, J. G. Boticario (Eds.), Proceedings of the 4th International Workshop on Personalization Approaches in Learning Environments (pp. 7-15). CEUR Workshop Proceedings, ISSN 1613-0073.
Khajah, M., Lindsey, R., & Mozer, M. C. (2014). Maximizing students' retention via spaced review: Practical guidance from computational models of memory. Topics in Cognitive Science, 6, 157-169. *Awarded Cognitive Modeling Prize at CogSci2013*
Khajah, M., Wing, R. M., Lindsey, R. V., & Mozer, M. C. (2014). Incorporating latent factors into knowledge tracing to predict individual differences in learning. In J. Stamper, Z. Pardos, M. Mavrikis, & B. M. McLaren (Eds.), Proceedings of the 7th International Conference on Educational Data Mining (pp. 99-106). Educational Data Mining Society Press. *Awarded Best Paper at EDM2014*
Lindsey, R. V., & Mozer, M. C. (2014). Predicting individual differences in student learning via collaborative filtering.
Lindsey, R. V., Khajah, M., & Mozer, M. C. (2014). Automatic discovery of cognitive skills to improve the prediction of student learning. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 27 (pp. 1386-1394). La Jolla, CA: Curran Associates Inc.
[our code for BKT with skill assignments]
Lindsey, R. V., Shroyer, J. D., Pashler, H., & Mozer, M. C. (2014). Improving students' long-term knowledge retention with personalized review. Psychological Science, 25, 639-647. doi: 10.1177/0956797613504302.

2013

Chukoskie, L., Snider, J., Mozer, M. C., Krauzlis, R. J., & Sejnowski, T. J. (2013). Learning where to look: An empirical, computational, and theoretical account of hidden target search performance. Proceedings of the National Academy of Sciences, 110, 10438-445.
Jones, M., Curran, T., Mozer, M. C., & Wilder, M. H. (2013). Sequential effects in response time reveal learning mechanisms and event representations. Psychological Review, 120, 628-666.
Lindsey, R., Mozer, M. C., Huggins, W. J., & Pashler, H. (2013). Optimizing instructional policies. In C. J. C. Burges et al. (Eds.), Advances in Neural Information Processing Systems 26 (pp. 2778-2786). La Jolla, CA: Curran Associates, Inc.
Pashler, H., & Mozer, M. C. (2013). When does fading enhance perceptual category learning? Journal of Experimental Psychology: Learning, Memory, and Cognition, 39, 1162-73.
Pashler, H., Kang, S., & Mozer, M. C. (2013). Reviewing erroneous information facilitates memory updating. Cognition, 128(3), 424-430.
Wilder, M. H., Jones, M., Ahmed, A., Curran, T., & Mozer, M. C. (2013). The persistent impact of incidental experience. Psychonomic Bulletin and Review, 20, 1221-1231.

2012

Doshi, A., Tran, C., Wilder, M., Mozer, M. C., & Trivedi, M. M. (2012). Sequential dependencies in driving. Cognitive Science, 36, 948-963.
Lee, H., Mozer, M. C., Kramer, A. F., & Vecera, S. P. (2012). Object-based control of attention is sensitive to recent experience. Journal of Experimental Psychology: Human Perception and Performance, 38, 314-325.
Lindsey, R., Polsdofer, E., Mozer, M. C., Kang, S. H. K., & Pashler, H. (2012). Long-term recency is nothing more than ordinary forgetting. Unpublished manuscript.
Mozer, M. C., Pashler, H., Lindsey, R. V., & Jones, J. (2012). Efficient training of visual search via attentional highlighting. Unpublished manuscript.

2011

Kang, S. H. K., Pashler, H., Cepeda, N. J., Rohrer, D., Carpenter, S. K., & Mozer, M. C. (2011). Does incorrect guessing impair fact learning? Journal of Educational Psychology, 103, 48-59.
Kinoshita, S., Mozer, M. C., & Forster, K. I. (2011). Dynamic adaptation to history of trial difficulty explains the effect of congruency proportion on masked priming. Journal of Experimental Psychology: General, 140, 622-636.
Link, B. V., Kos, B., Wager, T. D., & Mozer, M. C. (2011). Past experience influences judgment of pain: Prediction of sequential dependencies. In L. Carlson, C. Hoelscher, & T. F. Shipley (Eds.), Proceedings of the 33rd Annual Conference of the Cognitive Science Society (pp. 1248-1253). Austin, TX: Cognitive Science Society.
Mozer, M. C., Link, B. V., & Pashler, H. (2011). An unsupervised decontamination procedure for improving the reliability of human judgments. In J. Shawe-Taylor, R. S. Zemel, P. Bartlett, F. Pereira, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 24 (pp. 1791-1799). La Jolla, CA: NIPS Foundation.
Wilder, M. H., Mozer, M. C., & Wickens, C. D. (2011). An integrative, experience-based theory of attentional control. Journal of Vision, 11, 1-30.

2010

Lindsey, R., Lewis, O., Pashler, H., & Mozer, M. C. (2010). Predicting students' retention of facts from feedback during training. In S. Ohlsson & R. Catrambone (Eds.), Proceedings of the 32nd Annual Conference of the Cognitive Science Society (pp. 2332-2337). Austin, TX: Cognitive Science Society.
Mozer, M. C., Pashler, H., Wilder, M., Lindsey, R., Jones, M. C., & Jones, M. N. (2010). Decontaminating human judgments to remove sequential dependencies. In J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R. S. Zemel, & A. Culotta (Eds.), Advances in Neural Information Processing Systems 23 (pp. 1705-1713). La Jolla, CA: NIPS Foundation.
Wilder, M. H., Ahmed, A. A., Mozer, M. C., & Jones, M. (2010). Sequential effects in motor adaptation: The importance of far back trials. Poster presentation at Society For Neuroscience. San Diego, CA, November 15, 2010.
Wilder, M., Jones, M., & Mozer, M. C. (2010). Sequential effects reflect parallel learning of multiple environmental regularities. In Y. Bengio, D. Schuurmans, J. Lafferty, C.K.I. Williams, & A. Culotta (Eds.), Advances in Neural Information Processing Systems 22 (pp. 2053-2061). La Jolla, CA: NIPS Foundation.

2009

Cepeda, N. J., Coburn, N., Rohrer, D., Wixted, J. T., Mozer, M. C., & Pashler, H. (2009). Optimizing distributed practice: Theoretical analysis and practical implications. Experimental Psychology, 56, 236-246.
Jones, M., Mozer, M. C., & Kinoshita, S. (2009). Optimal response initiation: Why recent experience matters. In D. Koller, D. Schuurmans, Y. Bengio, & L. Bottou (Eds.), Advances in Neural Information Processing Systems 21, 785-792.
Knights, D., Mytkowicz, T., Sweeney, P. F., Mozer, M. C., & Diwan, A. (2009). Blind optimization for exploiting hardware features. In O. de Moor & M. I. Schwartzbach (Eds.), Lecture Notes in Computer Science, v. 5501: Compiler Construction 2009 (pp. 251-265). New York: Springer.
Lee, H., Mozer, M. C., & Vecera, S. P. (2009). Mechanisms of priming of pop-out: Stored representations or feature-gain modulations? Attention, Perception, & Psychophysics, 71, 1059-1071.
Lindsey, R., Mozer, M. C., Cepeda, N. J., & Pashler, H. (2009). Optimizing memory retention with cognitive models. In A. Howes, D. Peebles, & R. Cooper (Eds.), Proceedings of the Ninth International Conference on Cognitive Modeling (ICCM). Manchester, UK.
Mozer, M. C. (2009). Attractor networks. In P. Wilken, A. Cleeremans, & T. Bayne (Eds.), Oxford Companion to Consciousness (pp. 86-89). Oxford U. Press.
Mozer, M. C., & Wilder, M. H. (2009). A unified theory of exogenous and endogenous attentional control. In D. Heinke & E. Mavritsaki (Eds.), Computational modeling in behavioral neuroscience: Closing the gap between neurophysiology and behaviour (pp. 245-265). London: Psychology Press.
Mozer, M. C., Pashler, H., Cepeda, N., Lindsey, R., & Vul, E. (2009). Predicting the optimal spacing of study: A multiscale context model of memory. In Y. Bengio, D. Schuurmans, J. Lafferty, C.K.I. Williams, & A. Culotta (Eds.), Advances in Neural Information Processing Systems 22 (pp. 1321-1329). La Jolla, CA: NIPS Foundation.
Reynolds, J., & Mozer, M. C. (2009). Temporal dynamics of cognitive control. In D. Koller, D. Schuurmans, Y. Bengio, & L. Bottou (Eds.), Advances in Neural Information Processing Systems 21, 1353-1360.

2008

Kinoshita, S., Forster, K. I., & Mozer, M. C. (2008). Unconscious cognition isn't that smart: Modulation of masked repetition priming effect in the word naming task. Cognition, 107, 623-649.
Mozer, M. C., & Baldwin, D. S. (2008). Experience-guided search: A theory of attentional control. In J. Platt, D. Koller, & Y. Singer (Eds.), Advances in Neural Information Processing Systems 20 (pp. 1033-1040). Cambridge, MA: MIT Press.
Mozer, M. C., & Fan, A. (2008). Top-down modulation of neural responses in visual perception: A computational exploration. Natural Computing, 7, 45-55.
Mozer, M. C., Pashler, H., & Homaei, H. (2008). Optimal predictions in everyday cognition: The wisdom of individuals or crowds? Cognitive Science: A Multidisciplinary Journal, 32, 1133-1147.

2007

Bohte, S., & Mozer, M. C. (2007). A computational theory of spike-timing dependent plasticity: Achieving robust neural responses via response variability minimization. Neural Computation, 19, 371-403.
Hochreiter, S., & Mozer, M. C. (2007). Monaural speech separation by support vector machines: Bridging the divide between supervised and unsupervised learning methods. In Conference on Blind Signal Separation.
Mozer, M. C., Jones, M., & Shettel, M. (2007). Context effects in category learning: An investigation of four probabilistic models. Neural Information Processing Systems 19. Cambridge, MA: MIT Press.
Mozer, M. C., Kinoshita, S., & Shettel, M. (2007). Sequential dependencies offer insight into cognitive control. In W. Gray (Ed.), Integrated Models of Cognitive Systems (pp. 180-193). Oxford University Press.
Richardson, S., Otte, M., Mozer, M. C., Diwan, A., Sweeney, P., Connors, D., & Lacovara, K. (2007). Discovering the runtime structure of software with probabilistic generative models.

2006

Baldwin, D., & Mozer, M. C. (2006). Controlling attention with noise: The cue-combination model of visual search. In R. Sun & N. Miyake (Eds.), Proceedings of the Twenty Eighth Annual Conference of the Cognitive Science Society (pp. 42-47). Hillsdale, NJ: Erlbaum Associates.
Kinoshita, S., & Mozer, M. C. (2006). How lexical decision is affected by recent experience: Symmetric versus asymmetric frequency blocking effects. Memory and Cognition, 34, 726-742.
Mozer, M. C., Shettel, M., & Vecera, S. P. (2006). Control of visual attention: A rational account. In Y. Weiss, B. Schoelkopf, & J. Platt (Eds.), Neural Information Processing Systems 18 (pp. 923-930). Cambridge, MA: MIT Press.

2005

Bohte, S., & Mozer, M. C. (2005). Reducing spike train variability: A computational theory of spike-timing dependent plasticity. In L. K. Saul, Y. Weiss, & L. Bottou (Eds.), Advances in Neural Information Processing Systems 17 (pp. 201-208). Cambridge, MA: MIT Press.
Colagrosso, M. D., & Mozer, M. C. (2005). Theories of access consciousness. In L. K. Saul, Y. Weiss, & L. Bottou (Eds.), Advances in Neural Information Processing Systems 17 (pp. 289-296). Cambridge, MA: MIT Press.
Hauswirth, M., Diwan, A., Sweeney, P. F., & Mozer, M. C. (2005). Automated vertical profiling. In 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA'05).
Mozer, M. C. (2005). Lessons from an adaptive house. In D. Cook & R. Das (Eds.), Smart environments: Technologies, protocols, and applications (pp. 273-294). Hoboken, NJ: J. Wiley & Sons.
Mozer, M. C., & Vecera, S. P. (2005). Object- and space-based attention. In L. Itti, G. Rees, & J. Tsotsos (Eds.), The encyclopedia of the neurobiology of attention (pp. 130-134). Elsevier Press.
Mozer, M. C., Mytkowicz, T., & Zemel, R. S. (2005). Stimulus-specific adaptation of neural responses: Insights from neurophysiology and computational models. Poster presented at the Cognitive Neuroscience Conference. New York City, April 2005.

2004

Mozer, M. C., Howe, M., & Pashler, H. (2004). Using testing to enhance learning: A comparison of two hypotheses. Proceedings of the Twenty Sixth Annual Conference of the Cognitive Science Society (pp. 975-980). Hillsdale, NJ: Erlbaum Associates.
Mozer, M. C., Kinoshita, S., & Davis, C. (2004). Control of response initiation: Mechanisms of adaptation to recent experience. Proceedings of the Twenty Sixth Annual Conference of the Cognitive Science Society (pp. 981-986). Hillsdale, NJ: Erlbaum Associates.
Mozer, M. C., Mytkowicz, T., & Zemel, R. S. (2004). Achieving robust neural representations: An account of repetition suppression. Unpublished manuscript.

2003

Mozer, M. C., Colagrosso, M. D., & Huber, D. E. (2003). Mechanisms of long-term repetition priming and skill refinement: A probabilistic pathway model. In Proceedings of the Twenty Fifth Annual Conference of the Cognitive Science Society. Hillsdale, NJ: Erlbaum Associates.
Yan, L., Dodier, R., Mozer, M. C., & Wolniewicz, R. (2003). Optimizing classifier performance via the Wilcoxon-Mann-Whitney statistic. In Proceedings of the International Conference on Machine Learning (ICML) (pp. 848-855).

2002

Mozer, M. C. (2002). Frames of reference in unilateral neglect and spatial attention: A computational perspective. Psychological Review, 109, 156-185.
Mozer, M. C., Colagrosso, M. D., & Huber, D. E. (2002). A rational analysis of cognitive control in a speeded discrimination task. In T. Dietterich, S. Becker, & Z. Ghahramani (Eds.), Advances in Neural Information Processing Systems 14 (pp. 51-57). Cambridge, MA: MIT Press.
Mozer, M. C., Dodier, R., Colagrosso, M. D., Guerra-Salcedo, C., & Wolniewicz, R. (2002). Prodding the ROC curve: Constrained optimization of classifier performance. In T. Dietterich, S. Becker, & Z. Ghahramani (Eds.), Advances in neural information processing systems 14 (pp. 1409-1415). Cambridge, MA: MIT Press.
Pashler, H., Mozer, M. C., & Harris, C. R. (2002). Mating strategies in a Darwinian microworld: Simulating the consequences of female reproductive refractoriness. Adaptive Behavior, 9, 5-15.
Zemel, R. S., Behrmann, M., Mozer, M. C., & Bavelier, D. (2002). Experience-dependent perceptual grouping and object-based attention. Journal of Experimental Psychology: Human Perception and Performance, 28, 202-217.

2001

Grimes, D., & Mozer, M. C. (2001). The interplay of symbolic and subsymbolic processes in anagram problem solving. In T. K. Leen, T. Dietterich, & V. Tresp (Eds.), Advances in Neural Information Processing Systems 13 (pp. 17-23). Cambridge, MA: MIT Press.
Hochreiter, S., & Mozer, M. C. (2001). A discrete probabilistic memory model for discovering dependencies in time. In G. Dorffner, H. Bischof, & K. Hornig (Eds.), Proceedings of the International Conference on Artificial Neural Networks (ICANN). Springer-Verlag.
Hochreiter, S., & Mozer, M. C. (2001). Beyond maximum likelihood and density estimation: A sample-based criterion for unsupervised learning of complex models. In T. K. Leen, T. Dietterich, & V. Tresp (Eds.), Advances in Neural Information Processing Systems 13 (pp. 535-541). Cambridge, MA: MIT Press.
Hochreiter, S., & Mozer, M. C. (2001). Monaural separation and classification of mixed signals: A support-vector regression perspective. Proceedings of the Third International Conference on Independent Component Analysis and Blind Signal Separation, San Diego, CA.
Zemel, R. S., & Mozer, M. C. (2001). Localist attractor networks. Neural Computation, 13, 1045-1064.

2000

Behrmann, M., Zemel, R. S., & Mozer, M. C. (2000). Occlusion, symmetry, and object-based attention: Reply to Saiki (2000). Journal of Experimental Psychology: Human Perception and Performance, 26, 1497-1505.
Lee, S.-Y., & Mozer, M. C. (2000). Robust recognition of noisy and superimposed patterns via selective attention. In S. A. Solla, T. K. Leen & K.-R. Mueller (Eds.), Advances in Neural Information Processing Systems 12 (pp. 31-37). Cambridge, MA: MIT Press.
Mozer, M. C., & the Athene Advanced Technology Group. (2000). Prediction and classification: Pitfalls for the unwary.
Mozer, M. C., Wolniewicz, R., Grimes, D., Johnson, E., & Kaushansky, H. (2000). Predicting subscriber dissatisfaction and improving retention in the wireless telecommunications industry. IEEE Transactions on Neural Networks, 11, 690-696.
Sitton, M., Mozer, M. C., & Farah, M. (2000). Superadditive effects of multiple lesions in a connectionist architecture: Implications for the neuropsychology of optic aphasia. Psychological Review, 107, 709-734.

1999

Alexander, J. A., & Mozer, M. C. (1999). Template-based procedures for neural network interpretation. Neural Networks, 12, 479-498.
Mozer, M. C. (1999). A principle for unsupervised hierarchical decomposition of visual scenes. In M. S. Kearns, S. A. Solla, & D. Cohn (Eds.), Advances in Neural Information Processing Systems 11 (pp. 52-58). Cambridge, MA: MIT Press.
Mozer, M. C. (1999). An intelligent environment must be adaptive. IEEE Intelligent Systems and their Applications, 14(2), 11-13.
O'Reilly, R. C., Mozer, M. C., Munakata, Y., & Miyake, A. (1999). Discrete representations in working memory: A hypothesis and computational investigations. In Proceedings of the Second International Conference on Cognitive Science (pp. 183-188). Tokyo, Japan: Japanese Cognitive Science Society.

1998

Behrmann, M., Zemel, R. S., and Mozer, M. C. (1998). Object-based attention and occlusion: Evidence from normal subjects and a computational model. Journal of Experimental Psychology: Human Perception and Performance, 24, 1011-1036.
Das, S., & Mozer, M. C. (1998). Dynamic on-line clustering and state extraction: An approach to symbolic learning. Neural Networks, 11, 53-64.
Mozer, M. C. (1998). The neural network house: An environment that adapts to its inhabitants. In M. Coen (Ed.), Proceedings of the American Association for Artificial Intelligence Spring Symposium on Intelligent Environments (pp. 110-114). Menlo Park, CA: AAAI Press.
Mozer, M. C., & Miller, D. (1998). Parsing the stream of time: The value of event-based segmentation in a complex, real-world control problem. In C. L. Giles & M. Gori (Eds.), Adaptive processing of temporal sequences and data structures (pp. 370-388). Berlin: Springer Verlag.
Mozer, M. C., & Sitton, M. (1998). Computational modeling of spatial attention. In H. Pashler (Ed.), Attention (pp. 341-393). London: UCL Press.

1997

Calder, B., Grunwald, D., Jones, M., Lindsay, D., Martin, J., Mozer, M., & Zorn, B. (1997). Evidence-based static branch prediction using machine learning. ACM Transactions on Programming Languages and Systems, 19, 188-222.
Mozer, M. C., Halligan, P. W., & Marshall, J. C. (1997). The end of the line for a brain-damaged model of unilateral neglect. Journal of Cognitive Neuroscience, 9, 171-190.
Mozer, M. C., Vidmar, L., & Dodier, R. H. (1997). The Neurothermostat: Predictive optimal control of residential heating systems. In M. C. Mozer, M. I. Jordan, & T. Petsche (Eds.), Advances in Neural Information Processing Systems 9 (pp. 953-959). Cambridge, MA: MIT Press.
Uno, Y., & Mozer, M. C. (1997). Neural net architectures in modeling compositional syntax: Prediction and perception of continuity in minimalist works by Philip Glass and Louis Andriessen. Proceedings of the International Computer Music Conference, Greece.

1996

Mathis, D., & Mozer, M. C. (1996). Conscious and unconscious perception: A computational theory. In G. Cottrell (Ed.), Proceedings of the Eighteenth Annual Conference of The Cognitive Science Society (pp. 324-328). Hillsdale, NJ: Erlbaum.
Mozer, M. C. (1996). Neural network speech processing for toys and consumer electronics. IEEE Expert, 11, 4-5.

1995

Mathis, D. A., & Mozer, M. C. (1995). On the computational utility of consciousness. In G. Tesauro, D. S. Touretzky, & T. K. Leen (Eds.), Advances in Neural Information Processing Systems 7 (pp. 10-18). Cambridge, MA: MIT Press.
Mozer, M. C., Dodier, R. H., Anderson, M., Vidmar, L., Cruickshank III, R. F., & Miller, D. (1995). The neural network house: An overview. In L. Niklasson & M. Boden (Eds.), Current trends in connectionism (pp. 371-380). Hillsdale, NJ: Erlbaum.
Zemel, R. S., Williams, C. K. I., & Mozer, M. C. (1995). Lending direction to neural networks. Neural Networks, 7, 565-579.

1994

Das, S., & Mozer, M. C. (1994). A unified gradient-descent/clustering architecture for finite-state machine induction. In J. D. Cowan, G. Tesauro, & J. Alspector (Eds.), Advances in Neural Information Processing Systems 6 (pp. 19-26). San Mateo, CA: Morgan Kaufmann Publishers.
Dodier, R. H., Lukianow, D., Ries, J., & Mozer, M. C. (1994). A comparison of neural net and conventional techniques for lighting control. Applied Mathematics and Computer Science, 4, 447-462.
Mozer, M. C. (1994). Neural network music composition by prediction: Exploring the benefits of psychophysical constraints and multiscale processing. Connection Science, 6, 247-280. [FOLLOW LINK FOR AUDIO SAMPLES]

1993

Bonnlander, B. V., & Mozer, M. C. (1993). Metamorphosis networks: An alternative to constructive methods. In S. J. Hanson, J. D. Cowan, & C. L. Giles (Eds.), Advances in Neural Information Processing Systems 5 (pp. 131-138). San Mateo, CA: Morgan Kaufmann Publishers.
Mozer, M. C. (1993). Neural network architectures for temporal pattern processing. In A. S. Weigend & N. A. Gershenfeld (Eds.), Time series prediction: Forecasting the future and understanding the past (pp. 243-264). Redwood City, CA: Santa Fe Institute Studies in the Sciences of Complexity, Proceedings Volume XVII, Addison-Wesley Publishing.
Mozer, M. C., & Das, S. (1993). A connectionist symbol manipulator that discovers the structure of context-free languages. In S. J. Hanson, J. D. Cowan, & C. L. Giles (Eds.), Advances in Neural Information Processing Systems 5 (pp. 863-870). San Mateo, CA: Morgan Kaufmann Publishers.
Schmidhuber, J. H., Mozer, M. C., & Prelinger, D. (1993). Continuous history compression. In H. Huening, S. Neuhauser, M. Raus, & W. Ritschel (Eds.), Workshop on Neural Networks (pp. 87-95). Aachen: Augustinus.

1992

McMillan, C., Mozer, M. C., & Smolensky, P. (1992). Rule induction through integrated symbolic and subsymbolic processing. In J. E. Moody, S. J. Hanson, & R. P. Lippmann (Eds.), Advances in neural information processing systems 4 (pp. 969-976). San Mateo, CA: Morgan Kaufmann.
Mozer, M. C. (1992). Induction of multiscale structure. In J. E. Moody, S. J. Hanson, & R. P. Lippmann (Eds.), Advances in neural information processing systems 4 (pp. 275-282). San Mateo, CA: Morgan Kaufmann.
Mozer, M. C., & Behrmann, M. (1992). Reading with attentional impairments: A brain-damaged model of neglect and attentional dyslexias. In R. G. Reilly & N. E. Sharkey (Eds.), Connectionist Approaches to Language Processing (pp. 409-460). Hillsdale, NJ: Erlbaum.
Mozer, M. C., Zemel, R. S., Behrmann, M., & Williams, C. K. I. (1992). Learning to segment images using dynamic feature binding. Neural Computation, 4, 650-665.

1991

McMillan, C., Mozer, M. C., & Smolensky, P. (1991). The connectionist scientist game: Rule extraction and refinement in a neural network. Proceedings of the Thirteenth Annual Conference of the Cognitive Science Society (pp. 424-430). Hillsdale, NJ: Erlbaum.
Mozer, M. C. (1991). Discovering discrete distributed representations with iterative competitive learning. In R. P. Lippmann, J. Moody, & D. S. Touretzky (Eds.), Advances in neural information processing systems 3 (pp. 627-634). San Mateo, CA: Morgan Kaufmann.

1990

Mozer, M. C. (1990). The perception of multiple objects: A connectionist approach. Cambridge, MA: MIT Press.
Mozer, M. C. (1990). Discovering faithful Wickelfeature representations in a connectionist network. Proceedings of the Twelfth Annual Conference of the Cognitive Science Society (pp. 356-363). Hillsdale, NJ: Erlbaum.
Mozer, M. C., & Behrmann, M. (1990). On the interaction of selective attention and lexical knowledge: A connectionist account of neglect dyslexia. Journal of Cognitive Neuroscience, 2, 96-123.
Zemel, R. S., Mozer, M. C., & Hinton, G. E. (1990). TRAFFIC: Object recognition using hierarchical reference frame transformations. In D. Touretzky (Ed.), Advances in neural information processing systems 2 (pp. 266-273). San Mateo, CA: Morgan Kaufmann.

1989

Mozer, M. C. (1989). A focused backpropagation algorithm for temporal pattern recognition. Complex Systems, 3, 349-381.
Mozer, M. C. (1989). Types and tokens in visual word perception. Journal of Experimental Psychology: Human Perception and Performance, 15, 287-303.
Mozer, M. C., & Smolensky, P. (1989). Using relevance to reduce network size automatically. Connection Science, 1, 3-16.

1988

Mozer, M. C. (1988). A connectionist model of selective attention in visual perception. In V. L. Patel & G. J. Groen (Eds.), Proceedings of the Tenth Annual Conference of the Cognitive Science Society (pp. 195-201). Hillsdale, NJ: Erlbaum Associates.

1987

Mozer, M. C. (1987). RAMBOT: A connectionist expert system that learns by example. In Proceedings of the IEEE First Annual Conference on Neural Networks (pp. 693-700). San Diego: IEEE Publishing Services.

1986

McClelland, J. L., & Mozer, M. C. (1986). Perceptual interactions in two-word displays: Familiarity and similarity effects. Journal of Experimental Psychology: Human Perception and Performance, 12, 18-35.
Mozer, M. C., & Gross, K. P. (1986). An architecture for experiential learning. In T. M. Mitchell, J. G. Carbonell, & R. S. Michalski (Eds.), Machine learning: A guide to current research (pp. 219-226). Boston: Kluwer Academic.

1983

Mozer, M. C. (1983). Letter migration in word perception. Journal of Experimental Psychology: Human Perception and Performance, 9, 531-546.

Michael C. Mozer

Professor

Department of Computer Science and

Institute of Cognitive Science

University of Colorado, Boulder

Professor Mozer has moved to Google DeepMind. He maintains an affiliation with the University but he will no longer be available for University committees and service.
