Syllabus
Probabilistic Models of
Human and Machine Intelligence

CSCI 7222
Fall 2013

Tu, Th 14:00-15:15
ECCR 151

Instructor

Professor Michael Mozer
Department of Computer Science
Engineering Center Office Tower 741
(303) 492-4103
Office Hours:  Tu 15:30-16:30, Th 13:00-13:45

Course Objectives

A new paradigm has emerged in cognitive science and artificial intelligence that views the mind as a computer extraordinarily tuned to the statistics of the environment in which it operates, and views learning and adaptation in terms of changes to these statistics over time. The goal of the course is to understand recent theoretical advances in cognitive science and artificial intelligence that take this statistical and probabilistic perspective.

One virtue of probabilistic models is that they straddle the gap between cognitive science, artificial intelligence, and machine learning. The same methodology is useful both for understanding the brain and for building intelligent computer systems.  Indeed, for much of the research we'll discuss, the models contribute both to machine learning and to cognitive science.  Whether your primary interest is in engineering applications of machine learning or in cognitive modeling, you'll see that there's a lot of interplay between the two fields.

The course participants are likely to be a diverse group of students, some with primarily an engineering/CS focus and others primarily interested in cognitive modeling (building computer simulations and mathematical models to explain human perception, thought, and learning).

Prerequisites

The course is open to any students who have some background in cognitive science or artificial intelligence and who have taken an introductory probability/statistics course.  If your background in probability/statistics is weak, you'll have to do some catching up with the text.

Course Readings

We will be using a text by David Barber (Bayesian Reasoning and Machine Learning, Cambridge University Press, 2012). The author has made available an electronic version of the text. Note that the electronic version is a 2013 revision.

For additional references, Wikipedia is often a useful resource.  The pages on various probability distributions are great references. If you want additional reading, I recommend the following texts:
We will also be reading research articles from the literature, which can be downloaded from the links on the class-by-class syllabus below.

Course Discussions

We will use Piazza for class discussion.  Rather than emailing me, I encourage you to post your questions on Piazza. This is my first experience with Piazza but I will strive to respond quickly. If I do not, please email me personally.  The Piazza class page is: https://piazza.com/colorado/fall2013/csci7222/home

Course Requirements

Readings

In the style of graduate seminars, you will be responsible for reading chapters from the text and research articles before class, and for coming to class prepared to discuss the material (asking clarification questions, working through the math, relating papers to each other, critiquing the papers, and presenting original ideas related to the papers).

Homework Assignments

We can all delude ourselves into believing we understand some math or algorithm by reading about it, but implementing and experimenting with the algorithm is both fun and valuable for obtaining a true understanding.  Students will implement small-scale versions of as many of the models we discuss as possible.  I will give about 10 homework assignments involving implementation over the semester; details to be determined. My preference is for you to work in matlab, both because you can leverage software available with the Barber text, and because matlab has become the de facto workhorse in machine learning.  For one or two assignments, I'll ask you to write a one-page commentary on a research article.
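To give a sense of the scale of these implementation exercises, here is a minimal sketch, written in MATLAB/Octave, of the kind of computation we cover in the first weeks: applying Bayes rule to a toy two-hypothesis problem. The scenario and all numbers are invented purely for illustration; this is not an actual assignment.

    % Toy Bayes rule computation: posterior over two hypotheses given one
    % observed piece of evidence. All numbers are made up for illustration.
    prior      = [0.7 0.3];           % P(H = h1), P(H = h2)
    likelihood = [0.2 0.9];           % P(E = e | H = h1), P(E = e | H = h2)
    joint      = prior .* likelihood; % P(H = h, E = e)
    posterior  = joint / sum(joint)   % P(H | E = e); roughly [0.34 0.66]

The real assignments will be larger in scope, but the spirit is the same: translate the math we discuss in class directly into a few lines of code and experiment with it.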

Semester Grades

Semester grades will be based 5% on class attendance and participation and 95% on the homework assignments.  I will weight the assignments in proportion to their difficulty, in the range of 5% to 10% of the course grade.  Students with backgrounds in the area and specific expertise may wish to do in-class presentations for extra credit.

Class-By-Class Plan and Course Readings

The greyed out portion of this schedule is tentative and will be adjusted as the semester goes on. I may adjust assignments, assignment dates, and lecture topics based on the class's interests.

Each entry below lists the date, topic, readings (section numbers refer to Barber), lecture notes, and assignments.

Aug 27: introductory meeting. Reading: 29.1 (Appendix in hardcopy edition), 13.1-13.3; Chater, Tenenbaum, & Yuille (2006). Lecture notes. Assignment 0.
Aug 29: basic probability, Bayes rule. Reading: 1.1-1.4; Griffiths & Yuille (2006). Lecture notes.
Sep 3: continuous distributions. Reading: 8.1-8.3. Lecture notes.
Sep 5: concept learning, Bayesian Occam's razor. Reading: 12.1-12.3 (requires a bit of probability we haven't talked about, so don't sweat the details); Tenenbaum (1999); Jefferys & Berger (1991). Lecture notes. Assignment 1.
Sep 10: Gaussians. Reading: 8.4-8.7; useful reference: Murphy (2007). Lecture notes.
Sep 12: UNIVERSITY CLOSED: STAY DRY.
Sep 17: motion illusions as optimal percepts. Reading: Weiss, Simoncelli, & Adelson (2002). Demos: motion demo 1, motion demo 2. Lecture notes. Assignment 2.
Sep 19: Bayesian statistics (conjugate priors, hierarchical Bayes). Reading: 9.1. Lecture notes.
Sep 24: Bayes nets: Representation. Reading: 2.1-2.3, 3.1-3.5; Cowell (1999); Jordan & Weiss (2002); 4.1-4.6. Lecture notes. Assignment 3.
Sep 26: Bayes nets: Exact inference. Reading: 5.1-5.5; Huang & Darwiche (1994). Lecture notes.
Oct 1: Assignment 4.
Oct 3: Bayes nets: Approximate inference. Reading: 27.1-27.6; Andrieu et al. (2003). Lecture notes.
Oct 8
Oct 10: Learning I: Parameter learning. Reading: 9.2-9.4; Heckerman (1995); 9.5. Lecture notes. Assignment 5.
Oct 15: Learning II: Missing data, latent variables, EM, GMM. Reading: 11.1-11.5, 20.2-20.3. Lecture notes.
Oct 17: text mining: latent Dirichlet allocation. Reading: 20.6; Griffiths, Steyvers, & Tenenbaum (2007); Blei, Ng, & Jordan (2003); video tutorial on Dirichlet processes by Teh, or Teh's introductory paper. Lecture notes.
Oct 22: text mining: inferring social networks. Reading: McCallum, Corrada-Emmanuel, & Wang (2005). Lecture notes. Assignment 6.
Oct 24: text mining: nonparametric Bayes. Reading: Orbanz & Teh (2010). Lecture notes.
Oct 29: text mining: hierarchical models. Reading: Teh (2006). Lecture notes.
Oct 31: catch-up day.
Nov 5: sequential models: hidden Markov models. Reading: 23.1-23.3; Ghahramani (2001). Lecture notes. Assignment 7.
Nov 7: sequential models: conditional random fields. Reading: 23.4-23.5; Sutton & McCallum; Mozer et al. (2010); Lafferty, McCallum, & Pereira (2001). Lecture notes.
Nov 12: final project. Reading: 21.1-21.2, 22.1-22.2. Lecture notes. Assignments 8 and 9.
Nov 14: sequential models: sequential dependencies (Matt Wilder, guest lecturer). Reading: Yu & Cohen (2009); Wilder, Jones, & Mozer (2010).
Nov 19: sequential models: exact and approximate inference (particle filters, changepoint detection) [Janeen presents]. Reading: 27.6; Adams & MacKay (2008). Lecture notes (ppt, pdf).
Nov 21: sequential models: Kalman filters [Ian, David, and Matt present]. Reading: 24.1-24.4; Koerding, Tenenbaum, & Shadmehr (2007); 24.5. Lecture notes.
Dec 3: Gaussian processes. Reading: 19.1-19.5. Lecture notes (parts 1 and 2).
Dec 5: vision/attention: search [Arafat presents]. Reading: Mozer & Baldwin (2008); Najemnik & Geisler (2005); supplemental material for Najemnik & Geisler. Lecture notes.
Dec 10: NO CLASS [Mozer at NIPS conference].
Dec 12: Deep learning, part 1. Lecture notes (pptx).
Dec 14, 13:30-16:00: final project presentations.

Queue


Poon & Domingos (2011) Sum-Product Networks: A new deep architecture.

Gens & Domingos (2012). Discriminative learning of sum-product networks.

Ullman, T.D., Baker, C.L., Macindoe, O., Evans, O., Goodman, N.D., & Tenenbaum, J.B. (2010). Help or hinder: Bayesian models of social goal inference. Advances in Neural Information Processing Systems (Vol. 22, pp. 1874-1882).

Baker, C.L., Saxe, R., & Tenenbaum, J.B. (2009). Action Understanding as Inverse Planning. Cognition, 113, 329-349. [Supplementary material].

Kemp & Tenenbaum (2008). The discovery of structural form. PNAS.

Welinder, Branson, Belongie, & Perona. The Multidimensional Wisdom of Crowds.


Steyvers, Lee, Miller, & Hemmer (2009). The Wisdom of Crowds in the Recollection of Order Information.

Interesting Links

Tutorials

Modeling tools

UCI Topic modeling toolbox (requires 32-bit matlab)
Mallet (machine learning for language; Java-based implementation of topic modeling)
Mahout (Java API that does topic modeling)
C implementation of topic models
Windows executable of the C implementation (runs from the command line)
Stanford Topic Modeling Toolkit
UCLA's SamIam
Murphy's probabilistic modeling toolbox
BUGS
OpenBayes
Orange
Bayesian reasoning and machine learning software in matlab (associated with David Barber's book)
Chris DeHoust's comments on software

Additional information for students