Probabilistic Models of

Human and Machine Intelligence

Part I Due Tue March 20, 2018

Part II Due Thu March 22, 2018

The goal of
this assignment is to introduce you to probabilistic
programming languages. These languages allow you to
specify graphical models with random variables and to perform
inference by sampling.
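To make the idea concrete, here is a minimal sketch in plain Python (not any particular probabilistic programming language) of what these languages automate: define a generative model as a sampler, then answer a conditional query by keeping only the samples consistent with the evidence. The model and its probabilities are made up for illustration.

```python
import random

# Toy generative model (made-up probabilities): rain makes the grass wet.
def model(rng):
    rain = rng.random() < 0.2            # P(rain) = 0.2
    p_wet = 0.9 if rain else 0.1         # P(wet | rain), P(wet | no rain)
    wet = rng.random() < p_wet
    return rain, wet

rng = random.Random(0)
samples = [model(rng) for _ in range(100_000)]

# Condition on the evidence (wet grass) by rejection: keep only the
# consistent samples, then read off the query variable.
kept = [rain for rain, wet in samples if wet]
print(sum(kept) / len(kept))             # ≈ 0.69 (exactly 0.18 / 0.26)
```

A real probabilistic programming language lets you write just the model and hands the conditioning and sampling machinery to the inference engine.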

I want you to investigate probabilistic
programming languages and identify one that you want to work
with. Install the software, and run through one or more
tutorial examples to convince yourself that you understand
basically how the language works. I list a bunch of options on
the course home page, and there are even more at probabilistic-programming.org.


*For Part I, there is nothing to hand in.*

I have no expertise in these languages, but after spending a few days looking at the options, there are five I'd suggest investigating further. The first three are likely to be the most valuable in the future, because they allow for the integration of Bayesian methods and neural networks; the field is definitely headed in this direction. Each of these three languages is built on top of a gradient-based optimization library with efficient GPU operations for multidimensional arrays. The five languages I'd recommend, roughly ordered from strongest to weakest recommendation, are:

- PyMC3: I really like this language. It seems fairly intuitive, and it is easy to translate models into code. Like pyro and Edward, it is built on top of an optimization library, in this case theano. Unfortunately, theano is no longer being developed. Fortunately, there is a promise that PyMC3 will be incorporated into tensorflow and torch.

- TensorFlow Probability: Brand-new library within the TensorFlow ecosystem. May be too immature for use today, but wait a few months.
- Edward (http://edwardlib.org): Robust language with well-written documentation, built on top of tensorflow. If you know tensorflow, Edward is the way to go. Tensorflow will have a long life, so it's probably worth learning. Here is the manual.
- pyro: This is a relatively new language that was released by Uber. It sits on top of pytorch. Pyro has a well-written tutorial, though the language doesn't seem as intuitive to me as Edward. The advantage is that pytorch is more natural than tensorflow.

- OpenBUGS: a very popular language before torch/tensorflow/theano came along. The BUGS code is quite readable and maps closely to the notation we've used in class. However, the documentation is not nearly as well put together as the documentation for Edward and PyMC3. Also, BUGS is missing the hooks to neural nets that Edward, pyro, and PyMC3 have.

- Stan: a very general-purpose statistical modeling language that interfaces well with python, as well as many other data-analysis languages. Because it is so general purpose, it is also the most intricate and extensive language; the documentation runs 600+ pages.

For Part II, perform either exact or approximate inference to obtain answers to Part III of Assignment 4. You solved this inference problem exactly, and the answers should be P(G_{1}=2|X_{2}=50) = 0.1054 and P(X_{3}=50|X_{2}=50) = 0.1024.

If you're going to use Edward, I wasn't able to get any of the sampling-based inference procedures (Metropolis-Hastings, Gibbs, hybrid Monte Carlo) to work on discrete RVs; however, KLpq does seem to find a solution, as long as you include the argument n_samples=100 or larger. Because there aren't any good examples of discrete RVs in Edward, we found this implementation of the sprinkler/rain graphical model to be helpful. Read the description of KLpq carefully: it does a search over Gaussian RVs, so you need to constrain the variable if you want it to be nonnegative or binary. We also found that for estimating P(X_{3}=50|X_{2}=50), the distribution needs to be initialized to be in the right neighborhood.
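The sprinkler/rain network is small enough that you can sanity-check whatever your PPL reports against a brute-force estimate. Here is a sketch in plain Python using rejection sampling, with the CPT values common in textbook versions of this network (the values in the implementation you use may differ):

```python
import random

# Sprinkler/rain network: Cloudy -> {Sprinkler, Rain} -> WetGrass,
# with the CPT values common in textbook versions (yours may differ).
def sample_net(rng):
    cloudy = rng.random() < 0.5
    sprinkler = rng.random() < (0.1 if cloudy else 0.5)
    rain = rng.random() < (0.8 if cloudy else 0.2)
    p_wet = {(True, True): 0.99, (True, False): 0.90,
             (False, True): 0.90, (False, False): 0.00}[(sprinkler, rain)]
    wet = rng.random() < p_wet
    return cloudy, sprinkler, rain, wet

# Estimate P(Rain = true | WetGrass = true) by rejection sampling:
# discard every sample in which the grass is not wet.
rng = random.Random(0)
kept = [rain for _, _, rain, wet in
        (sample_net(rng) for _ in range(200_000)) if wet]
print(sum(kept) / len(kept))   # ≈ 0.708 for these CPT values
```

Rejection sampling is wasteful but trivially correct on discrete RVs, which makes it a useful reference point when a fancier inference procedure misbehaves.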

If you get really stuck and can't get this example to run, implement the burglar alarm network from class and show some inference results. The burglar alarm should be a straightforward extension of the sprinkler/rain net. We will give a max of 80% credit for this model.
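If you go the burglar alarm route, it helps to have the exact answer available to check your sampler against. The network is small enough for inference by enumeration; here is a sketch in plain Python using the CPT values from the standard textbook version of the network (Russell and Norvig); substitute whatever numbers were given in class.

```python
from itertools import product

# Burglar alarm network: {Burglary, Earthquake} -> Alarm -> {JohnCalls,
# MaryCalls}, with CPT values from the standard textbook version
# (Russell & Norvig); substitute the numbers given in class.
def joint(b, e, a, j, m):
    p = (0.001 if b else 0.999) * (0.002 if e else 0.998)
    p_a = {(True, True): 0.95, (True, False): 0.94,
           (False, True): 0.29, (False, False): 0.001}[(b, e)]
    p *= p_a if a else 1.0 - p_a
    p_j = 0.90 if a else 0.05
    p *= p_j if j else 1.0 - p_j
    p_m = 0.70 if a else 0.01
    p *= p_m if m else 1.0 - p_m
    return p

# Exact inference by enumeration: P(Burglary | JohnCalls, MaryCalls),
# summing out the hidden variables Earthquake and Alarm.
def query_burglary(j=True, m=True):
    num = sum(joint(True, e, a, j, m)
              for e, a in product([True, False], repeat=2))
    den = sum(joint(b, e, a, j, m)
              for b, e, a in product([True, False], repeat=3))
    return num / den

print(round(query_burglary(), 4))   # 0.2842 with these CPTs
```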

As I mentioned on Piazza, one student has had success with PyMC3 and the code produced was quite sensible and readable.

For Part II, we would like you to hand in your code, and the runs that produce the two answers.