Neural Networks and Deep Learning

Spring 2017

Due: Thu Oct 12, 2017

The goal of this assignment is to use
tensorflow to build some neural networks, and to experiment with
the options and flexibility that tensorflow offers.

For Part 1 of the assignment, you will use
the room occupancy data set that you used for Assignment 2.

For Part 2 of the assignment, you will use the data set available here from me. This data set consists of 104 relationships described by the Hinton Family Trees network. Here is some very embarrassing python code I wrote to read in the data set and build an input/output representation that can be fed to tensorflow.

Use tensorflow to build a feedforward neural
net to predict occupancy. This net should do exactly the same
thing that your code in Part 2 of Assignment 2 does.

(1a) Run a simulation using tensorflow that is identical to Assignment 2, part 2f, in which you vary the number of hidden units and make a plot. Superimpose the plot you made from Assignment 2, part 2f.

(1b) Discuss the results: are they the same as with your own code? If one works better than the other, explain why you think that is.

(1c) Add a second hidden layer, and train a few architectures with 2 hidden layers. Report what architectures you tried (expressed as 5-*h1*-*h2*-1, i.e., 5 input units, *h1* hidden units in the first layer, *h2* hidden units in the second layer, and one output unit), and which ones, if any, outperform your single-hidden-layer network.

It will not be a big deal to add the second hidden layer once you have all the rest of your code in place.
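
The 5-*h1*-*h2*-1 notation from (1c) maps directly onto a list of layer sizes. As a minimal sketch, the hypothetical helpers below (they are not part of the assignment code) turn such a string into weight-matrix shapes and a trainable parameter count, assuming fully connected layers with one bias per non-input unit. Something like this may be handy for labeling the architectures you report.

```python
def parse_architecture(spec):
    """Turn a string such as "5-10-4-1" into a list of layer sizes."""
    sizes = [int(tok) for tok in spec.split("-")]
    if len(sizes) < 2 or any(n <= 0 for n in sizes):
        raise ValueError("expected at least two positive layer sizes: %r" % spec)
    return sizes


def weight_shapes(sizes):
    """Shapes (fan_in, fan_out) of the weight matrix between consecutive layers."""
    return list(zip(sizes[:-1], sizes[1:]))


def count_parameters(sizes):
    """Weights plus one bias per non-input unit (assumes dense layers)."""
    return sum(n_in * n_out + n_out for n_in, n_out in weight_shapes(sizes))


if __name__ == "__main__":
    # The hidden-layer sizes here are arbitrary illustrations.
    for spec in ["5-10-1", "5-10-4-1"]:
        sizes = parse_architecture(spec)
        print(spec, weight_shapes(sizes), count_parameters(sizes))
```

The same list of sizes can drive a loop that creates one tensorflow layer per entry, so moving from one to two hidden layers only changes the string.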

Replicate the Hinton Family Trees
architecture in tensorflow. My notes describe the
architecture and number of neurons in each layer. You can also
refer to the original
paper.

(2a) Randomly split the data set into 89 examples for training and 15 for testing. Train and evaluate 20 such random splits of the data and report the mean and standard deviation of the test set accuracy. (Report accuracy, not squared error. A response should be counted as correct if the most active unit is the target person2.)

(2b) Train a network on all 104 examples and examine the weights from the one-hot person1 input representation to the distributed person1 representation. Figure out a sensible way to graphically display the weights in the 6 hidden units.

(2c) For at least 2 of the hidden units, interpret what the network has learned in its mapping from inputs. You'll have to refer to my create_dataset.py code to determine the interpretation of the person1 input neurons. (I tried to keep the same ordering as Hinton uses.)
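
The evaluation protocol in (2a) can be sketched independently of the network itself. The code below is a minimal sketch, assuming the data are held in NumPy arrays with one row per example and that `train_and_predict` is a hypothetical callable you supply: it trains on the training rows and returns output activations for the test rows. None of these names come from create_dataset.py. The sketch shows the 89/15 split, the most-active-unit accuracy rule, and the mean and standard deviation over 20 splits.

```python
import numpy as np


def argmax_accuracy(outputs, targets):
    """Fraction of rows whose most active output unit is the target unit.

    outputs: (n, n_units) activations; targets: (n, n_units) one-hot rows.
    """
    hits = np.argmax(outputs, axis=1) == np.argmax(targets, axis=1)
    return float(np.mean(hits))


def evaluate_random_splits(inputs, targets, train_and_predict,
                           n_splits=20, n_test=15, seed=0):
    """Return (mean, std) of test accuracy over repeated random splits."""
    rng = np.random.default_rng(seed)
    n = inputs.shape[0]
    accuracies = []
    for _ in range(n_splits):
        order = rng.permutation(n)  # fresh shuffle for every split
        test_idx, train_idx = order[:n_test], order[n_test:]
        # Hypothetical user-supplied function: train, then predict on test rows.
        predictions = train_and_predict(inputs[train_idx], targets[train_idx],
                                        inputs[test_idx])
        accuracies.append(argmax_accuracy(predictions, targets[test_idx]))
    return float(np.mean(accuracies)), float(np.std(accuracies))
```

With 104 examples and `n_test=15`, each split trains on the remaining 89. Note that `np.std` computes the population standard deviation by default; pass `ddof=1` if you prefer the sample estimate, and say which one you report.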

Compare the architecture Hinton describes for
Family Trees with a generic feedforward architecture with one
hidden layer consisting of 12 neurons.

(3a) Conduct an experiment like the one in (2a) using the generic architecture. Report the mean and standard deviation of the test set accuracy.

(3b) Do you see any difference in performance between the structured net (2a) and the generic net (3a)?

(3c) It wasn't difficult to interpret at least some of the hidden units in the structured net. Can you interpret what any of the hidden units are doing for the generic net? Explain.
