Neural Networks and Deep Learning

Fall 2017

Due Sep 12

The goal of this assignment is to introduce
neural networks in terms of ideas you are already familiar with:
linear regression and linear-threshold classification.

Consider the following table that describes a
relationship between two input variables (*x*_{1},*x*_{2})
and an output variable (*y*).

The table below is part of a larger data set that I created, which you can download in either MATLAB or text format. Using your favorite language, find the least-squares solution to *y* = *w*_{1} * *x*_{1} + *w*_{2} * *x*_{2} + *b*.

(1a) Report the values of *w*_{1}, *w*_{2}, and *b*.

(1b) What function or method did you use to find the least-squares solution?

| *x*_{1} | *x*_{2} | *y* |
|---------|---------|-----|
| .1227 | .2990 | +0.1825 |
| .3914 | .6392 | +0.8882 |
| .7725 | .0826 | -1.9521 |
| .8342 | .0823 | -1.9328 |
| .5084 | .8025 | +1.2246 |
| .9983 | .7404 | -0.0631 |

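One minimal way to carry this out, sketched in Python with NumPy's `np.linalg.lstsq`; the six table rows stand in for the full downloaded data set:

```python
import numpy as np

# Six rows from the table above; substitute the full downloaded
# data set in practice.
X = np.array([[0.1227, 0.2990],
              [0.3914, 0.6392],
              [0.7725, 0.0826],
              [0.8342, 0.0823],
              [0.5084, 0.8025],
              [0.9983, 0.7404]])
y = np.array([0.1825, 0.8882, -1.9521, -1.9328, 1.2246, -0.0631])

# Append a column of ones so the bias b is fit as a third weight.
A = np.hstack([X, np.ones((len(X), 1))])
(w1, w2, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(w1, w2, b)
```

Appending a column of ones folds the bias *b* into the same linear system, so a single `lstsq` call returns all three coefficients at once.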

Using the LMS algorithm, write a program that
determines the coefficients {w_{1},w_{2},b} via
incremental updating, steepest descent, and multiple passes
through the training data. You will need to experiment with
updating rules (online, batch, minibatch), step sizes (i.e.,
learning rates), stopping criteria, etc. Experiment to find
settings that reach a solution in the fewest sweeps
through the training data.

(2a) Report the values of *w*_{1}, *w*_{2}, and *b*.

(2b) What settings worked well for you: online vs. batch vs. minibatch? what step size? how did you decide to terminate?

(2c) Make a graph of error on the entire data set as a function of epoch. An epoch is a complete sweep through all the data.

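An online (per-example) LMS pass over the same six table rows might be sketched as follows; the step size of 0.1 and the 200-epoch budget are illustrative assumptions, not the tuned settings the questions ask you to discover:

```python
import numpy as np

# Same six rows as the table; substitute the full data set in practice.
X = np.array([[0.1227, 0.2990],
              [0.3914, 0.6392],
              [0.7725, 0.0826],
              [0.8342, 0.0823],
              [0.5084, 0.8025],
              [0.9983, 0.7404]])
y = np.array([0.1825, 0.8882, -1.9521, -1.9328, 1.2246, -0.0631])

w = np.zeros(2)
b = 0.0
lr = 0.1              # illustrative step size
errors = []           # SSE on the full set per epoch, for the plot in (2c)
for epoch in range(200):
    for xi, yi in zip(X, y):
        err = yi - (w @ xi + b)
        w += lr * err * xi   # steepest-descent step on (yi - pred)^2
        b += lr * err
    errors.append(float(np.sum((X @ w + b - y) ** 2)))
print(w, b, errors[-1])
```

Switching to batch or minibatch updating only changes how many examples contribute to each step; the `errors` list is what the graph in (2c) plots against epoch.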

Turn this data set from a regression problem
into a classification problem simply by using the sign of *y*
(+ or -) as representing one of two classes. In the data set you
download, you'll see a variable *z* that represents this
binary (0 or 1) class. Use the perceptron learning rule to
solve for the coefficients {*w*_{1}, *w*_{2},
*b*} of this classification problem.

Two warnings: First, your solution to Part 3 should require only a few lines of code added to the code you wrote for Part 2. Second, the Perceptron algorithm will not converge if there is no exact solution to the training data. It will jitter among coefficients that all yield roughly equally good solutions.

(3a) Report the values of coefficients *w*_{1}, *w*_{2}, and *b*.

(3b) Make a graph of the accuracy (% correct classification) on the training set as a function of epoch.

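The change from Part 2 really is small: threshold the prediction and update only on mistakes. A sketch, again with an illustrative step size and epoch budget, and with the six table rows standing in for the full data set:

```python
import numpy as np

# Same six rows as the table; the class z is the sign of y (1 if positive).
X = np.array([[0.1227, 0.2990],
              [0.3914, 0.6392],
              [0.7725, 0.0826],
              [0.8342, 0.0823],
              [0.5084, 0.8025],
              [0.9983, 0.7404]])
y = np.array([0.1825, 0.8882, -1.9521, -1.9328, 1.2246, -0.0631])
z = (y > 0).astype(int)   # the binary variable z from the data set

w = np.zeros(2)
b = 0.0
lr = 0.1
acc = []   # training accuracy per epoch, for the plot in (3b)
for epoch in range(100):
    for xi, zi in zip(X, z):
        pred = int(w @ xi + b > 0)
        # (zi - pred) is 0 on a correct prediction, so only errors update.
        w += lr * (zi - pred) * xi
        b += lr * (zi - pred)
    acc.append(float(np.mean((X @ w + b > 0).astype(int) == z)))
print(w, b, acc[-1])
```

On the full data set the warning above applies: if the classes are not exactly separable, the coefficients will keep jittering, which is why this sketch runs a fixed epoch budget rather than waiting for convergence.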

In machine learning, the real goal is to train
a model on some data and have it perform well
on "out of sample" data. Try this with the code you wrote for
Part 3: Train the model on the first {5, 10, 25, 50, 75}
examples in the data set and test the model on the final 25
examples.

(4a) How does performance on the test set vary with the amount of training data? Make a bar graph showing performance for each of the different training set sizes.

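The protocol can be sketched as below. Since the full downloaded data set isn't reproduced here, 100 synthetic points labeled by the sign of *x*_{2} - *x*_{1} stand in for it; the part that carries over to your assignment is the loop over training-set sizes with a fixed held-out test set.

```python
import numpy as np

# Synthetic stand-in for the downloaded data set: 100 points in the
# unit square, class 1 when x2 > x1. Replace X and z with the real data.
rng = np.random.default_rng(0)
X = rng.random((100, 2))
z = (X[:, 1] > X[:, 0]).astype(int)

def train_perceptron(X_tr, z_tr, epochs=100, lr=0.1):
    """Perceptron learning rule, as in Part 3."""
    w = np.zeros(2)
    b = 0.0
    for _ in range(epochs):
        for xi, zi in zip(X_tr, z_tr):
            pred = int(w @ xi + b > 0)
            w += lr * (zi - pred) * xi
            b += lr * (zi - pred)
    return w, b

# Fixed test set: the final 25 examples.
X_te, z_te = X[-25:], z[-25:]
accs = []
for n in (5, 10, 25, 50, 75):
    w, b = train_perceptron(X[:n], z[:n])
    accs.append(float(np.mean((X_te @ w + b > 0).astype(int) == z_te)))
print(dict(zip((5, 10, 25, 50, 75), accs)))
```

The `accs` list is what the bar graph in (4a) plots, one bar per training-set size.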