Neural Networks and Deep Learning

Fall 2017

Due Sep 12

The goal of this assignment is to introduce
neural networks in terms of ideas you are already familiar with:
linear regression and linear-threshold classification.

Consider the following table that describes a
relationship between two input variables (*x*_{1},*x*_{2})
and an output variable (*y*).

The table below is part of a larger data set that I created, which you can download in either MATLAB or text format. Using your favorite language, find the least-squares solution to *y* = *w*_{1} * *x*_{1} + *w*_{2} * *x*_{2} + *b*.

(1a) Report the values of *w*_{1}, *w*_{2}, and *b*.

(1b) What function or method did you use to find the least-squares solution?

| *x*_{1} | *x*_{2} | *y* |
|---------|---------|-----|
| .1227 | .2990 | +0.1825 |
| .3914 | .6392 | +0.8882 |
| .7725 | .0826 | -1.9521 |
| .8342 | .0823 | -1.9328 |
| .5084 | .8025 | +1.2246 |
| .9983 | .7404 | -0.0631 |

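One minimal way to carry this out, sketched in Python with NumPy's `np.linalg.lstsq`; the six table rows stand in for the full downloaded data set:

```python
import numpy as np

# Six rows from the table above; substitute the full downloaded
# data set in practice.
X = np.array([[0.1227, 0.2990],
              [0.3914, 0.6392],
              [0.7725, 0.0826],
              [0.8342, 0.0823],
              [0.5084, 0.8025],
              [0.9983, 0.7404]])
y = np.array([0.1825, 0.8882, -1.9521, -1.9328, 1.2246, -0.0631])

# Append a column of ones so the bias b is fit as a third weight.
A = np.hstack([X, np.ones((len(X), 1))])
(w1, w2, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(w1, w2, b)
```

Appending a column of ones folds the bias *b* into the same linear system, so a single `lstsq` call returns all three coefficients at once.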

Using the LMS algorithm, write a program that
determines the coefficients {w_{1},w_{2},b} via
incremental updating, steepest descent, and multiple passes
through the training data. You will need to experiment with
updating rules (online, batch, minibatch), step sizes (i.e.,
learning rates), stopping criteria, etc. Experiment to find
settings that reach a solution in the fewest sweeps
through the training data.

(2a) Report the values of *w*_{1}, *w*_{2}, and *b*.

(2b) What settings worked well for you: online vs. batch vs. minibatch? what step size? how did you decide to terminate?

(2c) Make a graph of error on the entire data set as a function of epoch. An epoch is a complete sweep through all the data.

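An online (per-example) LMS pass over the same six table rows might be sketched as follows; the step size of 0.1 and the 200-epoch budget are illustrative assumptions, not the tuned settings the questions ask you to discover:

```python
import numpy as np

# Same six rows as the table; substitute the full data set in practice.
X = np.array([[0.1227, 0.2990],
              [0.3914, 0.6392],
              [0.7725, 0.0826],
              [0.8342, 0.0823],
              [0.5084, 0.8025],
              [0.9983, 0.7404]])
y = np.array([0.1825, 0.8882, -1.9521, -1.9328, 1.2246, -0.0631])

w = np.zeros(2)
b = 0.0
lr = 0.1              # illustrative step size
errors = []           # SSE on the full set per epoch, for the plot in (2c)
for epoch in range(200):
    for xi, yi in zip(X, y):
        err = yi - (w @ xi + b)
        w += lr * err * xi   # steepest-descent step on (yi - pred)^2
        b += lr * err
    errors.append(float(np.sum((X @ w + b - y) ** 2)))
print(w, b, errors[-1])
```

Switching to batch or minibatch updating only changes how many examples contribute to each step; the `errors` list is what the graph in (2c) plots against epoch.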

Turn this data set from a regression problem
into a classification problem simply by using the sign of *y*
(+ or -) as representing one of two classes. In the data set you
download, you'll see a variable *z* that represents this
binary (0 or 1) class. Use the perceptron learning rule to
solve for the coefficients {*w*_{1}, *w*_{2},
*b*} of this classification problem.

Two warnings: First, your solution to Part 3 should require only a few lines of code added to the code you wrote for Part 2. Second, the Perceptron algorithm will not converge if there is no exact solution to the training data. It will jitter among coefficients that all yield roughly equally good solutions.

(3a) Report the values of coefficients *w*_{1}, *w*_{2}, and *b*.

(3b) Make a graph of the accuracy (% correct classification) on the training set as a function of epoch.

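The change from Part 2 really is small: threshold the prediction and update only on mistakes. A sketch, again with an illustrative step size and epoch budget, and with the six table rows standing in for the full data set:

```python
import numpy as np

# Same six rows as the table; the class z is the sign of y (1 if positive).
X = np.array([[0.1227, 0.2990],
              [0.3914, 0.6392],
              [0.7725, 0.0826],
              [0.8342, 0.0823],
              [0.5084, 0.8025],
              [0.9983, 0.7404]])
y = np.array([0.1825, 0.8882, -1.9521, -1.9328, 1.2246, -0.0631])
z = (y > 0).astype(int)   # the binary variable z from the data set

w = np.zeros(2)
b = 0.0
lr = 0.1
acc = []   # training accuracy per epoch, for the plot in (3b)
for epoch in range(100):
    for xi, zi in zip(X, z):
        pred = int(w @ xi + b > 0)
        # (zi - pred) is 0 on a correct prediction, so only errors update.
        w += lr * (zi - pred) * xi
        b += lr * (zi - pred)
    acc.append(float(np.mean((X @ w + b > 0).astype(int) == z)))
print(w, b, acc[-1])
```

On the full data set the warning above applies: if the classes are not exactly separable, the coefficients will keep jittering, which is why this sketch runs a fixed epoch budget rather than waiting for convergence.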

In machine learning, the real goal is to train
a model on some data and have it perform well
on "out of sample" data. Try this with the code you wrote for
Part 3: Train the model on the first {5, 10, 25, 50, 75}
examples in the data set and test the model on the final 25
examples.

(4a) How does performance on the test set vary with the amount of training data? Make a bar graph showing performance for each of the different training set sizes.

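The protocol can be sketched as below. Since the full downloaded data set isn't reproduced here, 100 synthetic points labeled by the sign of *x*_{2} - *x*_{1} stand in for it; the part that carries over to your assignment is the loop over training-set sizes with a fixed held-out test set.

```python
import numpy as np

# Synthetic stand-in for the downloaded data set: 100 points in the
# unit square, class 1 when x2 > x1. Replace X and z with the real data.
rng = np.random.default_rng(0)
X = rng.random((100, 2))
z = (X[:, 1] > X[:, 0]).astype(int)

def train_perceptron(X_tr, z_tr, epochs=100, lr=0.1):
    """Perceptron learning rule, as in Part 3."""
    w = np.zeros(2)
    b = 0.0
    for _ in range(epochs):
        for xi, zi in zip(X_tr, z_tr):
            pred = int(w @ xi + b > 0)
            w += lr * (zi - pred) * xi
            b += lr * (zi - pred)
    return w, b

# Fixed test set: the final 25 examples.
X_te, z_te = X[-25:], z[-25:]
accs = []
for n in (5, 10, 25, 50, 75):
    w, b = train_perceptron(X[:n], z[:n])
    accs.append(float(np.mean((X_te @ w + b > 0).astype(int) == z_te)))
print(dict(zip((5, 10, 25, 50, 75), accs)))
```

The `accs` list is what the bar graph in (4a) plots, one bar per training-set size.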