10/20/04: For ALL SUBSEQUENT
ASSIGNMENTS, you should hand in any code that is part of the work you
did. Michael Howe cannot grade the homeworks without the
code. I apologize for telling you earlier that you didn't need to
hand in code. You can print the code with "enscript -2r" or some other
method that puts multiple pages of code on a single sheet of paper, to
save paper.
If you had points taken off on earlier assignments because you didn't
include code, Michael will regrade these if you bring your code and
your graded assignment to office hours.
11/11/04: I suggested initializing your weights to random values in
the range -.01 to +.01. It turns out that this can lead to some poor
local optima. So if your network only learns a little and then
seems to get stuck, you might try a larger learning rate. This
will definitely help for XOR, and may also help for the digits.
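For concreteness, here is a minimal sketch of that initialization in
Python/NumPy (not the course code; the function name, layer sizes, and
the particular learning-rate value below are just placeholders):

    import numpy as np

    def init_weights(n_in, n_out, scale=0.01):
        # Each weight is drawn uniformly from [-scale, +scale];
        # scale=0.01 gives the -.01 to +.01 range suggested earlier.
        return np.random.uniform(-scale, scale, size=(n_in, n_out))

    weights = init_weights(64, 10)   # placeholder layer sizes
    learning_rate = 0.5              # try raising this if learning stalls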
11/11/04: To give you a reality check for your neural net, you
should be getting at least 70% correct on the test set if all is working
correctly.
11/13/04: If you train your network by presenting the digit
examples in the same order as they appear in the training file, you may
encounter the following problem: because the first 250 examples
are all "0", the network will quickly learn to output "0" regardless of
the input; then when you move on to "1", the network will quickly learn
to say "1", and so on. The result is that for most of the 250
examples of a given digit, the network will appear to have learned,
when all that has actually changed is the output biases. This makes
the network's error appear small when in fact the network hasn't
learned much of anything. To get around this problem, you should
randomize the order of the examples each training epoch. In my code,
I do this by having an array of indices numbered 0 - 2499 and calling
a function to permute the order of these array elements. Then I loop
for i = 0 to 2499 and use permuted_array[i] as the index of the next
example I'm going to train on.
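For what it's worth, here is a minimal sketch of that shuffling scheme
in Python (the names NUM_EXAMPLES, NUM_EPOCHS, and train_on_example are
placeholders for your own training code, not part of the assignment):

    import random

    NUM_EXAMPLES = 2500   # 250 examples of each of the 10 digits
    NUM_EPOCHS = 30       # placeholder; use however many epochs you need

    def train_on_example(index):
        # Placeholder for your own forward/backward pass on one example.
        pass

    indices = list(range(NUM_EXAMPLES))   # indices numbered 0 - 2499
    for epoch in range(NUM_EPOCHS):
        random.shuffle(indices)           # re-permute the order each epoch
        for i in range(NUM_EXAMPLES):
            train_on_example(indices[i])  # i.e., permuted_array[i]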