Fall 2019

# Day 25 = Day -4

Wednesday, November 20

# Nice videos on neural networks from 3Blue1Brown

Ch 1 What *is* a neural network?

Ch 3 Backpropagation

# Training our network

Last time, we saw that a single small step along the negative gradient, starting from W = 0, takes us to a set of weights resembling the class-averages of the images, which already performs quite well: 90+% accuracy.

Exercise 1: Let's see what happens if we try to improve it by taking more steps of gradient descent.

Starting point for today's coding
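A minimal sketch of Exercise 1, assuming the class model is a linear (softmax) classifier; the data below are made-up stand-ins, not the real images from class:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 100 flattened "images" of 64 pixels
# each, in 10 classes (replace with the real dataset from class).
X = rng.normal(size=(100, 64))
y = rng.integers(0, 10, size=100)
Y = np.eye(10)[y]                      # one-hot targets

W = np.zeros((64, 10))                 # start from W = 0, as last time
eta = 0.5                              # step size (arbitrary choice here)

def softmax(Z):
    E = np.exp(Z - Z.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)

for step in range(50):                 # "more steps of gradient descent"
    P = softmax(X @ W)                 # predicted class probabilities
    grad = X.T @ (P - Y) / len(X)      # gradient of mean cross-entropy loss
    W -= eta * grad                    # step along the negative gradient

accuracy = np.mean((X @ W).argmax(axis=1) == y)
```

With the real images, `accuracy` here would be the training accuracy after 50 steps; the interesting question is how it compares with accuracy on held-out test images.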

Exercise 2: Make appropriate plots of how performance changes during training.

Specifically: plot loss and accuracy on the training and test images versus step (epoch) number.
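One way Exercise 2 could look with matplotlib; the loss/accuracy histories below are invented placeholders standing in for values recorded during training:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen (no display needed)
import matplotlib.pyplot as plt

# Hypothetical per-epoch histories (record the real ones during training).
train_loss = [2.3, 1.1, 0.7, 0.5, 0.4]
test_loss  = [2.3, 1.2, 0.9, 0.8, 0.8]
train_acc  = [0.10, 0.60, 0.80, 0.90, 0.95]
test_acc   = [0.10, 0.55, 0.75, 0.80, 0.82]
epochs = range(len(train_loss))

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.plot(epochs, train_loss, label="train")
ax1.plot(epochs, test_loss, label="test")
ax1.set(xlabel="epoch", ylabel="loss")
ax1.legend()
ax2.plot(epochs, train_acc, label="train")
ax2.plot(epochs, test_acc, label="test")
ax2.set(xlabel="epoch", ylabel="accuracy")
ax2.legend()
fig.savefig("training_curves.png")
```

A widening gap between the train and test curves is the visual signature of the over-fitting discussed below.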

Build a confusion matrix?
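A confusion matrix can be built directly with NumPy; the labels below are a toy example, not the real test set:

```python
import numpy as np

def confusion_matrix(true, pred, n_classes=10):
    # C[i, j] counts images whose true class is i and predicted class is j.
    C = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true, pred):
        C[t, p] += 1
    return C

# Toy example with 3 classes (made-up labels):
C = confusion_matrix([0, 0, 1, 2], [0, 1, 1, 2], n_classes=3)
# Diagonal entries are correct predictions, so accuracy = trace / total.
accuracy = C.trace() / C.sum()
```

The off-diagonal entries show which pairs of classes the network confuses most often.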

# Over-fitting

The loss on our training set can keep shrinking with no real improvement on the test set. What's happening, and how can we fix it?

## Regularization

Example: add an L2 penalty on W to the loss.
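A sketch of how the L2 penalty enters the loss and the gradient; the value of `lam` is a hypothetical choice:

```python
import numpy as np

lam = 1e-3                      # regularization strength (hypothetical value)
W = np.ones((64, 10))           # stand-in weights for illustration

# Term added to the loss, and its contribution to the gradient:
penalty = lam * np.sum(W ** 2)  # lam * ||W||^2
grad_penalty = 2 * lam * W      # d/dW of lam * ||W||^2

# In each training step the extra term just adds to the data gradient:
# W -= eta * (grad_data + grad_penalty)
```

The effect is to pull every weight toward zero a little on each step, discouraging the large weights that memorize the training images.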

## Drop-out

Randomly ignore a different subset of nodes on each training step.
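A minimal sketch of one common implementation ("inverted dropout"); the keep probability and activations below are made up:

```python
import numpy as np

rng = np.random.default_rng(1)
keep_prob = 0.8                  # fraction of nodes kept (hypothetical value)

h = rng.normal(size=(5, 32))     # stand-in activations of one hidden layer

# Training: zero out a random subset of nodes, and rescale the survivors
# so the expected activation stays the same.
mask = rng.random(h.shape) < keep_prob
h_train = h * mask / keep_prob

# Test time: use all nodes, no mask.
h_test = h
```

Because no single node can be relied on, the network is pushed toward redundant features, which tends to reduce over-fitting.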

It can be more efficient to use just a subset of the images (a mini-batch) for each step.

epoch = one cycle through all the mini-batches
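The mini-batch/epoch bookkeeping can be sketched like this, with placeholder data and a hypothetical batch size:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 64))   # placeholder data (not the real images)
batch_size = 20                  # hypothetical choice

n_epochs = 3
steps = 0
for epoch in range(n_epochs):
    order = rng.permutation(len(X))              # reshuffle each epoch
    for start in range(0, len(X), batch_size):
        batch = X[order[start:start + batch_size]]
        # ... one gradient step using only this mini-batch ...
        steps += 1
# One epoch = one cycle through all 100/20 = 5 mini-batches here.
```

Each step uses a noisier gradient estimate, but 5 cheap steps per epoch often make faster progress than one full-batch step.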

# Tensorflow

Automates network construction and the computation of gradients.

`conda install tensorflow`
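A tiny sketch of both advertised features, once TensorFlow is installed: automatic gradients via `tf.GradientTape`, and network construction via `tf.keras` (the layer sizes below match our 64-pixel / 10-class setup but are otherwise arbitrary):

```python
import tensorflow as tf

# Automatic differentiation: TensorFlow computes the gradient for us.
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 2
g = tape.gradient(y, x)          # dy/dx = 2x = 6

# Network construction: a one-layer softmax classifier like ours.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(64,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="sgd",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

From here, `model.fit(X, Y, epochs=...)` would run the whole gradient-descent loop we wrote by hand.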