Tuesday, April 26, 2017

https://docs.google.com/document/d/1ROIGH32bXHTVWBWF2v17ur4t7D32LA4tfhDTQZ8y6U8

(change first X to Z)

Each group should choose one question, develop code to answer it, and present its findings to the class.

Enter useful things you learn in this Google Doc:

https://docs.google.com/document/d/11UvTNvxmPpHuL1jubcK-owFrIGybk_rJSBE-rrAQJdo

(Change first x to a z)

A brute-force approach to machine learning: let's revisit our handwritten character recognition and abandon the pre-processing step of feature extraction.

How about computing the average of all the "8"s, of all the "4"s, etc.? Let's see what we get.

If we form the dot product of the average of the "4"s with a single image, we might expect that dot product to be large if the image is a 4 and smaller if it isn't a 4. Let's take a look.
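The intuition behind the dot product as a similarity measure can be seen on a made-up toy example (the 3×3 "images" below are assumptions, not data from our character set): multiplying two images pixelwise and summing gives a large value when their bright pixels overlap.

```python
import numpy as np

# Toy 3x3 "images": a template (standing in for a class average) and two candidates.
template = np.array([[0., 1., 0.],
                     [0., 1., 0.],
                     [0., 1., 0.]])   # a vertical stroke
match    = np.array([[0., 1., 0.],
                     [0., 1., 0.],
                     [0., 1., 0.]])   # same vertical stroke
mismatch = np.array([[0., 0., 0.],
                     [1., 1., 1.],
                     [0., 0., 0.]])   # a horizontal stroke

# The "dot product" of two images: multiply pixelwise and sum.
print((template * match).sum())     # 3.0 -- large when the strokes overlap
print((template * mismatch).sum())  # 1.0 -- smaller when they mostly don't
```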

from glob import glob
from PIL import Image
from numpy import *
import matplotlib.pyplot as pl
import os

imagefolder = 'day22images/'
if not os.path.exists(imagefolder):
    os.makedirs(imagefolder)

allpngs = glob('pngs/*.png')
#pngs[:10]

def getsig(png):
    return png[-6:-4]

sigs = list(set([getsig(png) for png in allpngs]))
sigs

sampleimg = Image.open(allpngs[0])
print( array(sampleimg).shape )
h,w,nc = array(sampleimg).shape

ad = {}
for sig in sigs:
    pngs = [png for png in allpngs if getsig(png)==sig]
    print(sig,len(pngs))
    avg = zeros((h,w),dtype=float)
    for png in pngs:
        a = 255-array(Image.open(png))[:,:,0]  # select just the red channel (and invert)
        avg += a                               # accumulates the per-class sum
    avg -= avg.mean()                          # shift so the mean pixel value is zero
    pl.imsave(imagefolder+'average_'+sig+'.png',avg,vmin=avg.min(),vmax=avg.max(),cmap='seismic')
    ad[sig] = avg
ad['_0'].max()

(125, 100, 4)
_0 230
_7 230
09 219
_1 230
_9 230
_o 230
_r 231
_2 230
_m 230
_8 230
20887.536079999998

The shifted averages of each character class:

We could try to recognize the character in a given image by forming the dot product of that image with each of the shifted averages above, and predicting the character whose average gives the largest value.

for sig in sigs:
    pngs = [png for png in allpngs if getsig(png)==sig]
    #print(sig,len(pngs))
    preds = []
    for png in pngs:
        a = 255-array(Image.open(png))[:,:,0]  # select just the red channel (and invert)
        dps = []
        for sig2 in sigs:
            dps.append((a*ad[sig2]).sum())
        pred = argmax(dps)
        preds.append(pred)
    ncorrect = sum([sigs[pred]==sig for pred in preds])
    print('{:2} {:4d} of {:3d}, {:2.1f}% correct'.format(sig,ncorrect,len(pngs),
                                                         ncorrect/len(pngs)*100))

_0  127 of 230, 55.2% correct
_7   75 of 230, 32.6% correct
09  182 of 219, 83.1% correct
_1  175 of 230, 76.1% correct
_9  172 of 230, 74.8% correct
_o  181 of 230, 78.7% correct
_r  109 of 231, 47.2% correct
_2   81 of 230, 35.2% correct
_m  224 of 230, 97.4% correct
_8  225 of 230, 97.8% correct
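Tallying the per-class counts above gives an overall accuracy figure (the lists below just transcribe the printed output):

```python
# Per-class correct counts and class sizes, copied from the output above.
correct = [127, 75, 182, 175, 172, 181, 109, 81, 224, 225]
totals  = [230, 230, 219, 230, 230, 230, 231, 230, 230, 230]

print(sum(correct), 'of', sum(totals),
      '= {:.1f}% correct overall'.format(100 * sum(correct) / sum(totals)))
# 1551 of 2290 = 67.7% correct overall
```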

Not bad, but can we do better? If we call the shifted average images "weights", can we find modified weights that give better recognition accuracy? How? Perhaps we could define a scalar measure of success and then maximize it by following the gradient of that measure with respect to the 10x125x100 = 125,000 variables under our control (the pixel values of all ten weights images).
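Before reaching for TensorFlow, the idea can be sketched by hand on a toy problem. Everything below is an assumption for illustration: 2 classes, 4-pixel "images", made-up Gaussian data, and a softmax log-probability as the scalar measure of success. We follow the gradient uphill to improve the weights, exactly the plan described above, just at miniature scale.

```python
import numpy as np
rng = np.random.default_rng(0)

# Toy stand-in for the character data: two well-separated classes of 4-pixel images.
X = np.vstack([rng.normal(loc=[1, 0, 1, 0], scale=0.3, size=(20, 4)),
               rng.normal(loc=[0, 1, 0, 1], scale=0.3, size=(20, 4))])
y = np.array([0]*20 + [1]*20)

W = np.zeros((2, 4))                  # one weights "image" per class

def softmax_probs(W):
    logits = X @ W.T                  # dot product of each image with each weights image
    logits -= logits.max(axis=1, keepdims=True)
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

def score(W):
    """Scalar measure of success: mean log-probability of the correct class."""
    p = softmax_probs(W)
    return np.log(p[np.arange(len(y)), y]).mean()

def grad(W):
    """Gradient of score with respect to W."""
    p = softmax_probs(W)
    onehot = np.eye(2)[y]
    return (onehot - p).T @ X / len(y)

for step in range(200):
    W += 0.5 * grad(W)                # gradient *ascent*: step uphill

pred = (X @ W.T).argmax(axis=1)
print('score:', score(W), 'accuracy:', (pred == y).mean())
```

With weights initialized to zero the score starts at log(1/2) ≈ -0.69 and climbs as the ascent proceeds; TensorFlow's contribution is simply computing these gradients automatically for the real 125,000-variable problem.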

We can implement this "gradient ascent" in Tensorflow.

Download the Jupyter notebook supplied with this excellent video tutorial by Magnus Erik Hvass Pedersen.

**Quiz: "hvass01"** As we watch and discuss the video, each student should ask at least one question
via the quiz form.