Day 17

April 6, 2017

Handwritten characters: feature extraction

Here are your handwritten characters (download and unpack)

Let us write code to extract some features that might be useful for recognition.

%pylab inline
from PIL import Image
import glob
Populating the interactive namespace from numpy and matplotlib
pngs = sorted(glob.glob('handwriting/pngs/*aforthma*.png'))

# Display only the first matching image as a sanity check
for png in pngs:
    print(png)
    img = Image.open(png)
    imshow(img)
    break
handwriting/pngs/054_20170329_aforthma__8.png
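
As a possible starting point for feature extraction, here is a minimal sketch of a few simple shape features. It is only an illustration, not the features developed in class. The function name extract_features, the threshold of 128, and the choice of features are all assumptions, as is the convention of dark ink on a light background. The function takes a 2-D grayscale array, which one could obtain from a PIL image with np.asarray(img.convert('L')).

```python
import numpy as np

def extract_features(gray, threshold=128):
    """Return four simple shape features of a grayscale character image.

    gray: 2-D array, assumed to show dark ink on a light background.
    threshold: assumed cutoff below which a pixel counts as ink.
    """
    ink = np.asarray(gray) < threshold           # boolean mask of ink pixels
    if not ink.any():
        return np.zeros(4)                       # blank image: no shape to measure
    rows, cols = np.nonzero(ink)
    height = rows.max() - rows.min() + 1
    width = cols.max() - cols.min() + 1
    box = ink[rows.min():rows.max() + 1, cols.min():cols.max() + 1]
    return np.array([
        width / height,                          # aspect ratio of the bounding box
        box.mean(),                              # fraction of the box covered by ink
        (rows.mean() - rows.min()) / height,     # vertical center of mass within the box
        (cols.mean() - cols.min()) / width,      # horizontal center of mass within the box
    ])

# Synthetic example: a vertical bar, 6 pixels tall and 2 pixels wide
demo = np.full((10, 10), 255, dtype=np.uint8)
demo[2:8, 4:6] = 0
features = extract_features(demo)
```

Normalizing by the bounding box makes these features insensitive to where the character sits in the image and to its overall size, which may or may not be what you want for a given pair of characters.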

Report 3 expectations

Due Friday, April 14 at 11:59pm.

In a nutshell: build and test a classifier for handwritten characters, using a decision tree and the SVM algorithm as formulated and implemented in class on Day 16 for finding separating hyperplanes. You may not use any canned machine-learning package, like sklearn: your code must all be written from scratch except for solving the quadratic programming problem with cvxopt.

Choose a "training" subset of the PNG images to use in developing your classifier, and a separate subset to use for testing its quality.
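
One way to set up the two subsets is a reproducible random split of the file names. This is a minimal sketch, not a required procedure: the function name split_train_test, the 25% test fraction, the seed, and the example file names are all hypothetical.

```python
import random

def split_train_test(paths, test_fraction=0.25, seed=0):
    """Shuffle paths reproducibly and split them into (train, test) lists."""
    shuffled = sorted(paths)                    # fixed starting order
    random.Random(seed).shuffle(shuffled)       # seeded, so the split is repeatable
    n_test = int(round(test_fraction * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical file names, for illustration only
example_paths = ['pngs/img_%02d.png' % i for i in range(20)]
train, test = split_train_test(example_paths)
```

Fixing the seed keeps the test set the same from run to run, so images set aside for testing never leak into the development of the classifier.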

Unsupervised learning: method of K means

Attempt to identify intrinsic clusters in the training data without being told which class each point belongs to.
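
The idea can be sketched with the standard alternating scheme (often called Lloyd's algorithm): assign every point to its nearest center, then move each center to the mean of its assigned points, and repeat until nothing changes. The code below is a minimal sketch under stated assumptions, not necessarily the version developed in class. The function name kmeans, the random initialization from data points, and the synthetic two-blob data are all assumptions.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Cluster points (n x d array) into k groups; return (centers, labels)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.RandomState(seed)
    # Assumed initialization: k distinct data points chosen at random
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: squared distance from every point to every center
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its points
        # (a center with no points stays where it is)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break                                # converged
        centers = new_centers
    return centers, labels

# Synthetic data: two well-separated blobs in the plane
rng = np.random.RandomState(1)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.1, size=(20, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.1, size=(20, 2))
centers, labels = kmeans(np.vstack([blob_a, blob_b]), k=2)
```

The result can depend on the initial centers, so on real feature data it is common to rerun with several seeds and keep the clustering with the smallest total squared distance.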