Pointwise errors *e*_{i} = *α**x*_{i} + *β* − *y*_{i}

Some freedom in defining overall error E in terms of pointwise errors {*e*_{i}}
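To make the freedom concrete, here is a minimal Python sketch of three possible overall errors built from the same pointwise errors. The function names are hypothetical, and the third choice (largest absolute error) is an illustrative assumption that does not appear elsewhere in these notes.

```python
def pointwise_errors(alpha, beta, xs, ys):
    """Return the list of e_i = alpha*x_i + beta - y_i."""
    return [alpha * x + beta - y for x, y in zip(xs, ys)]

def sum_abs_error(errors):
    """Sum of absolute pointwise errors."""
    return sum(abs(e) for e in errors)

def sum_sq_error(errors):
    """Sum of squared pointwise errors."""
    return sum(e * e for e in errors)

def max_abs_error(errors):
    """Largest absolute pointwise error (illustrative extra choice)."""
    return max(abs(e) for e in errors)

# Small made-up data set, for illustration only.
xs = [0.0, 1.0, 2.0]
ys = [0.0, 1.0, 4.0]
errors = pointwise_errors(1.0, 0.0, xs, ys)
```

The same line can score quite differently under each definition, which is the point of comparing them.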

We found a formula for the line that minimizes the choice *E* = ∑^{n}_{i = 1}*e*^{2}_{i}

argmin_{(α, β) ∈ ℝ^{2}} *E*
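For reference, a sketch of the sum-of-squares minimizer in Python, using the standard textbook closed form (slope = covariance-like sum over variance-like sum). This may be presented differently from the formula derived in class, and the function name `least_squares_line` is hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (alpha, beta) minimizing the sum of squared pointwise errors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        # All x-values coincide, so no unique best slope exists.
        raise ValueError("need at least two distinct x-values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    beta = y_bar - alpha * x_bar
    return alpha, beta

# Made-up data lying exactly on y = 2x + 1.
alpha, beta = least_squares_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```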

How to find best (*α*, *β*) for other choices?

The formula we found applies only to *E*_{2} (the sum-of-squares choice above)

Random choices of (*α*, *β*)
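A rough sketch of what random trials might look like in Python: sample (α, β) uniformly from a box around a starting guess and keep the best pair seen so far. The box shape, the names `random_search` and `half_width`, and the fixed seed are all assumptions made for illustration, not the procedure used in class.

```python
import random

def sum_abs_error(alpha, beta, xs, ys):
    """Overall error as the sum of absolute pointwise errors."""
    return sum(abs(alpha * x + beta - y) for x, y in zip(xs, ys))

def random_search(error, start, half_width, trials, seed=0):
    """Try random (alpha, beta) pairs near `start`; return the best pair and its error."""
    rng = random.Random(seed)
    best = start
    best_err = error(*start)
    for _ in range(trials):
        candidate = (
            rng.uniform(start[0] - half_width, start[0] + half_width),
            rng.uniform(start[1] - half_width, start[1] + half_width),
        )
        err = error(*candidate)  # one (possibly expensive) function evaluation
        if err < best_err:
            best, best_err = candidate, err
    return best, best_err

# Made-up data, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

def error(alpha, beta):
    return sum_abs_error(alpha, beta, xs, ys)

best, best_err = random_search(error, start=(0.0, 0.0), half_width=5.0, trials=500)
```

Each trial costs one evaluation of *E*, so the number of trials is the budget to watch.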

My conjecture: a line containing 2 of the data points is in argmin *E* when
we choose *E* = ∑^{n}_{i = 1}|*e*_{i}|. (If true, this narrows down the possibilities
to a finite set.)

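*E*_{1} (the sum of absolute errors) is piecewise linear in (*α*, *β*), and discontinuities in slope occur where the line contains a data point. The set of minimizers must include a point at which the piecewise-linear graph has a vertex. There are at most *n*(*n* − 1)/2 of those, one for each pair of data points with distinct *x*-values.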
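Taking the vertex argument at face value, the search reduces to checking every line through two data points. A minimal Python sketch, with the hypothetical name `best_line_through_two_points` and made-up data containing one outlier:

```python
from itertools import combinations

def sum_abs_error(alpha, beta, xs, ys):
    """Overall error as the sum of absolute pointwise errors."""
    return sum(abs(alpha * x + beta - y) for x, y in zip(xs, ys))

def best_line_through_two_points(xs, ys):
    """Check each line through a pair of data points; return (alpha, beta, error) for the best."""
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue  # a vertical line is not of the form y = alpha*x + beta
        alpha = (y2 - y1) / (x2 - x1)
        beta = y1 - alpha * x1
        err = sum_abs_error(alpha, beta, xs, ys)
        if best is None or err < best[2]:
            best = (alpha, beta, err)
    if best is None:
        raise ValueError("need at least two data points with distinct x-values")
    return best

# Four points on y = 2x + 1 plus one outlier at x = 4.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 100.0]
alpha, beta, err = best_line_through_two_points(xs, ys)
```

This costs at most *n*(*n* − 1)/2 candidate lines, each needing *n* pointwise errors, so it is practical only for modest *n*.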

Note: function evaluations are often expensive

Useful metaphor: you are trying to find the deepest point in a muddy lake, using only a boat and a plumb line.

Another one: you are trying to find the oven parameters that will produce the perfect cake.

After class, I added some lines to the plot to show the sequence of trials more fully:

A comparative study of several definitions of badness of fit of a linear function to data
{(*x*_{i}, *y*_{i}) ∈ ℝ^{2} : *i* ∈ {1, 2, ..., *n*}}

Looking for: the best-fit line according to several definitions of error, for a variety of data sets (large/small, regular/irregular), plus your own justified value judgments based on the examples you present.

It seems likely that the various definitions of overall error will disagree most in their treatment of outliers. So some of your test cases should have data that is quite linear except for a small number of outliers, like this:
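One possible way to build such a test case in Python: points near a known line, with a few chosen points pushed far off it. The function name `make_outlier_data`, its parameters, and the specific numbers are hypothetical choices for illustration.

```python
import random

def make_outlier_data(n, alpha, beta, noise, outlier_indices, shift, seed=0):
    """Return (xs, ys) near y = alpha*x + beta, with selected points shifted by `shift`."""
    rng = random.Random(seed)
    xs = [float(i) for i in range(n)]
    ys = []
    for i, x in enumerate(xs):
        y = alpha * x + beta + rng.uniform(-noise, noise)  # small scatter about the line
        if i in outlier_indices:
            y += shift  # push this point well away from the line
        ys.append(y)
    return xs, ys

xs, ys = make_outlier_data(
    n=20, alpha=2.0, beta=1.0, noise=0.5, outlier_indices={5, 14}, shift=15.0
)
```

Varying `shift` and the number of outliers gives a family of test cases on which the different error definitions can be compared.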

Another minimization problem

Spreadsheet for data entry except change the S to an R.