
We will phrase the problem in terms of a measurement of error. For example, let’s try to achieve a given square error in a regression problem. That is, how hard is it to estimate a number given a sample from a larger population?

A simplified version of our modeling problems is the following. There is an unobserved population of real values v_{j}, where j is an integer ranging from 1 to p. We have access to a sample y_{i}, where i is an integer ranging from 1 to m. Each y_{i} was generated as a copy of v_{j} for an independent uniform random draw of j from the integers 1 through p, with re-use (replacement) allowed. We want to estimate the average value of the v_{j} from our y_{i}. This is the usual formulation of working out an estimate on training data, and asking how well the estimate will work on future data drawn from the same population.
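The sampling process described above can be sketched in a few lines of Python. This is only an illustration: the function name `draw_sample`, the normally distributed population, and the sizes p = 1000 and m = 50 are assumptions made for the example, not part of the original setup.

```python
import random


def draw_sample(v, m, rng):
    """Return m values copied from v, each chosen by an independent
    uniform draw of an index, with replacement."""
    p = len(v)
    return [v[rng.randrange(p)] for _ in range(m)]


# Hypothetical population: p = 1000 values (the distribution is an assumption).
rng = random.Random(2024)
v = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# The visible sample: m = 50 draws, so the same v[j] may appear more than once.
y = draw_sample(v, m=50, rng=rng)
```

In the problem as stated we only ever see `y`; the list `v` is available here only because we are simulating.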

We are interested in how far the visible sample estimate sample_estimate := (1/m) sum_{i=1...m} y_{i} is from the unobserved true population mean true_value := (1/p) sum_{j=1...p} v_{j}. We will quantify this as “square error”, or (true_value - sample_estimate)^2. Square error is an example of a loss or criticism: smaller is better, and in this case zero represents perfection.
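As a rough sketch of the quantities just defined, the following Python computes the sample estimate, the true value, and the square error on simulated data. The helper name `square_error`, the uniform population, and the sizes p = 500 and m = 40 are assumptions chosen for illustration; in practice the population is unobserved, so this calculation is only possible in a simulation.

```python
import random
from statistics import fmean


def square_error(v, y):
    """Square error between the population mean of v and the sample mean of y."""
    true_value = fmean(v)        # (1/p) sum_{j=1...p} v_{j}
    sample_estimate = fmean(y)   # (1/m) sum_{i=1...m} y_{i}
    return (true_value - sample_estimate) ** 2


# Hypothetical population and sample (values and sizes are assumptions).
rng = random.Random(7)
v = [rng.uniform(0.0, 10.0) for _ in range(500)]
y = rng.choices(v, k=40)  # uniform draws with replacement

err = square_error(v, y)
print(f"square error: {err:.6f}")
```

Re-running with different seeds or a different sample size m gives a feel for how the square error varies from sample to sample.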

Read more win-vector.c...