least squares

Terms from Artificial Intelligence: humans at the heart of algorithms


Many machine learning and statistical algorithms can be formulated as least squares procedures. They are, in effect, minimising the sum of the squared differences between the result they deliver and the desired result – a quantity which, when averaged, is called the mean square error. Examples include linear regression and backpropagation for neural networks. Similar methods instead minimise the sum of the absolute errors, or the maximum error, but squares are often easier to work with mathematically.
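In symbols, for model outputs $\hat{y}_i$, desired results $y_i$ and parameters $\theta$ (notation introduced here for illustration, not taken from the book), the three objectives compare as follows:

```latex
% Least squares vs. alternatives (illustrative notation)
\min_{\theta}\ \sum_{i=1}^{n} \bigl(\hat{y}_i - y_i\bigr)^2
\qquad\text{vs.}\qquad
\min_{\theta}\ \sum_{i=1}^{n} \bigl|\hat{y}_i - y_i\bigr|
\qquad\text{vs.}\qquad
\min_{\theta}\ \max_{i}\ \bigl|\hat{y}_i - y_i\bigr|
```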
In some cases the use of least squares is explicit in the way the algorithm is formulated. For example, the backpropagation algorithm is a form of gradient descent that calculates the change needed to each weight to give the greatest reduction in the mean square error. Similarly, linear regression minimises the sum of squares of the residuals (that is, the differences between the fitted regression line and the actual data points). The sketch below illustrates both ideas.
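As a concrete illustration, here is a minimal sketch of least squares fitting by gradient descent, assuming NumPy is available; the data, learning rate and variable names are invented for the example. Stepping each parameter against the gradient of the mean square error is the same principle that backpropagation applies to the weights of a neural network.

```python
# Minimal sketch: fit y = w*x + b by gradient descent on the mean square
# error. Illustrative only; data and hyperparameters are made up.
import numpy as np

# Hypothetical data: noisy points around the line y = 2x + 1
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.shape)

w, b = 0.0, 0.0        # initial parameter guesses
learning_rate = 0.1

for _ in range(2000):
    residuals = (w * x + b) - y                # fitted minus actual values
    # Gradients of the mean square error with respect to each parameter
    grad_w = 2.0 * np.mean(residuals * x)
    grad_b = 2.0 * np.mean(residuals)
    # Step both parameters in the direction that reduces the error most
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

mse = np.mean(((w * x + b) - y) ** 2)          # final mean square error
print(f"fitted w={w:.3f}, b={b:.3f}, mse={mse:.4f}")
```

For linear regression the minimum can also be found directly, via the normal equations, rather than by iterative descent; gradient descent is shown here because it is the route backpropagation takes.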

Used in Chap. 7: pages 89, 96; Chap. 8: page 107; Chap. 10: page 138

Also known as least squares estimate