OCR (optical character recognition)

Terms from Artificial Intelligence: humans at the heart of algorithms

OCR (optical character recognition) is usually the first stage of computer processing for understanding written or printed text. Traditional methods involve matching against known patterns for letters, or first looking for features such as loops and lines. Neural networks are now used, especially for hand-written characters, where there is more variation. A problem for both approaches is matching characters at different orientations: some systems try to work out the general direction of the text first and rotate the image, while others rotate individual characters to match templates (called normalization). Some neural networks are orientation-independent, having layers which effectively perform this normalization. Usually the result of OCR is uncertain, both because of inherent ambiguities (is it a zero or a capital O, a small or large letter O, a 6 or an upside-down 9, VV or W?) and because the text may be poorly written or the handwriting unusual. This uncertainty is resolved at higher levels of processing; for example, when characters are grouped into word units they can be checked against a dictionary, allowing disambiguation of the lower level. Typically some of this higher-level processing is carried out within OCR, resulting in plain text, but in some applications more raw confidence measures are useful to allow yet higher-level disambiguation, or highlighting of regions for hand checking.
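The dictionary check and the use of raw confidence measures can be illustrated with a minimal Python sketch. It assumes a hypothetical recognizer that returns, for each character position, a list of candidate characters with confidence scores. The function names, the scores and the tiny dictionary are all invented for illustration and do not correspond to any particular OCR library.

```python
from itertools import product

def disambiguate(candidates, dictionary):
    """Return (word, score) for the best-scoring reading that is in the
    dictionary, or (None, 0.0) if no reading matches.

    candidates: one list per character position, each holding
    (character, confidence) pairs from a hypothetical recognizer.
    """
    best_word, best_score = None, 0.0
    # Try every combination of per-position candidates.
    for combo in product(*candidates):
        word = "".join(ch for ch, _ in combo)
        score = 1.0
        for _, conf in combo:
            score *= conf  # treat positions as independent (an assumption)
        if word in dictionary and score > best_score:
            best_word, best_score = word, score
    return best_word, best_score

def needs_review(candidates, threshold):
    """Return positions whose most confident candidate falls below threshold."""
    return [
        i for i, options in enumerate(candidates)
        if max(conf for _, conf in options) < threshold
    ]

# Illustrative values only: the middle character looks more like a zero
# than a letter o, but only "dog" is a dictionary word.
candidates = [
    [("d", 0.9)],
    [("0", 0.6), ("o", 0.4)],
    [("g", 0.8)],
]
dictionary = {"dog", "dig"}

word, score = disambiguate(candidates, dictionary)
print(word)                           # the dictionary-checked reading
print(needs_review(candidates, 0.7))  # positions to highlight for hand checking
```

Here the character-level best guess would be "d0g", but the word-level check selects "dog", while the ambiguous middle position is still flagged for possible hand checking. Enumerating every combination is only practical for short words; it is used here for clarity rather than efficiency.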

Used on pages 207, 408

Also known as optical character recognition