OCR (optical character recognition)

Terms from Artificial Intelligence: humans at the heart of algorithms

OCR (optical character recognition) is usually the first stage of computer processing for written or printed text. Traditional methods match each character image against known patterns for letters, or first look for features such as loops and lines. Neural networks are now widely used, especially for hand-written characters where there is more variation.
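
As a rough illustration of the traditional pattern-matching approach, the sketch below compares a tiny binarised character bitmap against a few stored glyph templates and keeps the closest match. Everything in it (the 5x5 bitmaps, the pixel-agreement score) is invented for the example and is not taken from any particular OCR system.

```python
# Illustrative sketch only: a toy template matcher over 5x5 binary glyphs.
# The glyph bitmaps and the scoring rule are invented for this example.
import numpy as np

# Hypothetical stored templates: tiny binary bitmaps for a few characters.
TEMPLATES = {
    "I": np.array([[0, 0, 1, 0, 0]] * 5),
    "L": np.array([[1, 0, 0, 0, 0]] * 4 + [[1, 1, 1, 1, 1]]),
    "T": np.array([[1, 1, 1, 1, 1]] + [[0, 0, 1, 0, 0]] * 4),
}

def match_character(glyph):
    """Return the best-matching template and the fraction of pixels that agree."""
    best, best_score = "?", 0.0
    for label, template in TEMPLATES.items():
        score = np.mean(glyph == template)   # proportion of matching pixels
        if score > best_score:
            best, best_score = label, score
    return best, float(best_score)

# A slightly noisy "T" (one pixel flipped) still matches the T template best.
noisy_t = TEMPLATES["T"].copy()
noisy_t[2, 0] = 1
print(match_character(noisy_t))   # e.g. ('T', 0.96)
```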
A problem for both traditional approaches and neural networks is matching characters at different orientations. Some systems try to work out the general direction of the text first and rotate the whole image, whilst others rotate individual characters to match templates (called normalisation). Some neural networks are orientation-independent, having layers which effectively perform this normalisation.
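
As a rough illustration of the "rotate the image first" approach, the sketch below uses a simple projection-profile search: try a range of candidate angles and keep the one where the rows of ink are most sharply separated into text lines. The angle range, step size and use of SciPy's rotate are assumptions made for the example, not a description of any specific system.

```python
# Illustrative sketch only: normalising text orientation before recognition
# using a projection-profile search over candidate skew angles.
import numpy as np
from scipy.ndimage import rotate

def deskew(binary_image, max_angle=15.0):
    """Rotate the page so text lines run horizontally (a simple normalisation step)."""
    best_angle, best_score = 0.0, -1.0
    for angle in np.arange(-max_angle, max_angle + 0.5, 0.5):
        rotated = rotate(binary_image.astype(float), angle, reshape=False, order=0)
        row_profile = rotated.sum(axis=1)   # amount of ink in each row
        score = np.var(row_profile)         # sharp peaks and gaps => well-aligned lines
        if score > best_score:
            best_angle, best_score = angle, score
    return rotate(binary_image.astype(float), best_angle, reshape=False, order=0)
```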
Usually the result of OCR is uncertain, both because of inherent ambiguities (is it a zero or a capital O, a small or large letter O, a 6 or a 9 upside down, VV or W?) and because the text may be poorly written or the handwriting unusual. This is resolved at higher levels of processing: for example, when characters are grouped into word units they can be checked against a dictionary, allowing disambiguation of the lower level. Typically some of this higher-level processing is carried out within OCR itself, producing plain text, but in some applications rawer confidence measures are useful, allowing higher-level disambiguation or the highlighting of regions for hand checking.
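
As a rough illustration of this higher-level disambiguation, the sketch below keeps several candidate characters per position, each with a confidence measure, and chooses the combination that both forms a dictionary word and has the highest combined confidence. The candidate lists and the toy lexicon are invented for the example; a real system would get them from the OCR engine and a full dictionary.

```python
# Illustrative sketch only: dictionary-based disambiguation of uncertain characters.
from itertools import product

DICTIONARY = {"cool", "coal", "wool"}   # toy lexicon

# For each character position, the (hypothetical) OCR engine returns
# alternative readings with confidence values.
candidates = [
    [("c", 0.9), ("e", 0.1)],
    [("0", 0.6), ("o", 0.4)],   # zero vs letter o is inherently ambiguous
    [("o", 0.7), ("0", 0.3)],
    [("l", 0.8), ("1", 0.2)],
]

def best_dictionary_word(candidates, dictionary):
    """Pick the highest-confidence combination of characters that forms a known word."""
    best_word, best_conf = None, 0.0
    for combo in product(*candidates):
        word = "".join(ch for ch, _ in combo)
        conf = 1.0
        for _, c in combo:
            conf *= c
        if word in dictionary and conf > best_conf:
            best_word, best_conf = word, conf
    return best_word, best_conf

# The highest-confidence raw reading is "c0ol", but the dictionary check
# prefers "cool", resolving the zero/letter-o ambiguity.
print(best_dictionary_word(candidates, DICTIONARY))
```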

Used in Chap. 10: pages 140, 142; Chap. 12: page 162; Chap. 17: page 264

Also known as: optical character recognition

Used in glossary entries: disambiguation, normalisation