n-gram

Terms from Artificial Intelligence: humans at the heart of algorithms

Page numbers are for draft copy at present; they will be replaced with correct numbers when final book is formatted. Chapter numbers are correct and will not change now.

In a text or sequence, an n-gram is a sequence of n consecutive tokens. When text is treated as a sequence of words, a 1-gram is a single word in the text and a 2-gram is a pair of adjacent words. For example, in this description, "a text or sequence" is a 4-gram. Typically you are interested in the set of all n-grams in a text for some n, together with their respective frequencies of occurrence in the text.
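As a minimal sketch of the definition above, the following Python extracts word-level n-grams and counts how often each occurs. The function names `ngrams` and `ngram_counts`, the whitespace tokenisation, and the sample sentence are all assumptions made for illustration; real applications usually need more careful tokenisation (punctuation, case, character-level tokens and so on).

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple, in order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence of length L contains L - n + 1 n-grams (none if L < n).
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, n):
    """Count n-gram frequencies in text, using naive whitespace tokens."""
    tokens = text.lower().split()  # assumption: words are separated by spaces
    return Counter(ngrams(tokens, n))


# Hypothetical sample text, chosen so that one 2-gram repeats.
sample = "the cat sat on the mat and the cat slept"
bigram_counts = ngram_counts(sample, 2)
for gram, count in bigram_counts.most_common(3):
    print(" ".join(gram), count)
```

Representing each n-gram as a tuple keeps it hashable, so a `Counter` can hold the frequencies directly.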

Used on Chap. 8: page 165; Chap. 13: pages 312, 313; Chap. 14: page 328; Chap. 17: page 410