From the entry 'large language model' in the glossary of Artificial Intelligence: humans at the heart of algorithms
Large language models, as used in ChatGPT, have been one of the defining areas of the 'new AI'. At their simplest they are word predictors, taking a text and attempting to predict the next word that will appear. If this is repeated, whole new texts can be created.
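The repeated-prediction idea can be sketched in a few lines. The toy lookup table below is a stand-in for an LLM's learned next-word distribution (the table itself is invented for illustration); the generation loop is the real point: predict, append, repeat.

```python
# Toy stand-in for a learned next-word model: each word maps to its
# single most likely continuation (an illustrative, invented table).
MODEL = {"the": "cat", "cat": "sat", "sat": "on", "on": "the"}

def generate(start, n_words):
    """Grow a text by repeatedly predicting the next word."""
    words = [start]
    for _ in range(n_words):
        nxt = MODEL.get(words[-1])
        if nxt is None:  # no known continuation: stop early
            break
        words.append(nxt)
    return " ".join(words)

print(generate("the", 5))  # → "the cat sat on the cat"
```

A real LLM replaces the lookup table with a neural network that outputs a probability over every word in its vocabulary, and sampling from that distribution (rather than always taking the top choice) is what makes its output varied.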
LLMs build on the success of simple statistical methods such as n-grams which, when trained on very large corpora, were found to be 'unreasonably effective' at tasks that had previously been thought to require complex natural language processing. LLMs leverage the same big data, from web documents, media feeds, social media and forums, but use deep neural networks, which appear able to identify higher levels of meaning such as topics.
The addition of attention mechanisms in transformer models has allowed LLMs to make use of long-term patterns in language, such as understanding the referents of pronouns, or returning to previous topics. The text and chats produced by LLMs can be indistinguishable from those produced by humans, and it can thus be argued that they pass the Turing Test.
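The core attention operation can be shown in plain Python. This is a minimal sketch of scaled dot-product attention, the mechanism used in transformers: a query is compared against every key, the resulting scores are turned into weights via softmax, and the output is the weight-averaged values. The two-dimensional vectors are invented for illustration.

```python
import math

def softmax(xs):
    """Turn raw scores into weights that sum to 1."""
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention: weight values by query-key similarity."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # Output is the weighted average of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# The query matches the first key, so the first value dominates the output.
out = attention(query=[1.0, 0.0],
                keys=[[1.0, 0.0], [0.0, 1.0]],
                values=[[10.0, 0.0], [0.0, 10.0]])
print(out)
```

Because every position can attend to every other position, a pronoun late in a text can draw directly on the distant noun it refers to, which is how transformers capture the long-range patterns mentioned above.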
Also used in hcistats2e: Chap. 1: page 17; Chap. 10: page 120; Chap. 11: page 128; Chap. 12: pages 134, 145
Also known as: LLM
Links:
arXiv: article: GPT-4 Technical Report. OpenAI (2023).
doi.ieeecomputersociety.org: article: The unreasonable effectiveness of data. A. Halevy, P. Norvig, and F. Pereira (2009).
