Temperature in AI usually refers to the level of randomness in an algorithm. The term is used in slightly different ways in different types of algorithm and stages of machine learning, but in general low temperatures lead to more deterministic behaviour whereas high temperatures produce more variation.
In simulated annealing (and related stochastic optimisation algorithms) the temperature starts high and then gradually 'cools'. In the early, high-temperature phase the algorithm may move the current solution in directions that make it worse, although with lower probability than moves that improve it. In the later, low-temperature stages it rarely accepts worsening moves and so behaves much like greedy descent, almost always following the direction of steepest improvement.
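As a rough sketch of how this works, the loop below uses the standard Metropolis-style acceptance rule: improvements are always accepted, and a worsening move of size delta is accepted with probability exp(-delta / T), which shrinks as the temperature cools. The `energy` and `neighbour` functions are placeholders to be supplied by the caller, and the geometric cooling schedule is just one common choice.

```python
import math
import random

def simulated_annealing(energy, neighbour, x0,
                        t_start=10.0, t_end=0.01, cooling=0.95):
    """Minimal simulated-annealing loop (illustrative sketch).

    `energy` scores a candidate (lower is better) and `neighbour`
    proposes a random nearby candidate; both are placeholders.
    """
    x, t = x0, t_start
    while t > t_end:
        candidate = neighbour(x)
        delta = energy(candidate) - energy(x)
        # Always accept improvements; accept worsening moves with
        # probability exp(-delta / t), which falls as t cools.
        if delta < 0 or random.random() < math.exp(-delta / t):
            x = candidate
        t *= cooling  # geometric cooling schedule
    return x
```

For example, with `energy=lambda x: (x - 3) ** 2` and `neighbour=lambda x: x + random.gauss(0, 1)`, the loop typically settles near x = 3.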
In large language models the temperature is a parameter that controls how the next word (token) is sampled: at low temperature the model almost always chooses the most likely next word, whereas at high temperature it sometimes chooses less likely ones. The former leads to predictable and reliable system answers, whereas the latter is more 'creative'.
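Concretely, temperature divides the model's raw scores (logits) before they are turned into probabilities by the softmax, so low temperatures sharpen the distribution and high temperatures flatten it. A minimal sketch follows; the logits here are made-up numbers, and treating temperature 0 as greedy decoding is a common convention rather than part of the softmax itself.

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0):
    """Sample an index from raw model scores (logits),
    scaled by temperature before the softmax."""
    if temperature <= 0:
        # Temperature 0 is conventionally treated as greedy decoding:
        # always pick the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(probs)), weights=probs)[0]
```

With logits [2.0, 1.0, 0.5], a temperature of 0.2 returns index 0 almost every time, while a temperature of 2.0 spreads the choice across all three options.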
Links:
pmc.ncbi.nlm.nih.gov: The Temperature Feature of ChatGPT: Modifying Creativity for Clinical Research
community.openai.com: Cheat Sheet: Mastering Temperature and Top_p in ChatGPT API