Temperature does one thing: it scales logits before softmax.
At T=1.0, the distribution is unchanged. At T<1.0, it sharpens — the top token wins more decisively. At T>1.0, it flattens — lower-probability tokens get more of the mass.
That is the whole mechanism.
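As a minimal sketch of that mechanism, the snippet below divides a list of logits by T and applies a numerically stable softmax. The function name `softmax_with_temperature` and the example logits are illustrative assumptions, not taken from any particular library.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    if temperature <= 0:
        # T=0 is the argmax limit and has to be handled separately.
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for three tokens, printed at three temperatures.
example_logits = [2.0, 1.0, 0.0]
for t in (0.5, 1.0, 1.5):
    probs = softmax_with_temperature(example_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Running it should show the first token's share shrinking as T rises while the last token's share grows, with each row still summing to 1.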
Temperature does not add new ideas. It does not unlock tokens the model never assigned probability to. It cannot conjure knowledge that is not in the weights.
Higher temperature only lets sampling venture further down the model's existing probability ranking. The ordering of tokens never changes: a token the model ranked low stays ranked low at every temperature, it is just sampled more often as T rises. The distribution is reshaped, not expanded.
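The "reshaped, not expanded" point can be checked from the softmax formula itself. The short derivation below is a sketch; the symbols $z_i$ (the logit of token $i$) and $p_i(T)$ (its probability at temperature $T$) are introduced here for illustration.

```latex
p_i(T) = \frac{\exp(z_i / T)}{\sum_k \exp(z_k / T)}
\qquad\Longrightarrow\qquad
\frac{p_i(T)}{p_j(T)} = \exp\!\left(\frac{z_i - z_j}{T}\right)
```

For any $T > 0$, $z_i > z_j$ makes the ratio greater than 1, so token $i$ outranks token $j$ at every temperature. Raising $T$ pulls the ratio toward 1 (a flatter distribution) and lowering it pushes the ratio up (a sharper one), but the ranking fixed by the logits never changes.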
Calling temperature a "creativity slider" is a category error.
Most production systems should run between T=0.0 and T=1.0. Empirical research on reasoning and multiple-choice tasks finds no performance improvement beyond T=1.0, only degradation. For reproducible outputs, T=0.0 (greedy decoding, which always picks the highest-probability token) is the correct default.
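To make the T=0.0 case concrete, here is a hedged sketch of a single-token sampler that treats zero temperature as a plain argmax and otherwise draws from the temperature-scaled distribution. The name `sample_token` and its `rng` parameter are hypothetical. Real inference stacks expose this differently.

```python
import math
import random


def sample_token(logits, temperature, rng=random):
    """Return a token index: argmax at T=0, otherwise a temperature-scaled sample."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # Greedy decoding: no randomness, always the highest-logit token
        # (ties resolve to the lowest index).
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    weights = [math.exp(s - peak) for s in scaled]
    # random.choices normalizes the weights internally.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


example_logits = [0.1, 3.0, 1.2]  # made-up values
print(sample_token(example_logits, 0.0))  # same index on every call
print(sample_token(example_logits, 1.0, rng=random.Random(0)))  # seeded draw
```

Passing a seeded `random.Random` instance makes the stochastic branch repeatable in tests. The greedy branch needs no seed at all.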
Higher temperatures are appropriate for brainstorming or diversity-sampling use cases — but that is a deliberate trade of coherence for variance, not a creativity enhancement.
Genuine novelty in LLM output comes from:
When T=1.5 produces surprising text, that is not the model being creative. It is the model being incoherent. The surprise is noise.
Set temperature for the task, not for the feeling.
Coding, extraction, classification: T=0.0–0.3. Open-ended generation: T=0.7–1.0. Above 1.0: justify it explicitly or do not use it.
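One way to enforce these ranges is a small guard in the request path. The sketch below is illustrative only: the task labels, the `RECOMMENDED_RANGES` table, and `check_temperature` are hypothetical names, and the numeric ranges are copied from the guidance above.

```python
# Ranges copied from the guidance above; labels and function are hypothetical.
RECOMMENDED_RANGES = {
    "coding": (0.0, 0.3),
    "extraction": (0.0, 0.3),
    "classification": (0.0, 0.3),
    "open_ended": (0.7, 1.0),
}


def check_temperature(task, temperature, justification=None):
    """Return the temperature if acceptable for the task, else raise ValueError."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature > 1.0:
        # Above 1.0 is allowed only with an explicit, non-empty justification.
        if not justification:
            raise ValueError("temperature above 1.0 needs an explicit justification")
        return temperature
    low, high = RECOMMENDED_RANGES[task]  # unknown task labels raise KeyError
    if not low <= temperature <= high:
        raise ValueError(f"{task}: expected {low}-{high}, got {temperature}")
    return temperature


print(check_temperature("coding", 0.2))
print(check_temperature("open_ended", 1.3, justification="diversity sampling for brainstorming"))
```

Whether out-of-range values should raise, warn, or be clamped is a design choice. Raising is used here because it forces the justification to be written down.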
Temperature is a precision instrument. Treat it like one.
1. Peeperkorn, M., et al. (2024). Is Temperature the Creativity Parameter of Large Language Models? arXiv:2405.00492. https://arxiv.org/abs/2405.00492
2. ACL Anthology (2024). Empirical Study of Temperature in LLM Sampling. EMNLP 2024 Findings. https://aclanthology.org/2024.findings-emnlp.432.pdf
3. Brenndoerfer, M. (2024). Decoding Temperature: Controlling Randomness in Language Model Generation. https://mbrenndoerfer.com/writing/decoding-temperature-language-model-generation
4. Engineers of AI. Sampling Strategies: Temperature, Top-K, Top-P. https://engineersofai.com/docs/llms/llm-inference/Sampling-Strategies-Temperature-TopK-TopP