
Is Zero Temperature Deterministic?

Submitted by Style Pass · 2024-09-23

Temperature is one of the most important parameters of large language models (LLMs). It directly influences the creativity of generated text.

While higher temperatures encourage more diverse outputs, lower temperatures lead to predictable results. What happens at the extreme end, when the temperature is set to zero? Let’s explore this question.
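Because temperature divides the logits, a literal temperature of zero is undefined, so implementations conventionally fall back to greedy selection: always pick the highest-probability token. Here is a minimal sketch of that fallback; the function name and inputs are my own, not from any particular library:

```python
import random

def sample_next_token(tokens, probs, temperature):
    """Pick the next token from a list of candidates and their probabilities.

    At temperature 0, skip sampling entirely and return the argmax
    (greedy decoding); otherwise, sample proportionally to the weights.
    """
    if temperature == 0:
        # Zero temperature: all probability mass collapses onto the top token.
        return tokens[probs.index(max(probs))]
    # Nonzero temperature: draw one token, weighted by probability.
    return random.choices(tokens, weights=probs)[0]
```

Whether this greedy path is truly deterministic in practice is exactly the question the rest of the article explores.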

LLMs generate text by predicting how likely each candidate token is to appear next in a sequence. The model first produces logits: raw, unnormalized scores assigned to each potential token, which are then converted into probabilities.

Let’s say we have the following phrase as our starting point: “I am hungry for”. Our goal is to predict the next token. The model outputs the following tokens and logits:

To make the article more readable, I’m using full words. In practice, you would see tokens like “dump” instead of “dumpling,” as they generally average around 4 characters.

To transform these logits into interpretable probabilities, we use the softmax function. This function normalizes the logits, ensuring they sum to 1 and can be interpreted as probabilities:
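As a concrete sketch, here is the softmax step applied to some hypothetical logits for our "I am hungry for" example (the specific tokens and scores are invented for illustration, not taken from a real model):

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for next-token candidates after "I am hungry for"
candidates = {"pizza": 2.1, "dumplings": 1.4, "knowledge": 0.3}
probs = softmax(list(candidates.values()))
for token, p in zip(candidates, probs):
    print(f"{token}: {p:.3f}")
```

Subtracting the maximum logit before exponentiating does not change the result, but it prevents overflow when logits are large.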
