Enter Calculation

Enter K real numbers, separated by commas, spaces or new lines.

Formula

Results

Softmax σ(z): 0.0900305732, 0.2447284711, 0.6652409558 (probability distribution, sums to 1)
Vector length K: 3
Sum of outputs: 1
Argmax (1-based index): 3
Max probability: 0.6652409558

What is the softmax function?

The softmax function takes a vector of K real numbers and turns it into a probability distribution: for K ≥ 2 every output lies strictly between 0 and 1, and the K outputs sum to 1 (for K = 1 the single output is exactly 1). It is the standard activation function in the output layer of multi-class neural-network classifiers, where it converts raw model scores (logits) into class probabilities. The function is dimensionless: both inputs and outputs are pure numbers with no units.

Figure: Softmax maps an input vector of real numbers (three in the diagram) to a probability distribution that sums to 1.

How to use this calculator

Type your input vector into the box as a list of numbers separated by commas, spaces, or new lines (for example 1, 2, 3). The numbers may be positive, negative, zero, or fractional. Press calculate and you will get the softmax probability for each component, the vector length K, the sum of the outputs (1, up to floating-point rounding), the argmax (the 1-based index of the largest probability), and the maximum probability itself.
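The steps described above can be sketched in a few lines of Python. This is only an illustration of the same inputs and outputs, not the calculator's actual code; the names `parse_vector`, `softmax`, and `summarize` are hypothetical, and only the standard library is assumed.

```python
import math
import re

def parse_vector(text):
    """Split on commas, spaces, or new lines and convert each token to a float."""
    tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
    return [float(t) for t in tokens]

def softmax(z):
    """Softmax with the maximum subtracted before exponentiating."""
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]

def summarize(text):
    """Return the quantities listed in the results panel (hypothetical layout)."""
    z = parse_vector(text)
    p = softmax(z)
    best = max(range(len(p)), key=p.__getitem__)  # 0-based position of the largest probability
    return {
        "softmax": p,
        "K": len(z),
        "sum": sum(p),
        "argmax": best + 1,  # reported as a 1-based index
        "max_probability": p[best],
    }

result = summarize("1, 2\n3")
print(result)
```

Mixed separators such as `"1, 2\n3"` are accepted because the split pattern treats any run of commas and whitespace as one delimiter.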

The formula explained

For each component j the softmax is $$\sigma(z)_j = \frac{e^{z_j}}{\displaystyle\sum_{k} e^{z_k}}.$$ Exponentiating makes every term positive, and dividing by the total normalizes them so they sum to 1. For numerical stability this calculator subtracts the maximum value \(m\) from every element before exponentiating: $$\sigma(z)_j = \frac{e^{z_j - m}}{\displaystyle\sum_{k} e^{z_k - m}}.$$ The common factor \(e^{-m}\) cancels, giving an identical result while preventing overflow on large inputs.
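To see why the max-subtraction matters, the following Python sketch compares the two forms of the formula on large inputs. It is an illustration under stated assumptions rather than the calculator's implementation: the function names `softmax_naive` and `softmax_stable` are made up for this example, and it relies on the fact that Python's `math.exp` raises `OverflowError` when the result exceeds the double-precision range.

```python
import math

def softmax_naive(z):
    """Direct translation of the first formula: exp(z_j) / sum_k exp(z_k)."""
    exps = [math.exp(x) for x in z]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_stable(z):
    """Second formula: subtract the maximum m before exponentiating."""
    m = max(z)
    exps = [math.exp(x - m) for x in z]  # every exponent is <= 0, so exp(...) <= 1
    total = sum(exps)
    return [e / total for e in exps]

z = [1000.0, 1001.0, 1002.0]

try:
    softmax_naive(z)
    naive_overflowed = False
except OverflowError:
    # exp(1000) is far beyond the largest representable double
    naive_overflowed = True

stable = softmax_stable(z)
print(naive_overflowed)
print(stable)
```

After the shift the exponents are -2, -1, and 0, so the stable version returns the same distribution as the input (1, 2, 3) would.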

Figure: Each output is the exponential of one element divided by the sum of exponentials of all elements.

Worked example

For \(z = (1, 2, 3)\): $$e^{1} = 2.71828, \quad e^{2} = 7.38906, \quad e^{3} = 20.08554,$$ summing to \(30.19287\). Dividing each exponential by this sum gives $$\sigma = (0.09003,\ 0.24473,\ 0.66524),$$ which add up to 1. The argmax is index 3, the position of the largest input, with probability \(0.66524\).
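The arithmetic above can be checked step by step with a short Python script. This is a minimal sketch using only the standard library; the variable names are chosen for this example, and the unshifted formula is used because the inputs are small enough not to overflow.

```python
import math

z = [1.0, 2.0, 3.0]

# Step 1: exponentiate each element
exps = [math.exp(x) for x in z]

# Step 2: sum the exponentials
total = sum(exps)

# Step 3: divide each exponential by the sum
probs = [e / total for e in exps]

# Step 4: 1-based index of the largest probability
argmax = probs.index(max(probs)) + 1

print([round(e, 5) for e in exps])
print(round(total, 5))
print([round(p, 5) for p in probs])
print(argmax)
```

Rounded to five decimal places, the printed values should agree with the figures in the worked example.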

FAQ

Why do the outputs always sum to 1? Every output has the same denominator, the sum of all the exponentials. Adding the outputs therefore adds the numerators, which gives that same sum divided by itself, or 1.

What if all inputs are equal? The result is a uniform distribution where every output equals \(1/K\).

Does adding a constant to every input change the result? No. Softmax is shift-invariant: adding the same constant \(c\) to all inputs leaves the output unchanged, which is exactly why subtracting the max is safe.
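The three properties in the FAQ answers above can be checked numerically. The Python sketch below is illustrative only: `softmax` and `close` are hypothetical helper names, the test vector is arbitrary, and floating-point results are compared with a small tolerance rather than for exact equality.

```python
import math

def softmax(z):
    """Softmax with the maximum subtracted before exponentiating."""
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]

def close(a, b, tol=1e-12):
    """True if two equal-length lists agree element-wise within tol."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))

z = [0.5, -1.25, 3.0, 0.0]  # arbitrary example vector
p = softmax(z)

# Outputs sum to 1 (up to rounding)
sums_to_one = abs(sum(p) - 1.0) < 1e-12

# Equal inputs give the uniform distribution 1/K
K = 4
is_uniform = close(softmax([7.0] * K), [1.0 / K] * K)

# Adding a constant c to every input leaves the output unchanged
c = 10.0
shift_invariant = close(softmax([x + c for x in z]), p)

print(sums_to_one, is_uniform, shift_invariant)
```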
