What is the Softsign function?
The Softsign function is an activation function used in neural networks, defined as \(\phi(x) = \frac{x}{1+|x|}\). Like the hyperbolic tangent (tanh), it is continuously differentiable, S-shaped, and bounded to the open interval (-1, 1). The key difference is how it approaches its asymptotes: Softsign's distance from \(\pm 1\) shrinks only polynomially (like \(\frac{1}{|x|}\)), while tanh's shrinks exponentially. This slower saturation can help reduce vanishing-gradient effects in some architectures.
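To make the difference in saturation speed concrete, the gap to the upper asymptote can be written out for \(x > 0\). This is a short derivation added for illustration, using only the definition above and the standard exponential form of tanh: $$1 - \phi(x) = 1 - \frac{x}{1+x} = \frac{1}{1+x}, \qquad 1 - \tanh(x) = 1 - \frac{e^{2x}-1}{e^{2x}+1} = \frac{2}{e^{2x}+1}.$$ The first gap shrinks like \(\frac{1}{x}\), the second like \(2e^{-2x}\). Because both functions are odd, the same holds near \(-1\) for \(x < 0\).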
How to use this calculator
Enter three values: the Initial value of x (the first row's x), the Increment value (the step added each row), and the Number of repetitions (how many rows to generate). The calculator then produces a table of x, Softsign \(\phi(x)\), and the first derivative \(\phi'(x)\) for every point, which you can use to plot the curves or inspect specific values.
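The table described above could be generated along the following lines. This is a minimal Python sketch, not the calculator's actual code: the function name `softsign_table` and its parameter names are hypothetical, and computing each x from the row index (rather than repeatedly adding the step) is an assumption made here to limit floating-point drift.

```python
def softsign_table(start, step, count):
    """Return `count` rows of (x, softsign(x), softsign'(x)).

    `start`, `step`, and `count` stand in for the calculator's
    Initial value, Increment value, and Number of repetitions.
    """
    rows = []
    for i in range(count):
        # Derive x from the row index instead of accumulating the step,
        # so rounding errors do not build up over long tables.
        x = start + i * step
        a = 1.0 + abs(x)              # shared denominator, always >= 1
        rows.append((x, x / a, 1.0 / (a * a)))
    return rows


# Print the first three rows of a sweep that starts at -5 with step 0.1.
for x, phi, dphi in softsign_table(-5.0, 0.1, 101)[:3]:
    print(f"{x:8.4f}  {phi:10.7f}  {dphi:10.7f}")
```

A negative `step` makes the x column descend, and a `step` of 0 repeats the starting x on every row.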
The formula explained
For each row, let \(a = 1 + |x|\). Then $$\phi(x) = \frac{x}{a}, \qquad \phi'(x) = \frac{1}{a^{2}}.$$ Because \(|x|\) is never negative, the denominator \(a\) is always at least 1, so there is never a division by zero and the function is continuously differentiable everywhere (its second derivative, however, jumps at \(x = 0\)). The Softsign function is odd (\(\phi(-x) = -\phi(x)\)), while its derivative is even (\(\phi'(-x) = \phi'(x)\)). At the origin \(\phi(0) = 0\) and \(\phi'(0) = 1\).
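The two formulas translate directly into code. The sketch below is illustrative only: the names `softsign`, `softsign_prime`, and `central_difference` are chosen here, and the finite-difference helper is just a rough sanity check on the closed-form derivative, not part of the calculator.

```python
def softsign(x: float) -> float:
    """Softsign: x / (1 + |x|)."""
    a = 1.0 + abs(x)          # a >= 1, so the division is always safe
    return x / a


def softsign_prime(x: float) -> float:
    """First derivative of Softsign: 1 / (1 + |x|)^2."""
    a = 1.0 + abs(x)
    return 1.0 / (a * a)


def central_difference(f, x: float, h: float = 1e-6) -> float:
    """Numerical derivative estimate, used only to cross-check the formula."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


for x in (-5.0, 0.7, 3.0):
    exact = softsign_prime(x)
    approx = central_difference(softsign, x)
    print(f"x={x:5.1f}  formula={exact:.7f}  numeric={approx:.7f}")
```

The odd and even symmetries show up as `softsign(-x) == -softsign(x)` and `softsign_prime(-x) == softsign_prime(x)` for any input.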
Worked example
For \(x = -5\): \(a = 1 + 5 = 6\), so $$\phi(-5) = \frac{-5}{6} \approx -0.8333333, \qquad \phi'(-5) = \frac{1}{36} \approx 0.0277778.$$ For \(x = 1\): \(a = 2\), so \(\phi(1) = 0.5\) and \(\phi'(1) = 0.25\). For \(x = 0\): \(a = 1\), so \(\phi(0) = 0\) and \(\phi'(0) = 1\). With the defaults (start -5, step 0.1, 101 rows) the table sweeps \(x\) from -5 to +5, since the last row is \(-5 + 100 \times 0.1 = 5\).
FAQ
Why does the derivative never go negative? Because \(\phi'(x) = \frac{1}{(1+|x|)^{2}}\) is the reciprocal of the square of a number that is at least 1, it always lies in the interval (0, 1]. A strictly positive derivative means Softsign is strictly increasing.
How does Softsign differ from tanh? Both saturate to the bounded range (-1, 1), but Softsign saturates more gradually: its derivative decays rationally (like \(\frac{1}{x^{2}}\)), whereas tanh's derivative decays exponentially. As a result, Softsign keeps larger gradients at large \(|x|\).
Can the step be negative? Yes. A negative step makes the table descend; a zero step repeats the same x value on every row.
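The comparison with tanh in the FAQ can be checked numerically. The following is a small illustrative sketch; the function names and the sample points are chosen here and do not come from the calculator.

```python
import math


def softsign_prime(x: float) -> float:
    """Derivative of Softsign: 1 / (1 + |x|)^2."""
    return 1.0 / (1.0 + abs(x)) ** 2


def tanh_prime(x: float) -> float:
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


def compare_gradients(xs):
    """Return (x, softsign'(x), tanh'(x)) for each sample point."""
    return [(x, softsign_prime(x), tanh_prime(x)) for x in xs]


for x, ds, dt in compare_gradients([0.0, 1.0, 2.0, 5.0, 10.0]):
    print(f"x={x:5.1f}  softsign'={ds:.6f}  tanh'={dt:.3e}")
```

Both derivatives equal 1 at the origin. For moderate inputs such as \(x = 1\), tanh actually has the larger gradient; Softsign's advantage appears further out (already at \(x = 2\)), where tanh's gradient collapses exponentially.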