What is the binomial distribution?
The binomial distribution models the number of successes x in a fixed number n of independent trials, where every trial is a Bernoulli trial that succeeds with the same probability p. It answers questions like "what is the chance of getting exactly 5 heads in 20 coin tosses?" The result is a dimensionless probability, so no units or regional conventions are involved.
How to use this calculator
Pick which function to compute: the probability mass \(f(x)\) (the chance of exactly x successes), the lower cumulative \(P(X \le x)\), or the upper cumulative \(Q(X \ge x)\). Enter the number of trials n, the per-trial success probability p (between 0 and 1), then choose the first success count (initial x), the step between rows, and how many rows to generate. The tool tabulates and graphs the selected function as a discrete histogram with bars touching.
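The inputs described above map onto a simple loop. The Python sketch below only illustrates that tabulation and is not the calculator's actual code; the names `pmf` and `tabulate` and the `kind` labels `"pmf"`, `"lower"`, and `"upper"` are assumptions made for this example. It uses the direct formula, so it is meant for small n.

```python
from math import comb


def pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes; zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def tabulate(n: int, p: float, initial_x: int, step: int, rows: int,
             kind: str = "pmf") -> list:
    """Return (x, value) rows for x = initial_x, initial_x + step, ..."""
    table = []
    for i in range(rows):
        x = initial_x + i * step
        if kind == "pmf":
            value = pmf(x, n, p)
        elif kind == "lower":  # P(X <= x)
            value = sum(pmf(t, n, p) for t in range(0, x + 1))
        elif kind == "upper":  # Q(X >= x)
            value = sum(pmf(t, n, p) for t in range(max(x, 0), n + 1))
        else:
            raise ValueError(f"unknown kind: {kind!r}")
        table.append((x, value))
    return table


if __name__ == "__main__":
    # Thirteen rows starting at x = 0 with step 1, i.e. x = 0..12.
    for x, value in tabulate(n=20, p=0.25, initial_x=0, step=1, rows=13):
        print(f"{x:3d}  {value:.6f}")
```

Each row is one bar of the graph: the x value is the bar position and the second entry is its height.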
The formula explained
The probability mass function is $$f(x,n,p) = \binom{n}{x}\, p^{\,x}\,(1-p)^{\,n-x}$$ where \(\binom{n}{x} = \dfrac{n!}{x!(n-x)!}\) is the binomial coefficient. The lower cumulative \(P(X \le x)\) sums \(f(t)\) over \(t = 0, \dots, x\), and the upper cumulative \(Q(X \ge x)\) sums \(f(t)\) over \(t = x, \dots, n\). To avoid factorial overflow for large n, this calculator evaluates the logarithm of the PMF with the log-gamma function, using \(\ln k! = \ln\Gamma(k+1)\): $$\ln f = \ln\Gamma(n+1) - \ln\Gamma(x+1) - \ln\Gamma(n-x+1) + x\cdot\ln p + (n-x)\cdot\ln(1-p)$$ Exponentiating \(\ln f\) recovers the probability. The distribution mean is \(np\) and the variance is \(np(1-p)\).
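The log-gamma form translates almost term for term into code. The following is a minimal Python sketch of that formula using the standard library's `math.lgamma`; the function name `binom_pmf_lgamma` is hypothetical, and the special handling of \(p = 0\) and \(p = 1\) (where a logarithm would be undefined) is an assumption of this sketch rather than documented calculator behavior.

```python
from math import exp, lgamma, log


def binom_pmf_lgamma(x: int, n: int, p: float) -> float:
    """Binomial PMF evaluated in log space with the log-gamma function."""
    if x < 0 or x > n:
        return 0.0
    # ln p or ln(1 - p) is undefined at the endpoints, so handle them directly
    # (an assumption of this sketch).
    if p == 0.0:
        return 1.0 if x == 0 else 0.0
    if p == 1.0:
        return 1.0 if x == n else 0.0
    log_f = (
        lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)  # ln of C(n, x)
        + x * log(p)
        + (n - x) * log(1 - p)
    )
    return exp(log_f)


if __name__ == "__main__":
    print(f"{binom_pmf_lgamma(5, 20, 0.25):.6f}")
```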
Worked example
For \(n = 20\), \(p = 0.25\), evaluating the PMF at \(x = 0..12\) gives, for the first six rows: \(f(0) \approx 0.003171\), \(f(1) \approx 0.021141\), \(f(2) \approx 0.066948\), \(f(3) \approx 0.133896\), \(f(4) \approx 0.189685\), and \(f(5) \approx 0.202331\). The peak occurs at \(x = 5\), which coincides with the mean $$np = 20 \times 0.25 = 5$$ as expected whenever \(np\) is an integer.
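The example can be checked independently of floating-point rounding. The sketch below recomputes the PMF for \(n = 20\), \(p = 0.25\) with exact rational arithmetic from Python's `fractions` module and locates the peak; the names `exact_pmf`, `values`, and `mode` are illustrative only.

```python
from fractions import Fraction
from math import comb

n = 20
p = Fraction(1, 4)  # p = 0.25 represented exactly


def exact_pmf(x: int) -> Fraction:
    """Exact rational value of f(x) for the example's n and p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Same range as the worked example: x = 0..12.
values = {x: exact_pmf(x) for x in range(0, 13)}
mode = max(values, key=values.get)  # x with the largest probability

if __name__ == "__main__":
    for x in range(6):
        print(f"f({x}) = {float(values[x]):.6f}")
    print("mode:", mode, "mean:", n * p)
```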
Definitions and glossary
- Trial: A single repetition of a random experiment with a fixed, defined set of outcomes.
- Bernoulli trial: A trial with exactly two mutually exclusive outcomes, conventionally labeled "success" and "failure."
- Success probability \(p\): The probability that a single trial results in a success, with \(0 \le p \le 1\). It is assumed constant across all trials.
- Number of trials \(n\): The fixed count of independent Bernoulli trials in the experiment, a non-negative integer.
- Successes \(x\): The observed number of successes among the \(n\) trials; \(x\) is an integer with \(0 \le x \le n\).
- PMF \(f(x)\): The probability mass function, giving the probability of exactly \(x\) successes: \(f(x)=\binom{n}{x}p^{x}(1-p)^{n-x}\).
- Lower cumulative \(P(X\le x)\): The cumulative distribution function, the probability of at most \(x\) successes: \(P(X\le x)=\sum_{k=0}^{x} f(k)\).
- Upper cumulative \(Q(X\ge x)\): The probability of at least \(x\) successes: \(Q(X\ge x)=\sum_{k=x}^{n} f(k)=1-P(X\le x-1)\).
- Binomial coefficient \(\binom{n}{x}\): The number of distinct ways to choose \(x\) successes from \(n\) trials, \(\binom{n}{x}=\dfrac{n!}{x!\,(n-x)!}\).
- Mean \(np\): The expected number of successes, \(\mu = np\).
- Variance \(np(1-p)\): The variance of the count of successes, \(\sigma^{2}=np(1-p)\); the standard deviation is \(\sigma=\sqrt{np(1-p)}\).
Interpreting your result
The three quantities answer three different questions about the same experiment:
- \(f(x)\) — exactly \(x\): the probability of obtaining precisely \(x\) successes and no other number. Use this for "exactly k" questions.
- \(P(X\le x)\) — at most \(x\): the probability that the number of successes does not exceed \(x\). Use this for "at most k," "no more than k," or "fewer than k+1" questions.
- \(Q(X\ge x)\) — at least \(x\): the probability of \(x\) or more successes. Use this for "at least k," "k or more," or "more than k−1" questions.
Mapping a real question to a function. Translate the wording carefully, watching the boundary:
- "At least \(k\)" \(\Rightarrow Q(X\ge k)\).
- "More than \(k\)" \(\Rightarrow Q(X\ge k+1) = 1 - P(X\le k)\).
- "At most \(k\)" \(\Rightarrow P(X\le k)\).
- "Fewer than \(k\)" \(\Rightarrow P(X\le k-1)\).
- "Between \(a\) and \(b\) inclusive" \(\Rightarrow P(X\le b) - P(X\le a-1)\).
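The wording rules above can be written as thin wrappers around the two cumulative functions. This is a sketch only: the helper names (`at_least`, `more_than`, `at_most`, `fewer_than`, `between`) are hypothetical and chosen to mirror the phrases in the list, and the direct formula is used, so it suits small n.

```python
from math import comb


def pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes; zero outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def lower(x: int, n: int, p: float) -> float:
    """P(X <= x): sum of f(k) for k = 0..x."""
    return sum(pmf(k, n, p) for k in range(0, min(x, n) + 1))


def upper(x: int, n: int, p: float) -> float:
    """Q(X >= x): sum of f(k) for k = x..n."""
    return sum(pmf(k, n, p) for k in range(max(x, 0), n + 1))


def at_least(k, n, p):       # "at least k"
    return upper(k, n, p)


def more_than(k, n, p):      # "more than k" shifts the boundary up by one
    return upper(k + 1, n, p)


def at_most(k, n, p):        # "at most k"
    return lower(k, n, p)


def fewer_than(k, n, p):     # "fewer than k" shifts the boundary down by one
    return lower(k - 1, n, p)


def between(a, b, n, p):     # "between a and b inclusive"
    return lower(b, n, p) - lower(a - 1, n, p)


if __name__ == "__main__":
    n, p = 20, 0.25
    print(f"more than 5:  {more_than(5, n, p):.6f}")
    print(f"between 3, 5: {between(3, 5, n, p):.6f}")
```

Keeping the off-by-one shifts inside named helpers makes the boundary choice visible at the call site.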
The \(P\)/\(Q\) overlap. Because both \(P(X\le x)\) and \(Q(X\ge x)\) include the term \(f(x)\), they are not complementary at the same \(x\). In fact \(P(X\le x) + Q(X\ge x) = 1 + f(x)\), so the two cumulative tails overlap by exactly one point mass. The true complement of \(Q(X\ge x)\) is \(P(X\le x-1)\), not \(P(X\le x)\).
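The identity follows in one line from the definitions of the two sums, for any integer \(x\) with \(0 \le x \le n\): the term \(f(x)\) appears in both, and everything else adds up to the full distribution. $$P(X\le x) + Q(X\ge x) = \sum_{k=0}^{x} f(k) + \sum_{k=x}^{n} f(k) = \sum_{k=0}^{n} f(k) + f(x) = 1 + f(x)$$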
Normal approximation. When both \(np\) and \(n(1-p)\) are reasonably large (a common rule of thumb is each \(\ge 5\), and ideally \(\ge 10\)), the binomial is well approximated by a normal distribution with mean \(\mu = np\) and standard deviation \(\sigma = \sqrt{np(1-p)}\). Apply a continuity correction when converting a discrete count to the continuous normal scale: \(P(X \le x) \approx \Phi\big((x + 0.5 - \mu)/\sigma\big)\) and \(Q(X \ge x) \approx 1 - \Phi\big((x - 0.5 - \mu)/\sigma\big)\), where \(\Phi\) is the standard normal CDF. For large \(n\) with small \(p\) (so that \(np\) stays moderate), the Poisson distribution with \(\lambda = np\) is usually the better approximation.
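To see how close the approximation gets, the sketch below compares the exact lower cumulative with a continuity-corrected normal approximation. It is an illustration, not part of the calculator: the function names are hypothetical, the standard normal CDF is built from `math.erf`, and the test case \(n = 100\), \(p = 0.5\), \(x = 55\) is an arbitrary choice that satisfies the rule of thumb.

```python
from math import comb, erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def exact_lower(x: int, n: int, p: float) -> float:
    """Exact P(X <= x) by summing the binomial PMF."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(0, x + 1))


def normal_lower(x: int, n: int, p: float) -> float:
    """Normal approximation to P(X <= x) with continuity correction (+0.5)."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return normal_cdf((x + 0.5 - mu) / sigma)


if __name__ == "__main__":
    n, p, x = 100, 0.5, 55
    print(f"exact:  {exact_lower(x, n, p):.4f}")
    print(f"normal: {normal_lower(x, n, p):.4f}")
```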
FAQ
Why does P(X ≤ x) + Q(X ≥ x) not equal 1? Both cumulatives include the point mass at \(x\), so \(P(X \le x) + Q(X \ge x) = 1 + f(x)\). This convention (both tails include x) is used here intentionally; for a true complement, pair \(Q(X \ge x)\) with \(P(X \le x-1)\).
What if x is outside 0..n? The PMF is 0 there; the lower cumulative clamps to 0 (\(x < 0\)) or 1 (\(x \ge n\)), and the upper clamps to 1 (\(x \le 0\)) or 0 (\(x > n\)).
Can I use large n? Yes. Log-gamma computation keeps the result stable for large n where direct factorials would overflow.
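The difference is easy to reproduce. In the Python sketch below, the direct formula fails at \(n = 5000\) because the binomial coefficient is too large to convert to a floating-point number, while the log-space version returns a sensible value. Both function names are hypothetical, and `pmf_log` assumes \(0 < p < 1\) and \(0 \le x \le n\) for brevity.

```python
from math import comb, exp, lgamma, log


def pmf_direct(x: int, n: int, p: float) -> float:
    """Textbook formula in floating point; breaks down for large n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def pmf_log(x: int, n: int, p: float) -> float:
    """Same quantity in log space; assumes 0 < p < 1 and 0 <= x <= n."""
    log_f = (
        lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
        + x * log(p)
        + (n - x) * log(1 - p)
    )
    return exp(log_f)


if __name__ == "__main__":
    n, x, p = 5000, 2500, 0.5
    try:
        print("direct:", pmf_direct(x, n, p))
    except OverflowError as err:
        # The huge integer coefficient cannot be converted to a float.
        print("direct formula failed:", err)
    print(f"log-gamma: {pmf_log(x, n, p):.6f}")
```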