# Multinomial distribution

| Property | Value |
| --- | --- |
| Parameters | ${\displaystyle n>0}$ number of trials (integer); ${\displaystyle p_{1},\ldots ,p_{k}}$ event probabilities (${\displaystyle \Sigma p_{i}=1}$) |
| Support | ${\displaystyle X_{i}\in \{0,\dots ,n\}}$, ${\displaystyle \Sigma X_{i}=n}$ |
| Probability mass function | ${\displaystyle {\frac {n!}{x_{1}!\cdots x_{k}!}}p_{1}^{x_{1}}\cdots p_{k}^{x_{k}}}$ |
| Mean | ${\displaystyle \operatorname {E} (X_{i})=np_{i}}$ |
| Variance | ${\displaystyle \operatorname {Var} (X_{i})=np_{i}(1-p_{i})}$, ${\displaystyle \operatorname {Cov} (X_{i},X_{j})=-np_{i}p_{j}}$ for ${\displaystyle i\neq j}$ |
| Moment-generating function | ${\displaystyle \left(\sum _{i=1}^{k}p_{i}e^{t_{i}}\right)^{n}}$ |

In probability theory, the multinomial distribution is a generalization of the binomial distribution.

The binomial distribution is the probability distribution of the number of "successes" in n independent Bernoulli trials, with the same probability of "success" on each trial. In a multinomial distribution, there are n independent trials, and each trial results in exactly one of a fixed finite number k of possible outcomes, with probabilities ${\displaystyle p_{1},\ldots ,p_{k}}$ (so that ${\displaystyle p_{i}\geq 0}$ for i = 1, ..., k and ${\displaystyle \sum _{i=1}^{k}p_{i}=1}$). Let the random variable ${\displaystyle X_{i}}$ denote the number of times outcome i is observed over the n trials. The vector ${\displaystyle X=(X_{1},\ldots ,X_{k})}$ then follows a multinomial distribution with parameters n and p, where ${\displaystyle p=(p_{1},\ldots ,p_{k})}$.
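The construction just described can be sketched as a small simulation: run n independent trials, each choosing one of k outcomes, and count how often each outcome occurs. The function name `sample_multinomial` and its `rng` argument are illustrative assumptions, not part of any standard API.

```python
import random
from collections import Counter

def sample_multinomial(n, p, rng=None):
    """Draw one count vector (X_1, ..., X_k) from n independent trials.

    p is a list of k outcome probabilities; rng is an optional
    random.Random instance (an assumption made here for reproducibility).
    """
    rng = rng or random.Random()
    k = len(p)
    # Each trial yields exactly one outcome index in 0..k-1, weighted by p.
    outcomes = rng.choices(range(k), weights=p, k=n)
    counts = Counter(outcomes)
    # Outcomes that never occurred get a count of zero.
    return [counts.get(i, 0) for i in range(k)]

x = sample_multinomial(10, [0.2, 0.3, 0.5], random.Random(0))
print(x, sum(x))  # the counts always sum to n
```

Whatever the seed, the returned counts are non-negative and sum to n, which is the constraint that defines the support.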

## Specification

### Probability mass function

The probability mass function of the multinomial distribution is:

${\displaystyle {\begin{aligned}f(x_{1},\ldots ,x_{k};n,p_{1},\ldots ,p_{k})&{}=\Pr(X_{1}=x_{1}{\mbox{ and }}\dots {\mbox{ and }}X_{k}=x_{k})\\\\&{}={\begin{cases}{\displaystyle {n! \over x_{1}!\cdots x_{k}!}p_{1}^{x_{1}}\cdots p_{k}^{x_{k}}},\quad &{\mbox{when }}\sum _{i=1}^{k}x_{i}=n\\\\0&{\mbox{otherwise,}}\end{cases}}\end{aligned}}}$

for non-negative integers ${\displaystyle x_{1},\ldots ,x_{k}}$.
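As a minimal sketch, the probability mass function can be evaluated directly from its definition. The name `multinomial_pmf` is chosen here for illustration; exact integer arithmetic is used for the multinomial coefficient, which is adequate for small n but not tuned for large inputs.

```python
from math import factorial, prod

def multinomial_pmf(x, n, p):
    """Evaluate f(x_1, ..., x_k; n, p_1, ..., p_k) from the definition."""
    if len(x) != len(p):
        raise ValueError("x and p must have the same length k")
    # The pmf is zero unless the counts are non-negative and sum to n.
    if any(xi < 0 for xi in x) or sum(x) != n:
        return 0.0
    # Multinomial coefficient n! / (x_1! ... x_k!), kept as an exact integer.
    coeff = factorial(n)
    for xi in x:
        coeff //= factorial(xi)
    return coeff * prod(pi ** xi for pi, xi in zip(p, x))

print(multinomial_pmf([2, 1], 3, [0.5, 0.5]))  # binomial case with k = 2: 0.375
```

With k = 2 the function reduces to the binomial probability, here 3 × 0.5³ = 0.375.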

## Properties

The expected value is

${\displaystyle \operatorname {E} (X_{i})=np_{i}.}$

The covariance matrix is as follows. Each diagonal entry is the variance of a binomially distributed random variable, and is therefore

${\displaystyle \operatorname {var} (X_{i})=np_{i}(1-p_{i}).}$

The off-diagonal entries are the covariances:

${\displaystyle \operatorname {cov} (X_{i},X_{j})=-np_{i}p_{j}}$

for i, j distinct.

All covariances are negative because, for fixed n, an increase in one component of a multinomial vector requires a decrease in another component.

The covariance matrix is a k × k nonnegative-definite matrix; its rank is k − 1 when every ${\displaystyle p_{i}}$ is positive.
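The mean vector and covariance matrix can be assembled from the formulas in this section. The sketch below uses a hypothetical helper `multinomial_mean_cov` and plain Python lists. It also prints the row sums of the matrix, which are all zero (up to rounding) because the components sum to the constant n; this is consistent with the matrix being singular.

```python
def multinomial_mean_cov(n, p):
    """Return (mean vector, covariance matrix) for parameters n and p."""
    k = len(p)
    mean = [n * pi for pi in p]
    cov = [
        [
            # Diagonal: binomial variance; off-diagonal: negative covariance.
            n * p[i] * (1 - p[i]) if i == j else -n * p[i] * p[j]
            for j in range(k)
        ]
        for i in range(k)
    ]
    return mean, cov

mean, cov = multinomial_mean_cov(10, [0.2, 0.3, 0.5])
print(mean)
print([sum(row) for row in cov])  # each row sums to (approximately) zero
```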

The off-diagonal entries of the corresponding correlation matrix are

${\displaystyle \rho (X_{i},X_{j})=-{\sqrt {\frac {p_{i}p_{j}}{(1-p_{i})(1-p_{j})}}}.}$

Note that the sample size drops out of this expression.
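To see why n cancels, divide the covariance by the product of the two standard deviations, using the variance and covariance formulas given earlier in this section (for i and j distinct, with ${\displaystyle 0<p_{i},p_{j}<1}$):

${\displaystyle \rho (X_{i},X_{j})={\frac {\operatorname {cov} (X_{i},X_{j})}{\sqrt {\operatorname {var} (X_{i})\operatorname {var} (X_{j})}}}={\frac {-np_{i}p_{j}}{n{\sqrt {p_{i}(1-p_{i})p_{j}(1-p_{j})}}}}=-{\sqrt {\frac {p_{i}p_{j}}{(1-p_{i})(1-p_{j})}}}.}$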

Each of the k components separately has a binomial distribution with parameters n and ${\displaystyle p_{i}}$, for the appropriate value of the subscript i.
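The binomial marginal can be checked numerically for a small case. The sketch below fixes k = 3, sums the joint probability mass function over the remaining components, and compares the result with the binomial probability. The helper names `pmf3`, `marginal_first` and `binomial_pmf` are illustrative assumptions.

```python
from math import comb, factorial

def pmf3(x, n, p):
    """Multinomial pmf for k = 3; assumes the counts in x sum to n."""
    x1, x2, x3 = x
    coeff = factorial(n) // (factorial(x1) * factorial(x2) * factorial(x3))
    return coeff * p[0] ** x1 * p[1] ** x2 * p[2] ** x3

def marginal_first(j, n, p):
    """P(X_1 = j): sum the joint pmf over x2; x3 is then determined."""
    return sum(pmf3((j, x2, n - j - x2), n, p) for x2 in range(n - j + 1))

def binomial_pmf(j, n, q):
    """Binomial probability of j successes in n trials with success rate q."""
    return comb(n, j) * q ** j * (1 - q) ** (n - j)

n, p = 4, [0.2, 0.3, 0.5]
for j in range(n + 1):
    # The two columns agree up to floating-point rounding.
    print(j, marginal_first(j, n, p), binomial_pmf(j, n, p[0]))
```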

The support of the multinomial distribution is the set

${\displaystyle \{(n_{1},\dots ,n_{k})\in \mathbb {N} ^{k}\mid n_{1}+\cdots +n_{k}=n\}.}$

Its number of elements is

${\displaystyle {n+k-1 \choose k-1}={n+k-1 \choose n},}$

the number of n-combinations of a multiset with k types, or multiset coefficient.
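For small n and k the support can be enumerated directly and its size compared with the binomial coefficient ${\displaystyle {\tbinom {n+k-1}{k-1}}}$. The recursive generator below is a minimal sketch; the name `support` is an assumption made for this example.

```python
from math import comb

def support(n, k):
    """Yield every k-tuple of non-negative integers that sums to n."""
    if k == 1:
        # With one component left, it must absorb the whole remainder.
        yield (n,)
        return
    for first in range(n + 1):
        for rest in support(n - first, k - 1):
            yield (first,) + rest

points = list(support(2, 3))
print(len(points), comb(2 + 3 - 1, 3 - 1))  # both are 6
```

For n = 2 and k = 3 the six tuples are the three permutations of (2, 0, 0) and the three permutations of (1, 1, 0).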