Probability Space

A probability space consists of three components $\Omega, \mathcal{F}, P$.

  • The component $\Omega$ is the sample space.

    Each element $\omega \in \Omega$ is called an outcome.

  • The component $\mathcal{F}$ is a set of subsets of $\Omega$ satisfying the following conditions:

    $$\left\{ \begin{array}{l} \Omega \in \mathcal{F}\,, \\ \text{if } A \in \mathcal{F}, \text{ then } \Omega - A \in \mathcal{F}\,, \\ \text{if } A_1, A_2, \cdots \in \mathcal{F}, \text{ then } \bigcup\limits_{i \in \mathbb{Z}^+} A_i \in \mathcal{F}\,. \end{array} \right.$$

  • The component $P$ is a probability measure on $\mathcal{F}$, satisfying

    $$\left\{ \begin{array}{l} P(A) \geqslant 0, \ \forall A \in \mathcal{F}\,, \\ P(\Omega) = 1\,, \\ \text{if } A_1, \cdots, A_k, \cdots \in \mathcal{F}, \ A_i \cap A_j = \emptyset \ (i \neq j), \text{ then } P\Big(\bigcup\limits_k A_k\Big) = \sum\limits_k P(A_k)\,. \end{array} \right.$$

Examples
  1. $\Omega = \{1, 2, \cdots, N\}$, $\mathcal{F} = \{\text{all subsets of } \Omega\}$,

     $P(A) = \displaystyle\frac{1}{N}\,(\text{cardinality of } A)$.

  2. $\Omega$ is the interior of the circle with center $(0,0)$ and radius $1$,

     $\mathcal{F} = \{\text{all measurable subsets of } \Omega\}$,

     $P(A) = \displaystyle\frac{\text{measure of } A}{\pi}$.
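
As a quick numerical aside (not part of the original notes), the following Python sketch illustrates Example 2 by Monte Carlo: outcomes are drawn uniformly from the unit disk and the relative frequency of a set $A$ approximates $P(A) = (\text{measure of } A)/\pi$. The half-disk $A$, the sample size, and the seed are arbitrary choices.

```python
import numpy as np

# Monte Carlo sketch of Example 2: Omega is the unit disk and
# P(A) = (measure of A) / pi.  We estimate P(A) for the half-disk
# A = {(x, y) in Omega : x > 0}, whose exact probability is 1/2.
rng = np.random.default_rng(0)

n = 200_000
# Sample points uniformly in the square [-1, 1]^2 and keep those inside the disk.
pts = rng.uniform(-1.0, 1.0, size=(n, 2))
inside = pts[np.sum(pts**2, axis=1) < 1.0]      # outcomes omega in Omega

# Relative frequency of A among the retained outcomes approximates P(A).
p_hat = np.mean(inside[:, 0] > 0.0)
print(f"estimated P(A) = {p_hat:.3f}   (exact value 0.5)")
```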

Random Variable

Let a probability space $(\Omega, \mathcal{F}, P)$ be given.

  • A random variable is a function $X$ from $\Omega$ to the real axis $\mathbb{R}$ such that $\{\omega \in \Omega : X(\omega) \leqslant C\} \in \mathcal{F}$ for every constant $C$.

  • The cumulative distribution function (CDF) is defined by $F(x) = P(X \leqslant x)$.

    It satisfies the following properties:

    1. $F$ is nondecreasing;

    2. $F(x) \to 1$ as $x \to +\infty$,

       $F(x) \to 0$ as $x \to -\infty$;

    3. $F(x)$ is right continuous.
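
To make these properties concrete, here is a small Python sketch (an illustration added to these notes, not part of them) that builds the empirical CDF of a simulated sample; the normal sample and the grid are arbitrary choices.

```python
import numpy as np

# The empirical CDF F_n(x) = #{X_i <= x} / n of any sample is nondecreasing and
# tends to 0 / 1 far to the left / right, mirroring properties 1 and 2 of F(x).
rng = np.random.default_rng(1)
sample = rng.normal(size=10_000)

xs = np.linspace(-5.0, 5.0, 201)
F = np.array([np.mean(sample <= x) for x in xs])

assert np.all(np.diff(F) >= 0.0)            # property 1: nondecreasing
print("F(-5) =", F[0], "  F(5) =", F[-1])   # property 2: close to 0 and 1
```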

Discrete Random Variables

An integer-valued random variable is called a discrete random variable.

Its cumulative distribution function is

$$F(x) = P(X \leqslant x) = \sum\limits_{k \leqslant x} p(k)\,,$$

where $x \in \mathbb{R}$ and $p(k) = P(X = k)$.

$\{p(k)\}$ is called the probability mass function (PMF). It is clear that $p(k) \geqslant 0$ and $\sum\limits_{k} p(k) = 1$.

Its expectation (mean) and variance are

$$E(X) = \sum\limits_k k\, p(k)\,,\quad Var(X) = \sum\limits_k (k-\mu)^2 p(k)\,,\quad \mu = E(X)\,.$$
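
The two sums above translate directly into code. The following Python sketch (an added illustration; the PMF of a fair six-sided die is an arbitrary choice) computes $E(X)$ and $Var(X)$ from a given PMF.

```python
import numpy as np

# E(X) and Var(X) computed from a probability mass function p(k),
# here a fair die on {1, ..., 6}.
k = np.arange(1, 7)
p = np.full(6, 1.0 / 6.0)                   # p(k) >= 0 and sum_k p(k) = 1

mean = np.sum(k * p)                        # E(X)   = sum_k k p(k)
var = np.sum((k - mean) ** 2 * p)           # Var(X) = sum_k (k - mu)^2 p(k)
print(mean, var)                            # 3.5 and 35/12 ≈ 2.9167
```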

Examples

Binomial random variable

$$p(k) = \begin{dcases} \frac{n!}{k!\,(n-k)!}(1-r)^{n-k}r^k & k = 0, 1, \cdots, n \\ 0 & \text{otherwise} \end{dcases}$$

$$E(X) = nr\,,\quad Var(X) = nr(1-r)\,.$$

Poisson random variable

$$p(k) = \begin{dcases} \frac{\lambda^k}{k!}e^{-\lambda} & k = 0, 1, 2, \cdots \\ 0 & \text{otherwise} \end{dcases}$$

$$E(X) = \lambda\,,\quad Var(X) = \lambda\,.$$
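
As a sanity check on both examples, this Python sketch (added here for illustration; the parameter values $n = 10$, $r = 0.3$, $\lambda = 4$ are arbitrary) recomputes the stated means and variances directly from the PMFs.

```python
import numpy as np
from math import comb, exp, factorial

n, r = 10, 0.3
lam = 4.0

# Binomial: p(k) = C(n, k) (1-r)^(n-k) r^k for k = 0, ..., n.
k = np.arange(n + 1)
p_bin = np.array([comb(n, int(j)) * (1 - r) ** (n - j) * r ** j for j in k])
print(np.sum(k * p_bin), n * r)                              # E(X)   = nr
print(np.sum((k - n * r) ** 2 * p_bin), n * r * (1 - r))     # Var(X) = nr(1-r)

# Poisson: p(k) = lam^k e^(-lam) / k!, truncated far in the tail.
k = np.arange(61)
p_poi = np.array([lam ** int(j) * exp(-lam) / factorial(int(j)) for j in k])
print(np.sum(k * p_poi), lam)                                # E(X)   = lam
print(np.sum((k - lam) ** 2 * p_poi), lam)                   # Var(X) = lam
```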

Continuous Random Variables

A real-valued random variable whose distribution function is continuous is called a continuous random variable.

  • Its cumulative distribution function (CDF) is $F(x) = P(X \leqslant x)$.

  • If there exists a non-negative integrable function $p(x)$ such that $F(x) = \displaystyle\int_{-\infty}^x p(t)\,dt$, then $p(x)$ is called the probability density function (PDF).

    If $p(x)$ is continuous, then $\displaystyle\frac{dF}{dx} = p(x)$.

  • $p(x)$ satisfies

    • $p(x) \geqslant 0$,
    • $\displaystyle\int_{\mathbb{R}} p(x)\,dx = 1\,,\quad P(x_1 < X \leqslant x_2) = \int_{x_1}^{x_2} p(t)\,dt\,.$
  • Its expectation (mean) is $E(X) = \displaystyle\int_{\mathbb{R}} x\, p(x)\,dx$

    and its variance is $Var(X) = \displaystyle\int_{\mathbb{R}} (x-\mu)^2 p(x)\,dx\,,\quad \mu = E(X)\,.$
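
The integrals defining $E(X)$ and $Var(X)$ can be approximated numerically. The sketch below (an added illustration; the exponential density with $\lambda = 2$ and the grid are arbitrary choices) uses a plain Riemann sum over a wide grid.

```python
import numpy as np

# E(X) and Var(X) for a continuous random variable via a Riemann sum,
# using the exponential PDF p(x) = lam * exp(-lam * x) on x >= 0.
lam = 2.0
x = np.linspace(0.0, 40.0, 200_001)
dx = x[1] - x[0]
p = lam * np.exp(-lam * x)                  # p(x) >= 0, integrates to 1

mean = np.sum(x * p) * dx                   # E(X)   = int x p(x) dx        -> 1/lam = 0.5
var = np.sum((x - mean) ** 2 * p) * dx      # Var(X) = int (x-mu)^2 p(x) dx -> 1/lam^2 = 0.25
print(mean, var)
```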

Examples

Uniform random variable

$$p(x) = \begin{dcases} \frac{1}{b-a} & a \leqslant x \leqslant b \\ 0 & \text{otherwise} \end{dcases}$$

$$E(X) = \frac{a+b}{2}\,,\quad Var(X) = \frac{1}{12}(b-a)^2\,.$$

Gaussian random variable

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

$$E(X) = \mu\,,\quad Var(X) = \sigma^2\,.$$

It is also called the normal random variable and is denoted by $N(\mu, \sigma^2)$.

Gamma random variable

$$p(x) = \frac{1}{\Gamma(\alpha)\, \beta^\alpha}\, x^{\alpha - 1} e^{-x/\beta}\,,\quad (0 < x < \infty,\ \alpha > 0,\ \beta > 0)$$

$$E(X) = \alpha\beta\,,\quad Var(X) = \alpha\beta^2\,.$$
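
A quick Monte Carlo cross-check of the three examples (added for illustration; all parameter values and the sample size are arbitrary choices): sample means and variances should land close to the stated formulas.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

u = rng.uniform(2.0, 5.0, n)                  # uniform with a = 2, b = 5
print(u.mean(), (2 + 5) / 2, u.var(), (5 - 2) ** 2 / 12)

g = rng.normal(1.0, 3.0, n)                   # Gaussian with mu = 1, sigma = 3
print(g.mean(), 1.0, g.var(), 3.0 ** 2)

x = rng.gamma(shape=2.5, scale=1.5, size=n)   # Gamma with alpha = 2.5, beta = 1.5
print(x.mean(), 2.5 * 1.5, x.var(), 2.5 * 1.5 ** 2)
```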

Properties of Expectation and Variance

The expectation of a function $f(X)$ of a random variable $X$ is

$$\begin{array}{ll} E(f(X)) = \displaystyle\sum\limits_k f(k)\, p(k)\,, & \text{discrete} \\ \\ E(f(X)) = \displaystyle\int_{\mathbb{R}} f(x)\, p(x)\,dx\,, & \text{continuous} \end{array}$$

If for arbitrary constants $a$ and $b$, $P(X \leqslant a, Y \leqslant b) = P(X \leqslant a)\, P(Y \leqslant b)$, then the random variables $X$ and $Y$ are called independent.

Property

Let $X$ and $Y$ be random variables and $c$ and $d$ be constants. Then,

  1. (Linearity) $E[cX+dY] = cE(X)+dE(Y)$
  2. (Schwarz inequality) $|E(XY)| \leqslant (E(X^2))^{1/2} (E(Y^2))^{1/2}$
  3. (Preservation of order) If $X \leqslant Y$ then $E(X) \leqslant E(Y)$.
  4. $Var(X) = EX^2 - (EX)^2$
  5. If $X$ and $Y$ are independent, then

$$\begin{array}{rcl} E[XY] &=& E[X]E[Y] \\ Var(X+Y) &=& Var(X) + Var(Y) \\ Var(X-Y) &=& Var(X) + Var(Y) \end{array}$$

Proof

$$\begin{array}{rl} Var(X+Y) &= E(X+Y)^2 - (E(X+Y))^2 \\ &= (EX^2+EY^2 + 2E(XY)) - ((EX)^2 + (EY)^2 + 2\,EX\,EY) \\ &= (EX^2+EY^2 + 2\,EX\,EY) - ((EX)^2 + (EY)^2 + 2\,EX\,EY) \\ &= (EX^2 - (EX)^2) + (EY^2 - (EY)^2) \\ &= Var(X) + Var(Y). \qquad \square \end{array}$$

Distribution of a function of a random variable

Suppose that a random variable $Y$ is a function of a random variable $X$, i.e. $Y = f(X)$.

Then the PDF of $Y$ can be determined by the PDF of $X$.

Examples

Given a random variable $Y = X^2$. If $X$ has a PDF $p_X(x)$, we need to find $p_Y(y)$.

Denote the distribution functions of $X$ and $Y$ by $F_X(x)$ and $F_Y(y)$, respectively.

For $y > 0$, $F_Y(y) = P(Y \leqslant y) = P(X^2 \leqslant y) = P(-\sqrt{y} \leqslant X \leqslant \sqrt{y}) = F_X(\sqrt{y}) - F_X(-\sqrt{y})$.

By $p_Y(y) = \displaystyle\frac{dF_Y(y)}{dy}$, it follows that

$$p_Y(y) = \begin{dcases} \frac{1}{2\sqrt{y}} \big( p_X(\sqrt{y}) + p_X(-\sqrt{y}) \big) & y > 0 \\ 0 & y \leqslant 0\,. \end{dcases}$$

When $X$ is a Gaussian random variable $N(0, 1)$, its PDF is $p_X(x) = \displaystyle\frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}$,

then $Y$ has the PDF

$$p_Y(y) = \begin{dcases} \frac{1}{\sqrt{2\pi}}\, y^{-\frac{1}{2}} e^{-\frac{y}{2}} & y > 0 \\ 0 & y \leqslant 0\,. \end{dcases}$$

This is just the $\chi^2$ distribution with one degree of freedom.
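
The derivation can be checked by simulation. The Python sketch below (an added illustration; sample size and binning are arbitrary) squares a standard normal sample and compares a histogram-based density estimate with the formula for $p_Y(y)$ just derived.

```python
import numpy as np

rng = np.random.default_rng(3)
y = rng.normal(size=500_000) ** 2            # Y = X^2 with X ~ N(0, 1)

# Histogram-based density estimate: counts / (sample size * bin width).
counts, edges = np.histogram(y, bins=60, range=(0.05, 6.0))
width = edges[1] - edges[0]
density_est = counts / (y.size * width)

centers = 0.5 * (edges[:-1] + edges[1:])
p_y = centers ** (-0.5) * np.exp(-centers / 2.0) / np.sqrt(2.0 * np.pi)

print(density_est[[5, 20, 40]])              # estimated density at a few bins
print(p_y[[5, 20, 40]])                      # derived chi^2_1 density there
```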

Theorem

Let a random variable $X$ have the PDF $p_X(x)$ and let $g(x)$ be a differentiable function on $\mathbb{R}$. If $g'(x) > 0$ for all $x \in \mathbb{R}$ and $a = \lim\limits_{x \to -\infty} g(x)$, $b = \lim\limits_{x \to +\infty} g(x)$, then $Y = g(X)$ has the PDF

$$p_Y(y) = \begin{dcases} p_X\big(h(y)\big)\, h'(y) & a < y < b \\ 0 & \text{otherwise} \end{dcases} \qquad h = g^{-1}\,.$$

Proof

$g(x)$ is strictly increasing on $\mathbb{R}$ with $a < g(x) < b$, so its inverse function exists: $y = g(x)$, $x = h(y) = g^{-1}(y)$.

$$\begin{array}{rll} F_Y(y) &= P(g(X) \leqslant y) = 1\,, & (y > b) \\ F_Y(y) &= P(g(X) \leqslant y) = 0\,, & (y < a) \\ F_Y(y) &= P(g(X) \leqslant y) \\ &= P(X \leqslant g^{-1}(y)) \\ &= P(X \leqslant h(y)) \\ &= F_X\big(h(y)\big)\,, & (a < y < b) \end{array}$$

Furthermore, $p_Y(y) = \displaystyle\frac{dF_Y(y)}{dy} \Rightarrow p_Y(y) = p_X(h(y))\, h'(y)$.
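
To see the theorem in action, the sketch below (an added illustration, not from the notes) takes $g(x) = e^x$ with $X \sim N(0,1)$, so $g' > 0$, $a = 0$, $b = +\infty$, $h(y) = \ln y$, $h'(y) = 1/y$, and the theorem predicts $p_Y(y) = p_X(\ln y)/y$ on $(0,\infty)$; the prediction is compared with a histogram of simulated values.

```python
import numpy as np

rng = np.random.default_rng(4)
y = np.exp(rng.normal(size=500_000))         # Y = g(X) = exp(X), X ~ N(0, 1)

counts, edges = np.histogram(y, bins=60, range=(0.1, 6.0))
width = edges[1] - edges[0]
density_est = counts / (y.size * width)      # histogram density estimate

centers = 0.5 * (edges[:-1] + edges[1:])
# Theorem: p_Y(y) = p_X(h(y)) h'(y) with h(y) = log(y), h'(y) = 1/y.
p_theorem = np.exp(-np.log(centers) ** 2 / 2.0) / (np.sqrt(2.0 * np.pi) * centers)

print(density_est[[5, 20, 40]])
print(p_theorem[[5, 20, 40]])                # the two rows should nearly match
```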

Characteristic function

The characteristic function of a random variable $X$ is defined by

$$\Phi_X(t) = E(e^{itX}) = \int_{\mathbb{R}} e^{itx}\,dF_X(x) \left(= \int_{\mathbb{R}} e^{itx}\, p_X(x)\,dx \ \text{when } X \text{ has a PDF } p_X\right)$$

Example

Let $X$ be an exponential random variable with PDF

$$p(x) = \begin{dcases} \lambda e^{-\lambda x}\,, & x \geqslant 0 \\ 0\,, & x < 0 \end{dcases}$$

Its characteristic function is

$$\Phi_X(t) = \int_{\mathbb{R}} p(x)\, e^{itx}\,dx = \lambda \int_0^{\infty} e^{-(\lambda - it)x}\,dx = \frac{\lambda}{\lambda - it}\,.$$
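
The closed form can be verified numerically. The sketch below (an added check; $\lambda = 1.5$, $t = 0.7$ and the grid are arbitrary choices) approximates the defining integral by a Riemann sum and compares it with $\lambda/(\lambda - it)$.

```python
import numpy as np

lam, t = 1.5, 0.7
x = np.linspace(0.0, 60.0, 600_001)
dx = x[1] - x[0]

# Phi_X(t) = int_0^inf lam e^{-lam x} e^{i t x} dx, approximated on a finite grid.
numeric = np.sum(lam * np.exp(-lam * x) * np.exp(1j * t * x)) * dx
closed_form = lam / (lam - 1j * t)
print(numeric)
print(closed_form)                           # the two values agree closely
```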

Example

Let $X$ be a normal random variable with mean $\mu$ and variance $\sigma^2$.

Then its characteristic function is $\Phi_X(t) = e^{i\mu t - \frac{1}{2}t^2\sigma^2}$.

Theorem

Let $X$ and $Y$ be two independent random variables with PDFs $p(x)$ and $q(y)$. Let $Z = X + Y$. Then

  1. $\Phi_Z(t) = \Phi_X(t)\, \Phi_Y(t)$;
  2. the PDF of $Z$ is $r(z) = (p*q)(z) = \displaystyle\int_{\mathbb{R}} p(z-y)\, q(y)\,dy$.

Property

Let $W = cX$, where $c > 0$ is a constant. Then

$$P(a<W<b) = P\left(\frac{a}{c} < X < \frac{b}{c}\right) = \int_{a/c}^{b/c} p(x)\,dx = \int_a^b \frac{1}{c}\, p\!\left(\frac{x}{c}\right)dx$$

This implies that $W$ has the PDF $\beta(x) = \displaystyle\frac{1}{c}\, p\!\left(\frac{x}{c}\right)$, and hence

$$\Phi_W(t) = E(e^{itW}) = \int_{\mathbb{R}} e^{itx}\, \frac{1}{c}\, p\!\left(\frac{x}{c}\right)dx = \int_{\mathbb{R}} e^{ictx}\, p(x)\,dx = \Phi_X(ct)$$
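
A brief empirical check of the scaling property (added for illustration; the exponential sample with $\lambda = 1$ and the values of $c$ and $t$ are arbitrary): the sample average of $e^{itW}$ is compared with $\Phi_X(ct)$ computed from the closed form found earlier.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.exponential(scale=1.0, size=1_000_000)   # X exponential with lambda = 1
c, t = 2.5, 0.4

lhs = np.mean(np.exp(1j * t * c * x))        # estimate of Phi_W(t) = E[e^{itW}], W = cX
rhs = 1.0 / (1.0 - 1j * c * t)               # Phi_X(ct) = lambda / (lambda - i c t)
print(lhs)
print(rhs)
```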

Theorem

Let the random variable $X_1$ be $N(\mu_1, \sigma_1^2)$ and $X_2$ be $N(\mu_2, \sigma_2^2)$.

If $X_1$ and $X_2$ are independent, then the random variable $X = X_1 + X_2$ is $N(\mu, \sigma^2)$ with

$$\mu = \mu_1 + \mu_2\,,\quad \sigma^2 = \sigma_1^2 + \sigma_2^2\,.$$

Proof

$$\begin{array}{l} \Phi_X(t) = \Phi_{X_1}(t)\, \Phi_{X_2}(t) \\ \Phi_{X_1}(t) = e^{i\mu_1 t - t^2\sigma_1^2/2} \\ \Phi_{X_2}(t) = e^{i\mu_2 t - t^2\sigma_2^2/2} \\ \Phi_X(t) = e^{i(\mu_1+\mu_2)t - t^2(\sigma_1^2 + \sigma_2^2)/2} \end{array}$$

This implies that $X \sim N(\mu_1 + \mu_2,\ \sigma_1^2+\sigma_2^2)$.
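
A Monte Carlo illustration of the theorem (added here; the parameter values are arbitrary): the sum of independent $N(1, 2^2)$ and $N(-3, 1.5^2)$ samples should have mean $1 + (-3)$ and variance $2^2 + 1.5^2$.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 1_000_000
x = rng.normal(1.0, 2.0, n) + rng.normal(-3.0, 1.5, n)   # X = X1 + X2, independent

print(x.mean(), 1.0 + (-3.0))                # sample mean vs mu1 + mu2
print(x.var(), 2.0 ** 2 + 1.5 ** 2)          # sample variance vs sigma1^2 + sigma2^2
```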

Corollary

If $X_1, \cdots, X_n$ are independent Gaussian random variables, then any linear combination $a_1X_1+\cdots+a_nX_n$ is a Gaussian random variable.

Jointly distributed random variables

  • The joint distribution function of two random variables $X$ and $Y$ is defined by

    $$F(x,y) = P(X \leqslant x, Y \leqslant y)$$

Marginal distribution functions

$$F_X(x) = F(x,+\infty)\,,\qquad F_Y(y) = F(+\infty,y)$$

  • If $X$ and $Y$ are independent,

    $$F(x,y) = F_X(x)\, F_Y(y)$$

Let $X$ and $Y$ be discrete random variables with $P(X = k) = p(k)$, $P(Y = k) = g(k)$. The joint probability mass function (PMF) of $X$ and $Y$ is defined by

$$\gamma(k,l) = P(X = k, Y = l)$$

Property

$$\sum\limits_l \gamma(k,l) = p(k)\,,\qquad \sum\limits_k \gamma(k,l) = g(l)$$

In fact,

$$\sum\limits_l \gamma(k,l) = \sum\limits_l P(X = k, Y = l) = P(X = k,\ Y \in \mathbb{Z}) = p(k)$$

If $X$ and $Y$ are independent, then

$$P(X = k, Y = l) = P(X = k)\, P(Y = l)\,,\quad \text{i.e.}\quad \gamma(k,l) = p(k)\, g(l)$$
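
A small numerical illustration (added to the notes; the joint PMF table is a made-up example): the marginals are obtained by summing the joint PMF over one index, and comparing the table with the outer product of its marginals tests the factorization $\gamma(k,l) = p(k)\,g(l)$.

```python
import numpy as np

# A hypothetical joint PMF gamma(k, l) on {0, 1, 2} x {0, 1};
# the entries are nonnegative and sum to 1.
gamma = np.array([[0.10, 0.10],
                  [0.30, 0.20],
                  [0.05, 0.25]])

p = gamma.sum(axis=1)                        # sum_l gamma(k, l) = p(k)
g = gamma.sum(axis=0)                        # sum_k gamma(k, l) = g(l)
print(p, g)

# Independence would mean gamma(k, l) = p(k) g(l); here it fails.
print(np.allclose(gamma, np.outer(p, g)))    # False: X and Y are dependent
```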

Consider two continuous random variables $X$ and $Y$ with PDFs $p(x)$ and $g(y)$.

If a non-negative integrable function $\gamma(x,y)$ exists such that $F(x,y) = \displaystyle\int_{-\infty}^{x}\!\int_{-\infty}^{y} \gamma(s,t)\,dt\,ds$,

then $\gamma(x,y)$ is called the joint PDF of $X$ and $Y$.

Given $Y=y$, the conditional PDF of $X$ is

$$\gamma(x|y) = \frac{\gamma(x,y)}{g(y)}$$

Let $u(X)$ be a function of $X$.

Given $Y=y$, the conditional expectation of $u(X)$ is defined as

$$E[u(X)\,|\,y] = \int_{\mathbb{R}} u(x)\, \gamma(x|y)\,dx$$

In particular,

$$E[X\,|\,y] = \int_{\mathbb{R}} x\, \gamma(x|y)\,dx$$

Property

Let $X$ and $Y$ have joint PDF $\gamma(x,y)$. Then $p(x) = \displaystyle\int_{\mathbb{R}} \gamma(x,y)\,dy$ and $g(y) = \displaystyle\int_{\mathbb{R}} \gamma(x,y)\,dx$.

If $X$ and $Y$ are independent, then their joint PDF is $\gamma(x,y) = p(x)\, g(y)$.

The covariance of two random variables is $Cov(X,Y) = E[(X-EX)(Y-EY)] = E(XY) - EX\, EY$.

Property

Let $X, Y, Z$ be random variables and $c, d$ be constants. Then

$$Cov(c,X) = 0\,,\qquad Cov(cX+dY,\,Z) = c\,Cov(X,Z) + d\,Cov(Y,Z)$$

If $X$ and $Y$ are independent, then $Cov(X,Y) = 0$.
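
An empirical sketch of these covariance properties (added for illustration; the simulated variables and constants are arbitrary choices), using sample covariances in place of the exact ones.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1_000_000
x, y, z = rng.normal(size=(3, n))            # independent standard normals
z = z + 0.5 * x                              # make Z correlated with X

def cov(a, b):
    """Sample covariance: mean of the product of centered values."""
    return np.mean((a - a.mean()) * (b - b.mean()))

c, d = 2.0, -3.0
print(cov(c * x + d * y, z))                 # bilinearity: equals the next line
print(c * cov(x, z) + d * cov(y, z))
print(cov(x, y))                             # ~0: X and Y were generated independently
```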

Theorem

Let two random variables $X_1$ and $X_2$ have the joint density function $p_{X_1X_2}(x_1,x_2)$.

Denote $A = \{(x_1,x_2)\in\mathbb{R}^2 \mid p_{X_1X_2}(x_1, x_2) \neq 0 \}$.

Suppose two bivariate differentiable functions $g_1(x_1, x_2)$, $g_2(x_1,x_2)$ define a one-to-one transform

$$U:\quad y_1 = g_1(x_1, x_2)\,,\quad y_2 = g_2(x_1,x_2)$$

Denote the inverse transform by $U^{-1}: x_1 = h_1(y_1,y_2)\,,\ x_2 = h_2(y_1,y_2)$ and let $B = U(A)$ be the image of $A$ under $U$.

Then $Y_1 = g_1(X_1,X_2)$, $Y_2 = g_2(X_1,X_2)$ have the joint PDF

$$p_{Y_1Y_2}(y_1,y_2) = \begin{dcases} p_{X_1X_2}\big(h_1(y_1,y_2),\, h_2(y_1,y_2)\big)\, |J(y_1,y_2)| & (y_1,y_2)\in B \\ 0 & \text{otherwise} \end{dcases}$$

where $J(y_1,y_2) = \dfrac{\partial(h_1,h_2)}{\partial(y_1,y_2)}$ is the Jacobian determinant of the inverse transform.

The joint distribution function of random variables $X_1, \cdots, X_n$ is defined as $F(x_1, \cdots, x_n) = P(X_1 \leqslant x_1, \cdots, X_n \leqslant x_n)$.

If each $X_k$ is discrete, then their joint PMF is $p(k_1,\cdots, k_n) = P(X_1 = k_1, \cdots, X_n = k_n)$.

If each $X_k$ is continuous, then there exists $p(x_1, \cdots, x_n)$ such that $F(x_1,\cdots, x_n) = \displaystyle\int_{-\infty}^{x_1}\cdots \int_{-\infty}^{x_n} p(t_1,\cdots , t_n)\, dt_1\cdots dt_n$.

$p(x_1,\cdots,x_n)$ is the joint PDF of $X_1, \cdots, X_n$.

Suppose that $X_1, \cdots, X_n$ have joint PDF $\gamma(x_1,\cdots,x_n)$.

The conditional PDF is $\gamma(x_1, \cdots, x_m \mid x_{m+1},\cdots, x_n) = \displaystyle\frac{\gamma(x_1, \cdots, x_n)}{\gamma(x_{m+1}, \cdots, x_n)}$, where $\gamma(x_{m+1}, \cdots, x_n)$ denotes the marginal PDF of $X_{m+1}, \cdots, X_n$.

Take the transform $Y_1 = g_1(X_1,\cdots,X_n)\,,\ \cdots,\ Y_n = g_n(X_1,\cdots,X_n)$ with inverse transform $X_1 = h_1(Y_1,\cdots,Y_n)\,,\ \cdots,\ X_n = h_n(Y_1,\cdots,Y_n)$.

The joint PDF of $Y_1, \cdots, Y_n$ is $p_Y(y_1, \cdots, y_n) = p_X\big(h_1(y_1,\cdots,y_n), \cdots, h_n(y_1,\cdots,y_n)\big)\, |J(y_1, \cdots, y_n)|$, where $J$ is the Jacobian determinant of the inverse transform.

Central Limit Theorem

Definition

Let $\{X_n\}_{n\in\mathbb{Z}^{+}}$ be a sequence of random variables and $X$ be a random variable.

  1. $\{X_n\}_{n\in\mathbb{Z}^{+}}$ converges to $X$ in probability if

    $$\forall \varepsilon > 0\,,\quad \lim\limits_{n\rightarrow + \infty} P(\lvert X_n - X \rvert \geqslant \varepsilon) = 0$$

    Denote this by $X_n\stackrel{p}\longrightarrow X$.

  2. $\{X_n\}_{n\in\mathbb{Z}^{+}}$ converges to $X$ in the mean square sense if

    $$E[X_n^2] < +\infty\ \text{and}\ \lim\limits_{n\rightarrow + \infty} E[\lvert X_n - X \rvert^2] = 0$$

    Denote this by $X_n\stackrel{m.s.}\longrightarrow X$.

  3. $\{X_n\}_{n\in\mathbb{Z}^{+}}$ converges to $X$ in distribution if

    $$\lim\limits_{n\rightarrow + \infty} F_{X_n}(x) = F_X(x) \quad \text{at every point } x \text{ where } F_X \text{ is continuous}$$

    Denote this by $X_n\stackrel{d}\longrightarrow X$.

Property

If $X_n \stackrel{m.s.}\longrightarrow X$, then $X_n \stackrel{p}\longrightarrow X$.

If $X_n \stackrel{p}\longrightarrow X$, then $X_n \stackrel{d}\longrightarrow X$.

Property

A sequence $\{X_n\}_{n \in \mathbb{Z}^+}$ of random variables converges to a random variable $X$ in distribution if and only if their characteristic functions satisfy $\Phi_{X_n}(t) \rightarrow \Phi_{X}(t)$ for every $t \in \mathbb{R}$.

Theorem

Suppose that $\{X_n\}_{n \in \mathbb{Z}^+}$ is a sequence of independent and identically distributed (i.i.d.) random variables and each $X_n$ has expectation $\mu$ and variance $\sigma^2$.

Let $S_n = \sum\limits_{k = 1}^{n} X_k$. Then the sequence of random variables $\displaystyle\frac{S_n - n\mu}{\sqrt{n}}$ converges in distribution to a Gaussian random variable $X \sim N(0,\sigma^2)$.
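
A Monte Carlo illustration of the theorem (added to these notes; the exponential choice of $X_k$ with $\mu = \sigma^2 = 1$, and the values of $n$ and the number of trials, are arbitrary): the normalized sums should look like samples from $N(0, \sigma^2)$.

```python
import numpy as np

rng = np.random.default_rng(8)
n, trials = 400, 20_000

# X_k i.i.d. exponential(lambda = 1), so mu = 1 and sigma^2 = 1.
s = rng.exponential(scale=1.0, size=(trials, n)).sum(axis=1)   # S_n for each trial
z = (s - n * 1.0) / np.sqrt(n)                                 # (S_n - n mu) / sqrt(n)

print(z.mean(), z.var())                         # close to 0 and sigma^2 = 1
q = [0.1, 0.25, 0.5, 0.75, 0.9]
print(np.quantile(z, q))                         # quantiles of the normalized sums
print(np.quantile(rng.normal(size=trials), q))   # quantiles of N(0, 1) for comparison
```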