muchen 牧辰

Discrete Random Vectors

Updated 2017-11-30

We are interested in the joint behavior of multiple random variables. An example of a random vector arises from rolling two dice and looking at the joint behavior of two random variables:

$$\mathbf{X} = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = \begin{pmatrix} \text{sum of points} \\ \text{difference of points} \end{pmatrix}$$

As with random variables, random vectors can be continuous or discrete.

Discrete Random Vectors

Joint Probability Mass Function

The probability for a realization of all considered random variables.

$$f(x_1, x_2, \ldots, x_m) = P(X_1 = x_1, X_2 = x_2, \ldots, X_m = x_m)$$

Example: consider the random vector from above (rolling two dice).

Out of all 36 possibilities, the sum of points being 6 has 5 possible outcomes: (1, 5), (2, 4), (3, 3), (4, 2), (5, 1). Thus the probability of $X_1 = 6$ is $\frac{5}{36}$. Then out of those outcomes, only (3, 3) has absolute difference 0, so $P(X_2 = 0 \mid X_1 = 6) = \frac{1}{5}$.

Therefore the joint probability is $P(X_1 = 6, X_2 = 0) = \frac{5}{36} \cdot \frac{1}{5} = \frac{1}{36}$.
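This calculation can be checked by brute-force enumeration. A minimal sketch (the variable names are mine, not from the notes):

```python
from fractions import Fraction
from itertools import product

# Enumerate the two-dice example: X1 = sum, X2 = |difference|.
outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely rolls

sum6 = [(a, b) for a, b in outcomes if a + b == 6]
p_x1_6 = Fraction(len(sum6), 36)                  # P(X1 = 6)
p_x2_0_given = Fraction(sum(1 for a, b in sum6 if abs(a - b) == 0), len(sum6))
p_joint = p_x1_6 * p_x2_0_given                   # P(X1 = 6, X2 = 0)

print(p_x1_6, p_x2_0_given, p_joint)  # 5/36 1/5 1/36
```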

Marginal Densities

A PMF that only takes interest in a single random variable within a random vector.

For instance, if the random vector has 2 elements, then $f(x_1, x_2)$ is the discrete joint PMF.

Then the marginal densities are the PMFs with the other variables (not of interest) summed out.

$$f_1(x_1) = \sum_{x_2} f(x_1, x_2) \qquad f_2(x_2) = \sum_{x_1} f(x_1, x_2)$$
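The marginals for the dice example can be computed directly from the joint PMF. A sketch (my own helper names, assuming the sum/|difference| vector from above):

```python
from fractions import Fraction
from itertools import product
from collections import defaultdict

# Joint PMF of (X1, X2) = (sum, |difference|) for two fair dice.
joint = defaultdict(Fraction)
for a, b in product(range(1, 7), repeat=2):
    joint[(a + b, abs(a - b))] += Fraction(1, 36)

# Marginals: sum the joint PMF over the variable that is not of interest.
f1 = defaultdict(Fraction)  # f1(x1) = sum over x2
f2 = defaultdict(Fraction)  # f2(x2) = sum over x1
for (x1, x2), p in joint.items():
    f1[x1] += p
    f2[x2] += p

print(f1[7])  # P(X1 = 7) = 1/6
print(f2[0])  # P(X2 = 0) = 1/6
```

Both marginals sum to 1, as any PMF must.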

Mean of Random Vector

The mean/expected value of a random vector is just the expected value applied to each individual random variable.

$$E(\mathbf{X}) = E\begin{pmatrix} X_1 \\ \vdots \\ X_m \end{pmatrix} = \begin{pmatrix} E(X_1) \\ \vdots \\ E(X_m) \end{pmatrix} = \boldsymbol{\mu}$$

Covariance

The variance of a random vector is described by the covariances of its elements, collected in a covariance matrix.

$$\mathrm{Cov}(\mathbf{X}) = E\left[(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})^T\right]$$

Where $\mathbf{X} - \boldsymbol{\mu}$ is the random vector minus its mean, and $(\mathbf{X} - \boldsymbol{\mu})^T$ is its transpose. This outer product returns a matrix:

$$\mathrm{Cov}(\mathbf{X}) = \begin{bmatrix}
\sigma_{11} & \sigma_{12} & \sigma_{13} & \cdots & \sigma_{1m} \\
\sigma_{21} & \sigma_{22} & \sigma_{23} & \cdots & \sigma_{2m} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sigma_{m1} & \sigma_{m2} & \sigma_{m3} & \cdots & \sigma_{mm}
\end{bmatrix}$$

Where for any i,

$$\sigma_{ii} = E\left[(X_i - \mu_i)^2\right] = \mathrm{Var}(X_i)$$

And for any i and j,

$$\sigma_{ij} = E\left[(X_i - \mu_i)(X_j - \mu_j)\right] = \mathrm{Cov}(X_i, X_j)$$

The covariance of two individual random variables is

$$\mathrm{Cov}(X_i, X_j) = E\left[(X_i - \mu_i)(X_j - \mu_j)\right] = E[X_i X_j] - E[X_i]E[X_j]$$

Computing Covariance

$$\begin{aligned}
\sigma_{ij} &= E\left[(X_i - \mu_i)(X_j - \mu_j)\right] \\
&= E[X_i X_j] - E[X_i]\mu_j - E[X_j]\mu_i + \mu_i \mu_j \\
&= E[X_i X_j] - \mu_i \mu_j - \mu_i \mu_j + \mu_i \mu_j \\
&= E[X_i X_j] - \mu_i \mu_j
\end{aligned}$$
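A quick numeric check that the shortcut $E[X_i X_j] - \mu_i \mu_j$ agrees with the definition. The choice $X_i$ = first die, $X_j$ = sum of both dice is my own illustration, not from the notes:

```python
from fractions import Fraction
from itertools import product

# Compare the defining formula with the shortcut E[XiXj] - mu_i * mu_j.
rolls = list(product(range(1, 7), repeat=2))
p = Fraction(1, 36)

xi = [a for a, b in rolls]       # Xi = first die
xj = [a + b for a, b in rolls]   # Xj = sum of both dice

mu_i = sum(v * p for v in xi)    # 7/2
mu_j = sum(v * p for v in xj)    # 7

cov_def = sum((u - mu_i) * (v - mu_j) * p for u, v in zip(xi, xj))
cov_short = sum(u * v * p for u, v in zip(xi, xj)) - mu_i * mu_j

print(cov_def, cov_short)  # 35/12 35/12
```

Both routes give $\mathrm{Cov}(X_i, X_j) = \frac{35}{12}$, which is exactly $\mathrm{Var}$ of one die, as expected since the second die is independent of the first.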

Correlation Coefficient

It describes how correlated two random variables are. It is defined as follows.

$$\rho_{ij} = \frac{\sigma_{ij}}{\sqrt{\sigma_{ii}\,\sigma_{jj}}} = \frac{\mathrm{Cov}(X_i, X_j)}{\mathrm{SD}(X_i)\,\mathrm{SD}(X_j)}$$

Where $\rho_{ij} \in [-1, 1]$.
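As an illustration (my own example, not from the notes): the correlation between the first die $D_1$ and the sum $D_1 + D_2$ works out to $1/\sqrt{2} \approx 0.707$, since the sum shares exactly half its variance with each die.

```python
import math
from itertools import product

# Correlation between the first die D1 and the sum S = D1 + D2.
rolls = list(product(range(1, 7), repeat=2))
n = len(rolls)

d1 = [a for a, b in rolls]
s = [a + b for a, b in rolls]

mu_d1 = sum(d1) / n
mu_s = sum(s) / n

cov = sum((u - mu_d1) * (v - mu_s) for u, v in zip(d1, s)) / n
sd_d1 = math.sqrt(sum((u - mu_d1) ** 2 for u in d1) / n)
sd_s = math.sqrt(sum((v - mu_s) ** 2 for v in s) / n)

rho = cov / (sd_d1 * sd_s)
print(round(rho, 4))  # 0.7071
```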

Independent Random Variables

The random variables X1,X2,,Xm are independent from each other if and only if

$$f(x_1, x_2, \ldots, x_m) = f_1(x_1) f_2(x_2) \cdots f_m(x_m)$$

Consider random variables $X$ and $Y$; if they are independent, then

$$E(XY) = E(X)E(Y)$$

Covariance is 0 if they are independent.

$$\sigma_{XY} = E(XY) - E(X)E(Y) = 0$$

Note that independence of $X$ and $Y$ implies $\sigma_{XY} = 0$, but $\sigma_{XY} = 0$ does not imply that $X$ and $Y$ are independent.
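The two-dice vector (sum, |difference|) from the beginning of these notes is itself a counterexample for the reverse implication: its covariance is 0, yet knowing $X_1$ clearly changes the distribution of $X_2$. A sketch (variable names mine):

```python
from fractions import Fraction
from itertools import product

# X1 = sum, X2 = |difference| of two fair dice.
rolls = list(product(range(1, 7), repeat=2))
p = Fraction(1, 36)

x1 = [a + b for a, b in rolls]
x2 = [abs(a - b) for a, b in rolls]

mu1 = sum(v * p for v in x1)
mu2 = sum(v * p for v in x2)
cov = sum(u * v * p for u, v in zip(x1, x2)) - mu1 * mu2
print(cov)  # 0  -> uncorrelated

# But not independent: P(X2 = 5) = 1/18 > 0, while P(X1 = 2, X2 = 5) = 0,
# so the joint PMF cannot factor into the product of the marginals.
p_x2_5 = sum(p for v in x2 if v == 5)
p_joint = sum(p for u, v in zip(x1, x2) if u == 2 and v == 5)
print(p_x2_5, p_joint)  # 1/18 0
```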

Conditional PMF

Consider two random variables X1 and X2. Then the conditional PMF is:

$$f(x_2 | x_1) = \frac{f(x_1, x_2)}{f_1(x_1)}$$

Thus, it’s clear to see that

$$f(x_1, x_2) = f_1(x_1) f(x_2 | x_1) = f_2(x_2) f(x_1 | x_2)$$

Independence

If X1 and X2 are independent, then

f(x1,x2)=f1(x1)f2(x2)

It follows that

f(x2|x1)=f2(x2)

and similarly

f(x1|x2)=f1(x1)

Conditional Mean and Variance

Conditional Mean is the expected value of one random variable given the realization of another random variable.

$$\mu_{y|x} = E(Y | X = x) = \sum_y y \, f(y|x)$$

Conditional Variance:

$$\sigma^2_{y|x} = \mathrm{Var}(Y | X = x) = \sum_y (y - \mu_{y|x})^2 f(y|x) = E(Y^2 | X = x) - \mu_{y|x}^2$$
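Continuing the two-dice example, the conditional mean and variance of $X_2 = |{\text{difference}}|$ given $X_1 = 6$ can be computed directly (a sketch; names are mine):

```python
from fractions import Fraction

# Outcomes compatible with X1 = sum = 6.
given = [(a, b) for a in range(1, 7) for b in range(1, 7) if a + b == 6]
diffs = [abs(a - b) for a, b in given]   # realizations of X2 given X1 = 6
p = Fraction(1, len(given))              # conditional PMF is uniform here

mu = sum(d * p for d in diffs)                 # E(X2 | X1 = 6)
var = sum(d * d * p for d in diffs) - mu * mu  # E(X2^2 | X1 = 6) - mu^2

print(mu, var)  # 12/5 56/25
```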

Independence

If random variables $X$ and $Y$ are independent, then conditioning has no effect: $E(Y | X = x) = E(Y)$ and $\mathrm{Var}(Y | X = x) = \mathrm{Var}(Y)$.

Conditional Mean As A Function

Generally the conditional mean is expressed as $E(Y|X=x)$, which is a function of $x$. Thus, we can express it as:

h(x)=E(Y|X=x)

Since $x$ is only a realization of the random variable $X$, we can consider $h(X)$ as a random variable itself:

h(X)=E(Y|X)

Two Step Average

We may find the mean of a random variable in two steps, by averaging its conditional means.

E(Y)=E(E(Y|X))

More generally, for any function $g$, $E[g(X,Y)] = E\left[E[g(X,Y) \mid X]\right]$. It follows that

$$E[g(X,Y)] = \sum_x \left( \sum_y g(x,y) \, f(y|x) \right) f_X(x)$$
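The two-step average can be checked numerically. Choosing $X$ = first die and $Y$ = sum of two dice (an illustrative choice, not from the notes):

```python
from fractions import Fraction
from itertools import product
from collections import defaultdict

# Verify E(Y) = E(E(Y|X)) with X = first die, Y = sum of two dice.
rolls = list(product(range(1, 7), repeat=2))
p = Fraction(1, 36)

# Direct mean of Y.
e_y = sum((a + b) * p for a, b in rolls)

# Two steps: inner average E(Y | X = x), then average over X.
by_x = defaultdict(list)
for a, b in rolls:
    by_x[a].append(a + b)
e_y_two_step = sum(
    Fraction(sum(ys), len(ys)) * Fraction(1, 6) for ys in by_x.values()
)

print(e_y, e_y_two_step)  # 7 7
```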

Total Variance

Consider the following plot of two random variables $X$ and $Y$, where the red circles represent $E(Y|X=x)$ and the red whiskers represent the variance $\mathrm{Var}(Y|X=x)$. The blue circle is the overall mean $E(Y)$ and the blue whisker is the overall variance $\mathrm{Var}(Y)$.

In this case, the Total Variance is given by

$$\mathrm{Var}(Y) = \underbrace{E[\mathrm{Var}(Y|X)]}_{\text{unexplained variance}} + \underbrace{\mathrm{Var}[E(Y|X)]}_{\text{explained variance}}$$

In the example above, the unexplained variance is the within-group variance (the red whiskers). The explained variance is the variance of the conditional means themselves (the spread of the red circles).

The Percentage of Explained Variance is given by:

$$\frac{\text{Explained variance}}{\text{Total variance}} \times 100\%$$

This gives an idea of how good a prediction is. If the percentage of explained variance is close to 100%, then $X$ can be used for a good prediction of $Y$. On the other hand, a percentage near 0 means $X$ is of little use for prediction.

For the best prediction, use g(x)=E(Y|X=x).
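The decomposition can be verified numerically. With $X$ = first die and $Y$ = sum of two dice (my own illustrative choice), each term comes out to $\frac{35}{12}$, so the explained percentage is 50%:

```python
from fractions import Fraction
from itertools import product
from collections import defaultdict

# Verify Var(Y) = E[Var(Y|X)] + Var[E(Y|X)] with X = first die, Y = sum.
rolls = list(product(range(1, 7), repeat=2))
p = Fraction(1, 36)

e_y = sum((a + b) * p for a, b in rolls)
var_y = sum((a + b) ** 2 * p for a, b in rolls) - e_y ** 2

by_x = defaultdict(list)
for a, b in rolls:
    by_x[a].append(a + b)

cond_means, cond_vars = [], []
for ys in by_x.values():
    m = Fraction(sum(ys), len(ys))
    cond_means.append(m)
    cond_vars.append(Fraction(sum(y * y for y in ys), len(ys)) - m * m)

unexplained = sum(cond_vars) * Fraction(1, 6)        # E[Var(Y|X)]
mean_of_means = sum(cond_means) * Fraction(1, 6)
explained = sum(m * m for m in cond_means) * Fraction(1, 6) - mean_of_means ** 2

print(var_y, unexplained + explained)  # 35/6 35/6
```

This makes intuitive sense: the first die accounts for exactly half of the variance of the sum.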