F71SM STATISTICAL METHODS

5 MULTIVARIATE DISTRIBUTIONS AND LINEAR COMBINATIONS

5.1 Introduction — several random variables at once

The concepts and descriptions of random variables introduced in section 3 all extend to distributions of several random variables defined simultaneously on a joint sample space — these give us vector random variables, or multivariate distributions. In 2 dimensions we have a pair of r.v.s (X, Y) with cdf (cumulative distribution function) FX,Y(x, y), or just F(x, y), where F(x, y) = P(X ≤ x and Y ≤ y).

Discrete case: pmf (probability mass function) fX,Y(x, y), or just f(x, y), where f(x, y) = P(X = x, Y = y)

→ probabilities of events defined on the r.v.s are evaluated using double sums

Continuous case: pdf (probability density function) fX,Y(x, y), or just f(x, y)

→ probabilities of events defined on the r.v.s are evaluated using double integrals

f(x, y) = ∂²F(x, y)/∂x∂y,  F(x, y) = ∫_{−∞}^{y} ∫_{−∞}^{x} f(s, t) ds dt

5.2 Expectations, product moments, covariance, correlation

E[h(X, Y)] = Σx Σy h(x, y) f(x, y)  or  ∫∫ h(x, y) f(x, y) dx dy

Mean of X: μX = E[X] = Σx Σy x f(x, y)  or  ∫∫ x f(x, y) dx dy, and similarly for μY, E[X²], E[Y²], and the variances σ²X, σ²Y, and so on.

Product moments (about the origin): E[X^r Y^s] = Σx Σy x^r y^s f(x, y)  or  ∫∫ x^r y^s f(x, y) dx dy

Product moments (about the means): E[(X − μX)^r (Y − μY)^s] = Σx Σy (x − μX)^r (y − μY)^s f(x, y)  or  ∫∫ (x − μX)^r (y − μY)^s f(x, y) dx dy

The covariance between X and Y: Cov[X, Y] = E[(X − μX)(Y − μY)] = E[XY] − μX μY

The correlation coefficient between X and Y: ρXY = Corr[X, Y] = Cov[X, Y] / (σX σY)

Note: Cov[X, X] = Var[X]
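These moment definitions translate directly into code; a minimal Python sketch, using a small made-up joint pmf (the numbers here are illustrative, not from the notes):

```python
# Compute means, variances, covariance and correlation from a joint pmf.
# The pmf below is an arbitrary illustrative example.
from math import sqrt

pmf = {(0, 0): 0.2, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.6}  # f(x, y)

E = lambda h: sum(p * h(x, y) for (x, y), p in pmf.items())  # E[h(X, Y)]

mu_x, mu_y = E(lambda x, y: x), E(lambda x, y: y)
var_x = E(lambda x, y: x**2) - mu_x**2
var_y = E(lambda x, y: y**2) - mu_y**2
cov = E(lambda x, y: x * y) - mu_x * mu_y   # E[XY] - mu_X * mu_Y
rho = cov / sqrt(var_x * var_y)             # Corr[X, Y]
print(mu_x, mu_y, cov, rho)
```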

5.3 Association/linear relationships

The covariance Cov[X, Y] is a measure of the association between X and Y; that is, it indicates the strength of the linear relationship between X and Y (its sign also gives the direction of any relationship: positive or negative) — it is measured in the units of X times the units of Y.

Useful results: Cov[aX + b, cY + d] = ac Cov[X, Y]
Cov[X, Y + Z] = Cov[X, Y] + Cov[X, Z]

The correlation coefficient is a dimensionless measure of the strength of the association between X and Y ; it has no units of measurement and lies in the range −1 ≤ ρXY ≤ 1.

ρXY = 1 ⇔ perfect positive linear relationship, that is Y = a + bX with b > 0
ρXY = 0 ⇔ no linear relationship
ρXY = −1 ⇔ perfect negative linear relationship, that is Y = a + bX with b < 0

Change of units: if U = a + bX, V = c + dY where b, d > 0, then Corr[U, V] = Corr[X, Y]

Two r.v.s X, Y with Cov[X, Y] = 0 have ρXY = 0 and are said to be uncorrelated.
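The change-of-units result can be checked on simulated data; a sketch using numpy (the constants and the data-generating model are arbitrary choices):

```python
# Sample correlation is unchanged by U = a + bX, V = c + dY with b, d > 0.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.9 * x + 0.5 * rng.normal(size=200)   # positively correlated with x

r_xy = np.corrcoef(x, y)[0, 1]
u, v = 3 + 2 * x, -1 + 5 * y               # change of units, b = 2 > 0, d = 5 > 0
r_uv = np.corrcoef(u, v)[0, 1]
print(r_xy, r_uv)                          # equal up to rounding
```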

[Figure: simulated data (200 values in each case) from 2-d r.v.s with various correlations. Left: ρ = 0 (r = −0.095); centre: ρ = +0.9 (r = +0.881); right: ρ = −0.7 (r = −0.746). r is the sample correlation coefficient — the observed sample equivalent of ρ.]

5.4 Marginal distributions

The distribution of a single r.v. on its own in this context is called a marginal distribution.

Marginal distribution of X:

discrete case: fX(x) = Σy f(x, y) = P(X = x); continuous case: fX(x) = ∫_{−∞}^{∞} f(x, y) dy

Similarly for Y.

To find moments of X, or expectations of functions of X, we can use either the joint pmf/pdf or the marginal pmf/pdf, since, for example,

E[g(X)] = Σx Σy g(x) f(x, y) = Σx g(x) Σy f(x, y) = Σx g(x) fX(x)
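The marginalisation step can be sketched in code; again the joint pmf is an arbitrary illustrative example, and g is an arbitrary function:

```python
# Marginal pmf of X by summing the joint pmf over y, and a check that
# E[g(X)] agrees whether computed from the joint or the marginal pmf.
pmf = {(0, 0): 0.2, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.6}  # illustrative f(x, y)

xs = {x for (x, _) in pmf}
fX = {x: sum(p for (xx, _), p in pmf.items() if xx == x) for x in xs}

g = lambda x: x**2 + 1
via_joint = sum(p * g(x) for (x, _), p in pmf.items())
via_marginal = sum(fX[x] * g(x) for x in xs)
print(fX, via_joint, via_marginal)
```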

5.5 Conditional distributions

For Y given X = x: fY|x(y|x) = f(x, y) / fX(x), for x such that fX(x) ≠ 0

In the discrete case, fY|x(y|x) = P(Y = y | X = x)

(we can drop the subscript Y|x if the context is clear).

The conditional mean of Y given X = x is the mean of the conditional distribution, denoted E[Y|X = x] or just E[Y|x] or μY|x, given by E[Y|X = x] = Σy y f(y|x) or ∫_{−∞}^{∞} y f(y|x) dy. E[Y|X = x] is a function of x.

The conditional expectation of h(Y) given X = x is denoted E[h(Y)|X = x] or just E[h(Y)|x], given by E[h(Y)|X = x] = Σy h(y) f(y|x) or ∫_{−∞}^{∞} h(y) f(y|x) dy. A function of x.

The conditional variance of Y given X = x is the variance of the conditional distribution, denoted Var[Y|x] or σ²Y|x, given by σ²Y|x = E[(Y − μY|x)² | X = x] = E[Y² | X = x] − μ²Y|x

5.6 Independence

X and Y are independent random variables ⇔ fX,Y (x, y) = fX (x)fY (y) for all (x, y) within their range.

In this case:

For sets C and D, P(X ∈ C, Y ∈ D) = P(X ∈ C) P(Y ∈ D)

E[XY] = ∫∫ xy fX,Y(x, y) dx dy = ∫∫ xy fX(x) fY(y) dx dy = (∫ x fX(x) dx)(∫ y fY(y) dy) = E[X] E[Y]

⇒ Cov[X, Y] = 0 ⇒ Corr[X, Y] = 0.

So independence ⇒ zero correlation (note: the converse does not hold)
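A standard counterexample for the failed converse (not from the notes): X uniform on {−1, 0, 1} and Y = X². Then Y is completely determined by X, yet the two are uncorrelated.

```python
# Zero correlation does not imply independence:
# X uniform on {-1, 0, 1}, Y = X**2.
support = [-1, 0, 1]
pX = 1 / 3

E_X = sum(pX * x for x in support)           # 0
E_Y = sum(pX * x**2 for x in support)        # 2/3
E_XY = sum(pX * x * x**2 for x in support)   # E[X^3] = 0
cov = E_XY - E_X * E_Y                       # 0: uncorrelated

# Dependence: P(X=0, Y=0) = 1/3, but P(X=0)P(Y=0) = 1/3 * 1/3 = 1/9
print(cov)
```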

Worked Example 5.1 A fair coin is tossed three times. Let X be the number of heads in the first two tosses and let Y be the number of tails in all three tosses. (X, Y ) is discrete. The experiment has 8 equally likely outcomes, which are given below with the corresponding values of the variables:

Outcome: HHH   HHT   HTH   THH   HTT   THT   TTH   TTT
(x, y):  (2,0) (2,1) (1,1) (1,1) (1,2) (1,2) (0,2) (0,3)

The joint probability mass function and marginal distributions are as follows.

                 y
           0     1     2     3   | fX(x)
   x  0    0     0    1/8   1/8  |  1/4
      1    0    2/8   2/8    0   |  1/2
      2   1/8   1/8    0     0   |  1/4
   fY(y)  1/8   3/8   3/8   1/8  |

P(X = Y) = 1/4, P(X > Y) = 1/4

μX = 0 × 1/4 + 1 × 1/2 + 2 × 1/4 = 1, E[X²] = 0² × 1/4 + 1² × 1/2 + 2² × 1/4 = 3/2, σ²X = 3/2 − 1² = 1/2

Similarly μY = 3/2, σ²Y = 3/4

X ∼ b(2, 1/2), Y ∼ b(3, 1/2)

P(X = 0, Y = 0) = 0 whereas P(X = 0) P(Y = 0) = 1/4 × 1/8 = 1/32, so X and Y are not independent.

Joint moments: the product XY takes values 0, 1, 2 with probabilities 3/8, 2/8, 3/8 respectively, so

E[XY ] = 0 × 3/8 + 1 × 2/8 + 2 × 3/8 = 1 ⇒ Cov[X, Y ] = 1 − 1 × 3/2 = −1/2

⇒ correlation coefficient ρ = (−1/2) / √((1/2) × (3/4)) = −0.817 (note the negative correlation — higher values of X are associated with lower values of Y and vice versa).
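These numbers can be re-derived by brute-force enumeration of the eight equally likely outcomes; a sketch:

```python
# Re-derive the moments of Worked Example 5.1 by enumerating the
# 8 equally likely outcomes of three tosses of a fair coin.
from itertools import product
from math import sqrt

outcomes = list(product("HT", repeat=3))      # 8 outcomes, prob 1/8 each
X = [seq[:2].count("H") for seq in outcomes]  # heads in first two tosses
Y = [seq.count("T") for seq in outcomes]      # tails in all three tosses

n = len(outcomes)
E = lambda vals: sum(vals) / n
mu_x, mu_y = E(X), E(Y)                                  # 1 and 3/2
cov = E([x * y for x, y in zip(X, Y)]) - mu_x * mu_y     # -1/2
var_x = E([x * x for x in X]) - mu_x**2                  # 1/2
var_y = E([y * y for y in Y]) - mu_y**2                  # 3/4
rho = cov / sqrt(var_x * var_y)                          # about -0.817
print(mu_x, mu_y, cov, rho)
```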

Conditional distributions: Consider, for example, the distribution of Y given X = 1.

P(Y = 0 | X = 1) = 0, P(Y = 1 | X = 1) = (2/8)/(1/2) = 1/2, P(Y = 2 | X = 1) = 1/2, P(Y = 3 | X = 1) = 0

i.e. fY|x(1|1) = fY|x(2|1) = 1/2.

The mean of this conditional distribution is the conditional expectation

E[Y|X = 1] = 1 × P(Y = 1 | X = 1) + 2 × P(Y = 2 | X = 1) = 1 × 1/2 + 2 × 1/2 = 3/2.

Similarly E[Y|X = 0] = 5/2 and E[Y|X = 2] = 1/2
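The conditional means can be computed mechanically from the joint pmf of this example, dividing each joint probability by the marginal fX(x):

```python
# Conditional mean of Y given X = x from the joint pmf of
# Worked Example 5.1 (zero cells omitted from the dict).
pmf = {(0, 2): 1/8, (0, 3): 1/8,
       (1, 1): 2/8, (1, 2): 2/8,
       (2, 0): 1/8, (2, 1): 1/8}

def cond_mean_Y(x):
    fx = sum(p for (xx, _), p in pmf.items() if xx == x)        # marginal f_X(x)
    return sum(p / fx * y for (xx, y), p in pmf.items() if xx == x)

print(cond_mean_Y(0), cond_mean_Y(1), cond_mean_Y(2))  # 2.5 1.5 0.5
```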

Worked Example 5.2 Let (X, Y ) have joint pdf f(x, y) = e−(x+y), x > 0, y > 0.

The joint pdf factorises: f(x, y) = e−(x+y) = e−x · e−y = fX(x) fY(y) for all x, y > 0 (each factor being the pdf of an exp(1) r.v.),

so X and Y are independent.

It follows that E[XY ] = E[X]E[Y ] = 1 × 1 = 1
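A quick Monte Carlo check of this product-moment result (sample size and seed are arbitrary choices):

```python
# With X, Y independent exponential(1), E[XY] = E[X]E[Y] = 1.
import numpy as np

rng = np.random.default_rng(42)
x = rng.exponential(size=200_000)
y = rng.exponential(size=200_000)
print(np.mean(x * y))   # close to 1
```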

5.7 More than 2 random variables

The definitions and results above can be extended to cases in which we have 3 or more r.v.s. An important generalisation we require here is the definition of independence for a collection

of n r.v.s.

Let X = (X1, X2, …, Xn) be a collection of n r.v.s with joint pmf/pdf fX(x1, x2, …, xn) and with marginal pmfs/pdfs f1(x1), f2(x2), …, fn(xn).

Then the n r.v.s are independent ⇔ fX(x1, x2, …, xn) = f1(x1) f2(x2) ··· fn(xn) for all (x1, x2, …, xn) within their range. In this case:

• the variables are pairwise independent, that is, Xi and Xj are independent for i, j = 1, 2, …, n, i ≠ j

• E[g1(X1) g2(X2) ··· gn(Xn)] = E[g1(X1)] E[g2(X2)] ··· E[gn(Xn)]

• in particular, E[X1 X2 ··· Xn] = E[X1] E[X2] ··· E[Xn]

Note: pairwise independence does not imply joint independence.
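A standard counterexample (not from the notes) makes this concrete: let X1, X2 be fair coin flips and X3 = X1 XOR X2. Each pair is independent, but the triple is not, since X3 is determined by the other two.

```python
# Pairwise independence without joint independence:
# X1, X2 fair Bernoulli(1/2), X3 = X1 XOR X2.
from itertools import product

outcomes = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)]  # prob 1/4 each

def p(event):                       # P(event) over the 4 equally likely outcomes
    return sum(1 for o in outcomes if event(o)) / 4

# each pair independent: P(Xi=1, Xj=1) = 1/4 = P(Xi=1)P(Xj=1)
for i, j in [(0, 1), (0, 2), (1, 2)]:
    assert p(lambda o: o[i] == 1 and o[j] == 1) == 1 / 4

# but not jointly independent: P(X1=1, X2=1, X3=1) = 0, not 1/8
print(p(lambda o: o[0] == 1 and o[1] == 1 and o[2] == 1))
```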

5.8 Linear combinations of random variables

Mean and variance

E[aX + bY] = a E[X] + b E[Y]

Var[aX + bY] = Cov[aX + bY, aX + bY] = a² Var[X] + b² Var[Y] + 2ab Cov[X, Y]

X, Y uncorrelated ⇒ Var[aX + bY] = a² Var[X] + b² Var[Y]

More generally, for constants a1, a2, …, an (sums over i = 1, …, n):

E[Σi ai Xi] = Σi ai E[Xi]

Var[Σi ai Xi] = Σi ai² Var[Xi] + Σi Σ_{j≠i} ai aj Cov[Xi, Xj]

X1, X2, …, Xn independent ⇒ Var[Σi ai Xi] = Σi ai² Var[Xi]
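The variance formula for aX + bY can be verified numerically on the joint pmf of Worked Example 5.1 (the coefficients a = 2, b = −3 are arbitrary):

```python
# Check Var[aX + bY] = a^2 Var[X] + b^2 Var[Y] + 2ab Cov[X, Y]
# on the joint pmf of Worked Example 5.1.
pmf = {(0, 2): 1/8, (0, 3): 1/8, (1, 1): 2/8,
       (1, 2): 2/8, (2, 0): 1/8, (2, 1): 1/8}

E = lambda h: sum(p * h(x, y) for (x, y), p in pmf.items())
a, b = 2, -3

direct = E(lambda x, y: (a * x + b * y)**2) - E(lambda x, y: a * x + b * y)**2

var_x = E(lambda x, y: x * x) - E(lambda x, y: x)**2                  # 1/2
var_y = E(lambda x, y: y * y) - E(lambda x, y: y)**2                  # 3/4
cov = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)   # -1/2
formula = a * a * var_x + b * b * var_y + 2 * a * b * cov
print(direct, formula)
```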

Important special case:

E[X + Y] = E[X] + E[Y]

Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y]

X, Y uncorrelated ⇒ Var[X + Y] = Var[X] + Var[Y]

Pgfs: X, Y independent, S = X + Y ⇒ GS(t) = GX(t) GY(t) (extends to n r.v.s)

Mgfs: X, Y independent, S = X + Y ⇒ MS(t) = MX(t) MY(t) (extends to n r.v.s)

In the case that X, Y are independent r.v.s with probability mass functions fX, fY respectively, we have

fX+Y(s) = P(X + Y = s) = Σx P(X = x, Y = s − x) = Σx fX(x) fY(s − x) = Σy fX(s − y) fY(y)

The mass function of X + Y is called the convolution of the mass functions of X and Y . The concept extends to the sum of n independent r.v.s.
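The convolution sum can be sketched directly; here the two pmfs are those of independent fair dice (an illustrative choice, not from the notes):

```python
# The pmf of X + Y as a convolution: two independent fair dice.
from fractions import Fraction
from itertools import product

die = {k: Fraction(1, 6) for k in range(1, 7)}   # pmf of one die

# convolution: f_{X+Y}(s) = sum over x of f_X(x) * f_Y(s - x)
conv = {}
for x, y in product(die, die):
    conv[x + y] = conv.get(x + y, Fraction(0)) + die[x] * die[y]

print(conv[7])   # six of the 36 outcomes sum to 7
```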

Standard distributions:

• X ∼ b(n, p), Y ∼ b(m, p) with X, Y independent ⇒ X + Y ∼ b(n + m, p)

• X ∼ P(λ1), Y ∼ P(λ2) with X, Y independent ⇒ X + Y ∼ P(λ1 + λ2)

• X, Y ∼ exp(λ) with X, Y independent ⇒ X + Y ∼ gamma(2, λ)

• X ∼ N(μX, σ²X), Y ∼ N(μY, σ²Y) with X, Y independent ⇒ X + Y ∼ N(μX + μY, σ²X + σ²Y) and X − Y ∼ N(μX − μY, σ²X + σ²Y)

• X ∼ χ²n, Y ∼ χ²m with X, Y independent ⇒ X + Y ∼ χ²n+m

Worked Example 5.3 Apples of a certain variety have weights which are normally distributed about a mean of 120g with standard deviation 10g. Oranges of a certain variety have weights which are normally distributed about a mean of 130g with standard deviation 15g. I buy 4 apples and 2 oranges. Find (a) the probability that the total weight of my fruit exceeds 700g, and (b) the symmetrical interval containing 95% probability for the total weight of my fruit.

Let X (Y) be the weight of an apple (orange) to be purchased. X ∼ N(120, 10²), Y ∼ N(130, 15²)

Let W be the total weight of my fruit. Then

W = X1+X2+X3+X4+Y1+Y2

where the Xi’s are i.i.d. copies of X, the Yi’s are i.i.d. copies of Y , and the Xi’s and Yi’s are

independent.

E[W] = 4 × 120 + 2 × 130 = 740, Var[W] = 4 × 100 + 2 × 225 = 850

W is the sum of six independent normal variables and so is itself normal: W ∼ N(740, 850)

(a) P(W > 700) = P(Z > (700 − 740)/√850) = P(Z > −1.372) = 0.9150

(b) 740 ± 1.96 × √850, i.e. 683g to 797g.
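Both answers can be reproduced with the standard normal cdf, written here via the identity Φ(z) = (1 + erf(z/√2))/2:

```python
# Reproduce Worked Example 5.3 numerically.
from math import erf, sqrt

mu = 4 * 120 + 2 * 130          # E[W] = 740
var = 4 * 100 + 2 * 225         # Var[W] = 850
sd = sqrt(var)

Phi = lambda z: (1 + erf(z / sqrt(2))) / 2   # standard normal cdf

p = 1 - Phi((700 - mu) / sd)                 # P(W > 700) = P(Z > -1.372)
lo, hi = mu - 1.96 * sd, mu + 1.96 * sd      # central 95% interval
print(round(p, 4), round(lo), round(hi))
```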

Further worked examples

We investigate whether or not the sum and difference of two random variables are correlated. Let X and Y be random variables and let U = X + Y and W = X − Y.

Cov[U, W] = Cov[X + Y, X − Y] = Cov[X, X] + Cov[X, −Y] + Cov[Y, X] + Cov[Y, −Y] = Var[X] − Var[Y]

⇒ U and W are correlated unless X and Y have equal variances.

(X, Y) has pdf f(x, y) = x² + xy, 0 <
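The sum-and-difference result above can be checked on simulated data (the variances 4 and 1 are arbitrary choices):

```python
# Cov[X + Y, X - Y] = Var[X] - Var[Y]: a numerical check.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(scale=2.0, size=500_000)   # Var[X] = 4
y = rng.normal(scale=1.0, size=500_000)   # Var[Y] = 1, independent of X

u, w = x + y, x - y
cov_uw = np.cov(u, w)[0, 1]
print(cov_uw)   # close to Var[X] - Var[Y] = 3
```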