Summary


Discrete Random Variable Expected Value

Definition

Let $X$ be a discrete random variable. Then the expected value of $X$ is

$$\mathbb{E}[X] = \sum_{x} x\,\mathbb{P}(X = x)$$

Note

This can also be called the weighted average.

The expected value (expectation) is well-defined only when the sum is absolutely convergent, i.e. $\sum_{x} |x|\,\mathbb{P}(X = x) < \infty$.
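
The definition above translates directly into code. A minimal sketch (the `expected_value` helper and the dict-based p.m.f. representation are illustrative choices, not a standard API):

```python
# Expected value of a discrete RV from its p.m.f., represented as a
# dict {value: probability}; a direct transcription of E[X] = sum_x x P(X=x).
def expected_value(pmf):
    assert abs(sum(pmf.values()) - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(x * p for x, p in pmf.items())

# Fair six-sided die: E[X] = (1 + 2 + ... + 6)/6 = 7/2
die = {k: 1 / 6 for k in range(1, 7)}
print(expected_value(die))  # ≈ 3.5
```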

Example (Direct Computation)

$X \sim \mathrm{Ber}(p)$. Then $\mathbb{E}[X] = 1 \cdot \mathbb{P}(X = 1) + 0 \cdot \mathbb{P}(X = 0) = 1 \cdot p + 0 \cdot (1 - p) = p$.

Example (Manipulation of Sums)

Let $X \sim \mathrm{Bin}(n, p)$. Then $\mathbb{E}[X] = \sum_{k \ge 0} k\,\mathbb{P}(X = k) = \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k}$. The $k = 0$ term vanishes, and for $k \ge 1$ we use $k \binom{n}{k} = n \binom{n-1}{k-1}$:

$$\mathbb{E}[X] = \sum_{k=1}^{n} n \binom{n-1}{k-1} p^k (1-p)^{n-k} = np \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1} (1-p)^{n-k}$$

Substituting $j = k - 1$:

$$= np \sum_{j=0}^{n-1} \binom{n-1}{j} p^j (1-p)^{n-1-j} = np\,(p + (1-p))^{n-1} = np$$

by the binomial theorem.
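
As a sanity check on the identity $\mathbb{E}[X] = np$, one can sum the defining series numerically (the helper name is ours):

```python
from math import comb

# Sum k * C(n, k) p^k (1-p)^(n-k) over k = 0..n and compare with n*p.
def binom_mean(n, p):
    return sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

print(binom_mean(10, 0.3))  # ≈ 3.0 = n*p
```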

Example (Derivatives of Sums)

$X \sim \mathrm{Geom}(p)$ (see geometric RVs), so $\mathbb{P}(X = k) = (1-p)^{k-1} p$ for $k \ge 1$. Let $q = 1 - p$. Then:

$$\mathbb{E}[X] = p \sum_{k \ge 1} k q^{k-1}$$

Recall the geometric series $\sum_{k \ge 0} q^k = \frac{1}{1-q} = \frac{1}{p}$ for $|q| < 1$. Differentiating both sides with respect to $q$:

$$\sum_{k \ge 1} k q^{k-1} = \frac{d}{dq}\left(\frac{1}{1-q}\right) = \frac{1}{(1-q)^2} = \frac{1}{p^2}$$

Therefore $\mathbb{E}[X] = p \cdot \frac{1}{p^2} = \frac{1}{p}$.
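
The series converges fast enough that a truncated partial sum confirms $\mathbb{E}[X] = 1/p$ numerically (a sketch; the cutoff `K` is an arbitrary large value):

```python
# Truncate E[X] = p * sum_{k>=1} k (1-p)^(k-1) at K terms; the tail is
# negligible for large K since the terms decay geometrically.
def geom_mean(p, K=10_000):
    q = 1 - p
    return p * sum(k * q**(k - 1) for k in range(1, K + 1))

print(geom_mean(0.25))  # ≈ 4.0 = 1/p
```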

Example (Taylor Expansion)

$X \sim \mathrm{Pois}(\lambda)$ (see Poisson RVs). Recall $\mathbb{P}(X = k) = e^{-\lambda} \frac{\lambda^k}{k!}$ for $k \ge 0$.

$$\mathbb{E}[X] = \sum_{k \ge 0} k\, e^{-\lambda} \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k \ge 1} \frac{\lambda^k}{(k-1)!} = \lambda e^{-\lambda} \sum_{j \ge 0} \frac{\lambda^j}{j!} = \lambda e^{-\lambda} e^{\lambda} = \lambda$$

where we substituted $j = k - 1$ and recognized the Taylor series for $e^{\lambda}$.
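
A numerical check of $\mathbb{E}[X] = \lambda$, sketched with each p.m.f. term computed recursively from the previous one (this avoids evaluating large factorials directly; the cutoff `K` is an arbitrary truncation point):

```python
from math import exp

# Truncated sum of k * P(X = k) for X ~ Pois(lam), using
# P(X = k) = P(X = k-1) * lam / k to update the term in place.
def poisson_mean(lam, K=100):
    term = exp(-lam)          # P(X = 0)
    total = 0.0
    for k in range(1, K + 1):
        term *= lam / k       # now P(X = k)
        total += k * term
    return total

print(poisson_mean(3.5))  # ≈ 3.5 = λ
```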

Continuous Random Variable Expected Value

Definition

If $X$ is a continuous random variable with probability density function $f_X(x)$, then the expected value of $X$ is defined as

$$\mathbb{E}[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx$$

under the assumption

$$\int_{-\infty}^{\infty} |x| f_X(x)\,dx < \infty$$

Example

$X \sim \mathrm{Unif}([a, b])$. Then

$$\mathbb{E}[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_a^b x\,\frac{1}{b-a}\,dx = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2}$$

Example

Compute $\mathbb{E}[X]$ when $X \sim \mathrm{Exp}(\lambda)$ (see exponential RVs).
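
This exercise works out to $\mathbb{E}[X] = \frac{1}{\lambda}$ (integration by parts on $\int_0^{\infty} x \lambda e^{-\lambda x}\,dx$). A rough numerical check, sketched with the trapezoid rule and an arbitrary truncation of the integral where the tail is negligible:

```python
from math import exp

# Trapezoid-rule approximation of the mean of Exp(lam):
# integrate x * lam * e^(-lam x) over [0, upper].
def exp_mean_numeric(lam, upper=50.0, n=100_000):
    h = upper / n
    f = lambda x: x * lam * exp(-lam * x)
    return h * (0.5 * f(0) + sum(f(i * h) for i in range(1, n)) + 0.5 * f(upper))

print(exp_mean_numeric(2.0))  # ≈ 0.5 = 1/λ
```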

Example

$X \sim N(\mu, \sigma^2)$, so $f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$. Substitute $u = x - \mu$:

$$\mathbb{E}[X] = \int_{-\infty}^{\infty} (u + \mu)\,\frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{u^2}{2\sigma^2}}\,du = \underbrace{\int_{-\infty}^{\infty} u\,\frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{u^2}{2\sigma^2}}\,du}_{=0 \text{ (odd function)}} + \mu \underbrace{\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{u^2}{2\sigma^2}}\,du}_{=1 \text{ (PDF integrates to 1)}} = \mu$$

Example

Let $X \sim \mathrm{Cauchy}(\gamma)$, that is, a continuous RV with p.d.f.

$$f_X(x) = \frac{\gamma}{\pi (x^2 + \gamma^2)}, \qquad x \in \mathbb{R}$$

Attempting the defining integral with $|x|$:

$$\int_{-\infty}^{\infty} |x|\,\frac{\gamma}{\pi (x^2 + \gamma^2)}\,dx = \infty$$

This is not finite because the integrand is asymptotic to $\frac{C}{x}$ as $|x| \to \infty$, whose integral diverges. Hence if $X \sim \mathrm{Cauchy}(\gamma)$ then $\mathbb{E}[X]$ is not defined, even though $f_X$ is a genuine density: $\int_{-\infty}^{\infty} \frac{\gamma}{\pi (x^2 + \gamma^2)}\,dx = 1$.
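
The divergence can be made concrete: the truncated integral $\int_{-T}^{T} |x| f_X(x)\,dx$ has the closed form $\frac{\gamma}{\pi} \ln\frac{T^2 + \gamma^2}{\gamma^2}$ (from the antiderivative $\frac{\gamma}{2\pi}\ln(x^2+\gamma^2)$), which grows without bound as $T \to \infty$. A small sketch:

```python
from math import log, pi

# Closed form of the truncated absolute moment of Cauchy(gamma):
# it grows like (2*gamma/pi) * ln(T), so it never converges.
def truncated_abs_moment(gamma, T):
    return (gamma / pi) * log((T**2 + gamma**2) / gamma**2)

for T in (10, 1_000, 100_000):
    print(T, truncated_abs_moment(1.0, T))  # keeps growing with T
```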

Question

How can we compute $\mathbb{E}[g(X)]$ where $g : \mathbb{R} \to \mathbb{R}$?

Theorem

Let $X_1, \dots, X_n$ be $n$ discrete random variables with joint p.m.f. $f$. Set $Y = g(X_1, \dots, X_n)$ where $g : \mathbb{R}^n \to \mathbb{R}$. Then

$$\mathbb{E}[Y] = \sum_{x_1, \dots, x_n} g(x_1, \dots, x_n)\, f(x_1, \dots, x_n)$$
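
For a single discrete variable this theorem says $\mathbb{E}[g(X)] = \sum_x g(x)\,\mathbb{P}(X = x)$. A minimal sketch (the `lotus` helper name and dict p.m.f. are our own conventions):

```python
# E[g(X)] = sum over x of g(x) * P(X=x), without finding the p.m.f. of g(X).
def lotus(g, pmf):
    return sum(g(x) * p for x, p in pmf.items())

# Fair die with g(x) = x^2: E[X^2] = (1 + 4 + 9 + 16 + 25 + 36)/6 = 91/6.
die = {k: 1 / 6 for k in range(1, 7)}
print(lotus(lambda x: x**2, die))  # ≈ 15.1667
```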

Theorem

Let $(X_1, \dots, X_n)$ be a continuous random vector with joint p.d.f. $f$. Set $Y = g(X_1, \dots, X_n)$ where $g : \mathbb{R}^n \to \mathbb{R}$. Then

$$\mathbb{E}[Y] = \int_{\mathbb{R}^n} g(x_1, \dots, x_n)\, f(x_1, \dots, x_n)\,dx_1 \cdots dx_n$$

Linearity of Expectation

Theorem

Let $X_1, \dots, X_n$ be random variables (discrete or continuous) and $\alpha_1, \dots, \alpha_n \in \mathbb{R}$. Then

$$\mathbb{E}[\alpha_1 X_1 + \dots + \alpha_n X_n] = \alpha_1 \mathbb{E}[X_1] + \dots + \alpha_n \mathbb{E}[X_n]$$

Example

$X \sim \mathrm{Bin}(n, p)$. Recall $X = X_1 + \dots + X_n$ where $X_i \sim \mathrm{Ber}(p)$ (see Bernoulli RVs). Hence $\mathbb{E}[X] = \mathbb{E}[X_1] + \dots + \mathbb{E}[X_n] = np$.

Example

Compute $\mathbb{E}[X]$ where $X \sim \mathrm{Gamma}(n, \lambda)$ (see gamma RVs).
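
Since $\mathrm{Gamma}(n, \lambda)$ is the sum of $n$ i.i.d. $\mathrm{Exp}(\lambda)$ variables, linearity gives $\mathbb{E}[X] = \frac{n}{\lambda}$. A numerical check of $\int_0^{\infty} x f_X(x)\,dx$ with the Gamma density $f_X(x) = \frac{\lambda^n x^{n-1} e^{-\lambda x}}{(n-1)!}$, sketched with the midpoint rule and an arbitrary truncation point:

```python
from math import exp, factorial

# Midpoint-rule approximation of E[X] for X ~ Gamma(n, lam), integer n.
def gamma_mean_numeric(n, lam, upper=100.0, steps=100_000):
    h = upper / steps
    f = lambda x: x * lam**n * x**(n - 1) * exp(-lam * x) / factorial(n - 1)
    return h * sum(f((i + 0.5) * h) for i in range(steps))

print(gamma_mean_numeric(3, 2.0))  # ≈ 1.5 = n/λ
```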

The Method of Indicator Functions

Definition

For an event $A \subseteq \Omega$, the indicator function of $A$ is the random variable defined by

$$\mathbb{1}_A(\omega) = \begin{cases} 1 & \text{if } \omega \in A \\ 0 & \text{if } \omega \notin A \end{cases}$$

Notice that $\mathbb{1}_A \sim \mathrm{Bernoulli}(p)$ where $p = \mathbb{P}(\mathbb{1}_A = 1) = \mathbb{P}(A)$. Additionally, $\mathbb{E}[\mathbb{1}_A] = \mathbb{P}(A)$. Finally, if $X = \mathbb{1}_{A_1} + \dots + \mathbb{1}_{A_n}$, then $\mathbb{E}[X] = \mathbb{P}(A_1) + \dots + \mathbb{P}(A_n)$ by linearity.

Example

Suppose we have $n$ letters placed into $n$ envelopes uniformly at random. What is the expected number of letters in the correct envelope? Write $X = \sum_{i=1}^{n} \mathbb{1}_{A_i}$, where $A_i = \{\text{letter } i \text{ goes to envelope } i\}$. By linearity:

$$\mathbb{E}[X] = \sum_{i=1}^{n} \mathbb{E}[\mathbb{1}_{A_i}] = \sum_{i=1}^{n} \mathbb{P}(A_i) = \sum_{i=1}^{n} \frac{1}{n} = 1$$
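
A quick Monte Carlo sketch of this answer: the average number of fixed points of a random permutation hovers around 1 regardless of $n$ (the helper and seed are our own choices):

```python
import random

# Average number of fixed points over many random permutations of size n.
def avg_matches(n, trials, seed=0):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        total += sum(1 for i, p in enumerate(perm) if i == p)
    return total / trials

avg = avg_matches(10, 100_000)
print(avg)  # ≈ 1.0
```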

Example

Consider $n$ balls dropped at random into $n$ boxes, and let $X$ be the number of empty boxes. Compute $\mathbb{E}[X]$.

Let $A_i = \{\text{box } i \text{ is empty}\}$, so $X = \sum_{i=1}^{n} \mathbb{1}_{A_i}$.

$$\mathbb{E}[X] = \sum_{i=1}^{n} \mathbb{E}[\mathbb{1}_{A_i}] = \sum_{i=1}^{n} \mathbb{P}(A_i) = \sum_{i=1}^{n} \left(1 - \frac{1}{n}\right)^n = n \left(1 - \frac{1}{n}\right)^n \approx \frac{n}{e}$$

for large $n$, since $\left(1 - \frac{1}{n}\right)^n \to e^{-1}$.
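
The exact formula and its $n/e$ approximation can be compared directly (a small deterministic sketch):

```python
from math import e

# E[# of empty boxes] = n (1 - 1/n)^n exactly; approaches n/e for large n.
def expected_empty(n):
    return n * (1 - 1 / n)**n

for n in (10, 100, 1000):
    print(n, expected_empty(n), n / e)  # the two columns converge
```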

Example

Toss a $p$-coin $n$ times. An $H$-run is a maximal sequence of consecutive heads. Let $X$ be the number of $H$-runs.

Let $A_i = \{\text{an } H\text{-run starts at position } i\}$, so $X = \sum_{i=1}^{n} \mathbb{1}_{A_i}$. Then $\mathbb{P}(A_1) = p$, and for $i \ge 2$, $\mathbb{P}(A_i) = (1-p)p$ (toss $i-1$ is tails and toss $i$ is heads).

$$\mathbb{E}[X] = \sum_{i=1}^{n} \mathbb{P}(A_i) = p + (n-1)p(1-p)$$

Notice that in these examples the indicator variables are far from independent, yet linearity of expectation applies regardless; this is what makes the method so useful.

Conditional Expectation

Theorem

The tail integral formula for expectation gives the following. Let $X$ be a non-negative random variable (i.e. $\mathbb{P}(X \ge 0) = 1$). Then

$$\mathbb{E}[X] = \int_0^{\infty} \mathbb{P}(X \ge t)\,dt$$

Corollary

The discrete version is as follows. If $X$ is a non-negative integer-valued random variable,

$$\mathbb{E}[X] = \sum_{k=1}^{\infty} \mathbb{P}(X \ge k)$$
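
The tail-sum formula can be checked against a distribution whose mean we already know: for $X \sim \mathrm{Geom}(p)$, $\mathbb{P}(X \ge k) = (1-p)^{k-1}$, so the tail sum should recover $1/p$ (a sketch with an arbitrary truncation `K`):

```python
# Sum P(X >= k) = (1-p)^(k-1) for k = 1..K; should approach 1/p.
def tail_sum(p, K=10_000):
    return sum((1 - p)**(k - 1) for k in range(1, K + 1))

print(tail_sum(0.25))  # ≈ 4.0 = 1/p
```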

Definition

If $X = (X_1, \dots, X_n)$ is a random vector, the mean of $X$ is

$$\mathbb{E}[X] := (\mathbb{E}[X_1], \dots, \mathbb{E}[X_n])$$

Example

Let $A$ be an $m \times n$ matrix, $X$ an $\mathbb{R}^n$-valued random vector, and $\mu \in \mathbb{R}^m$. Define $Y = \mu + AX \in \mathbb{R}^m$. Applying linearity coordinate-wise, $\mathbb{E}[Y] = \mu + A\,\mathbb{E}[X]$.

Definition

Let $X, Y$ be two discrete random variables. The conditional expectation of $Y$ given $\{X = x\}$ is

$$\mathbb{E}[Y \mid X = x] = \sum_{y} y\,\mathbb{P}(Y = y \mid X = x)$$

where $\mathbb{P}(Y = y \mid X = x)$ is the conditional p.m.f.

Definition

If $X, Y$ are continuous random variables, the conditional expectation is

$$\mathbb{E}[Y \mid X = x] = \int_{-\infty}^{\infty} y\, f_{Y \mid X = x}(y)\,dy = \int_{-\infty}^{\infty} y\,\frac{f_{X,Y}(x, y)}{f_X(x)}\,dy$$

using the conditional p.d.f.

Example

Let $Y$ be the outcome of a fair six-sided die and $X = \mathbb{1}_{\{\text{the outcome is even}\}}$.

$$\mathbb{E}[Y] = \sum_{y=1}^{6} y\,\mathbb{P}(Y = y) = 1 \cdot \tfrac{1}{6} + \dots + 6 \cdot \tfrac{1}{6} = \frac{7}{2}, \qquad \mathbb{E}[Y \mid X = 0] = \sum_{y=1}^{6} y\,\mathbb{P}(Y = y \mid X = 0) = \frac{1 + 3 + 5}{3} = 3$$

This example also shows that the expected value can be a value the random variable never takes ($\mathbb{P}(Y = 7/2) = 0$); this is quite common for discrete random variables.
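
The two conditional expectations in this example come straight from the conditional p.m.f.: given $X = 0$, $Y$ is uniform on $\{1, 3, 5\}$; given $X = 1$, uniform on $\{2, 4, 6\}$. A tiny sketch:

```python
# E[Y | X = 0] averages the odd faces; E[Y | X = 1] averages the even faces.
odd, even = [1, 3, 5], [2, 4, 6]
cond_odd = sum(odd) / len(odd)     # E[Y | X = 0]
cond_even = sum(even) / len(even)  # E[Y | X = 1]
print(cond_odd, cond_even)  # 3.0 4.0
```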

Conditional Expectation as a Random Variable

This concept is very confusing, so do not worry if it seems nonsensical.

Definition

The conditional expectation of $Y$ given $X$ is

$$\mathbb{E}[Y \mid X] = g(X), \qquad \text{where } g(x) = \mathbb{E}[Y \mid X = x]$$

Since it is a function of $X$, this is a new random variable.

In the previous example, $g(X) = g(\mathbb{1}_{\{\text{the outcome is even}\}}) = 4 \cdot \mathbb{1}_{\{\text{the outcome is even}\}} + 3 \cdot \mathbb{1}_{\{\text{the outcome is odd}\}}$.

The moral is that $\mathbb{E}[Y \mid X]$ is the best approximation of $Y$ using only the information contained in $X$.

Theorem

$$\mathbb{E}[\mathbb{E}[Y \mid X]] = \mathbb{E}[Y]$$

This is known as the law of total expectation (or the tower property).
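
On the die example this is easy to verify by hand: $\mathbb{E}[Y \mid X]$ equals $3$ on the odd outcomes and $4$ on the even ones, each with probability $\tfrac12$. A one-line check:

```python
# Tower property on the die example: averaging g(X) over the two halves
# of the sample space recovers E[Y].
E_Y = sum(range(1, 7)) / 6   # direct mean of the die: 3.5
E_g = 0.5 * 3 + 0.5 * 4      # E[g(X)] with g = 3 on odd, 4 on even: 3.5
print(E_Y, E_g)  # 3.5 3.5
```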