What is the difference between E(X|Y) and E(X|Y=y)?

18

In general, what is the difference between E(X|Y) and E(X|Y=y)?

The first is a function of y and the second is a function of x? It is very confusing...

신범준
source
Hmmm... the latter should not be a function of x but a number! Am I wrong?
David

Answers:

23

Loosely speaking, the difference between E(X|Y) and E(X|Y=y) is that the former is a random variable, whereas the latter is (in a sense) a realization of E(X|Y). For example, if

(X, Y) ~ N(0, Σ),  Σ = (1 ρ; ρ 1),

then E(X|Y) is the random variable

E(X|Y) = ρY.

Conversely, once Y = y is observed, we are more likely to be interested in the quantity E(X|Y=y) = ρy, which is a scalar.

Maybe this seems like needless complication, but regarding E(X|Y) as a random variable in its own right is what makes things like the tower law E(X) = E[E(X|Y)] make sense: the thing inside the brackets is random, so we can ask what its expectation is, whereas there is nothing random about E(X|Y=y). In most cases we might hope to compute

E(X|Y=y) = ∫ x f_{X|Y}(x|y) dx

and then get E(X|Y) by "plugging in" the random variable Y in place of y in the resulting expression. As hinted in an earlier comment, there is a bit of subtlety that can creep in with regard to how these things are rigorously defined and linked up in the appropriate way. That tends to happen with conditional probability, due to some technical issues with the underlying theory.
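One can sanity-check the bivariate normal example above by simulation: draw many (X, Y) pairs, keep the draws with Y near a chosen y, and average the corresponding X values. This is only an illustrative sketch; the correlation ρ = 0.5, the conditioning value y = 1.0, and the window width are arbitrary choices, not part of the answer. The conditional average should come out near ρy.

```python
import numpy as np

# Sketch: approximate E(X | Y = y) for a bivariate normal with zero
# means, unit variances, and correlation rho, where theory gives
# E(X | Y = y) = rho * y.  rho, y, and the window width below are
# illustrative assumptions.
rng = np.random.default_rng(0)
rho = 0.5
y = 1.0

n = 2_000_000
cov = np.array([[1.0, rho], [rho, 1.0]])
samples = rng.multivariate_normal([0.0, 0.0], cov, size=n)
X, Y = samples[:, 0], samples[:, 1]

# Condition approximately on Y = y by keeping draws whose Y falls in
# a narrow window around y, then average X over that slice.
mask = np.abs(Y - y) < 0.05
estimate = X[mask].mean()

print(estimate)  # should be close to rho * y = 0.5
```

Note that the window trick is exactly the kind of place where the measure-theoretic subtlety mentioned above lives: the event {Y = y} has probability zero, so the simulation can only condition on a small neighborhood of y.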

guy
source
8

Suppose that X and Y are random variables.

Let y0 be a fixed real number, say y0 = 1. Then E[X|Y=y0] = E[X|Y=1] is a number: it is the conditional expected value of X given that Y has the value 1. Now, note that for some other fixed real number y1, say y1 = 1.5, E[X|Y=y1] = E[X|Y=1.5] would be the conditional expected value of X given Y = 1.5 (a real number). There is no reason to suppose that E[X|Y=1.5] and E[X|Y=1] have the same value. Thus, we can also regard E[X|Y=y] as a real-valued function g(y) that maps real numbers y to real numbers E[X|Y=y]. Note that the assertion in the OP's question that E[X|Y=y] is a function of x is incorrect: E[X|Y=y] is a real-valued function of y.

On the other hand, E[X|Y] is a random variable Z which happens to be a function of the random variable Y. Now, whenever we write Z = h(Y), what we mean is that whenever the random variable Y happens to have value y, the random variable Z has value h(y). Whenever Y takes on value y, the random variable Z = E[X|Y] takes on value E[X|Y=y] = g(y). Thus, E[X|Y] is just another name for the random variable Z = g(Y). Note that E[X|Y] is a function of Y (not y as in the statement of the OP's question).

As a simple illustrative example, suppose that X and Y are discrete random variables with joint distribution

P(X=0, Y=0) = 0.1,  P(X=0, Y=1) = 0.2,
P(X=1, Y=0) = 0.3,  P(X=1, Y=1) = 0.4.
Note that X and Y are (dependent) Bernoulli random variables with parameters 0.7 and 0.6 respectively, and so E[X] = 0.7 and E[Y] = 0.6. Now, note that conditioned on Y=0, X is a Bernoulli random variable with parameter 3/4, while conditioned on Y=1, X is a Bernoulli random variable with parameter 2/3. If you cannot see why this is so immediately, just work out the details: for example,

P(X=1|Y=0) = P(X=1, Y=0)/P(Y=0) = 0.3/0.4 = 3/4,
P(X=0|Y=0) = P(X=0, Y=0)/P(Y=0) = 0.1/0.4 = 1/4,
and similarly for P(X=1|Y=1) and P(X=0|Y=1). Hence, we have that

E[X|Y=0] = 3/4,  E[X|Y=1] = 2/3.

Thus, E[X|Y=y] = g(y) where g(y) is a real-valued function enjoying the properties

g(0) = 3/4,  g(1) = 2/3.

On the other hand, E[X|Y] = g(Y) is a random variable that takes on values 3/4 and 2/3 with probabilities 0.4 = P(Y=0) and 0.6 = P(Y=1) respectively. Note that E[X|Y] is a discrete random variable but is not a Bernoulli random variable.

As a final touch, note that

E[Z] = E[E[X|Y]] = E[g(Y)] = 0.4 × 3/4 + 0.6 × 2/3 = 0.7 = E[X].

That is, the expected value of this function of Y, which we computed using only the marginal distribution of Y, happens to have the same numerical value as E[X]!! This is an illustration of a more general result that many people believe is a LIE:

E[E[X|Y]] = E[X].

Sorry, that's just a small joke. LIE is an acronym for Law of Iterated Expectation which is a perfectly valid result that everyone believes is the truth.
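The whole discrete example above can be checked mechanically. Here is a small sketch (plain Python, exact arithmetic via `fractions`) that builds g(y) = E[X|Y=y] from the joint table and then computes E[g(Y)] from the marginal of Y, reproducing the numbers 3/4, 2/3, and 0.7:

```python
from fractions import Fraction

# Joint distribution from the example above, as {(x, y): P(X=x, Y=y)}.
# Fractions keep the arithmetic exact.
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(4, 10),
}

# Marginal of Y: P(Y = y) = sum over x of P(X = x, Y = y).
p_y = {y: sum(p for (x, yy), p in joint.items() if yy == y) for y in (0, 1)}

# g(y) = E[X | Y = y] = sum over x of x * P(X = x | Y = y).
def g(y):
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / p_y[y]

print(g(0), g(1))  # 3/4 and 2/3, as computed in the text

# Law of Iterated Expectation: E[E[X | Y]] = E[g(Y)] = E[X].
lie = sum(p_y[y] * g(y) for y in (0, 1))
e_x = sum(x * p for (x, _), p in joint.items())
print(lie, e_x)  # both equal 7/10
```

The last two lines are exactly the LIE: the left value uses only g and the marginal of Y, the right value uses only the marginal of X, and they agree.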

Dilip Sarwate
source
3

E(X|Y) is the expectation of a random variable: the expectation of X conditional on Y. E(X|Y=y), on the other hand, is a particular value: the expected value of X when Y = y.

Think of it this way: let X represent the caloric intake and Y represent height. E(X|Y) is then the caloric intake, conditional on height - and in this case, E(X|Y=y) represents our best guess at the caloric intake (X) when a person has a certain height Y = y, say, 180 centimeters.

abaumann
source
4
I believe your first sentence should replace "distribution" with "expectation" (twice).
Glen_b -Reinstate Monica
4
E(X|Y) isn't the distribution of X given Y; this would be more commonly denoted by the conditional density f_{X|Y}(x|y) or conditional distribution function. E(X|Y) is the conditional expectation of X given Y, which is a Y-measurable random variable. E(X|Y=y) might be thought of as the realization of the random variable E(X|Y) when Y = y is observed (but there is the possibility for measure-theoretic subtlety to creep in).
guy
1
@guy Your explanation is the first accurate answer yet provided (out of three offered so far). Would you consider posting it as an answer?
whuber
@whuber I would but I'm not sure how to strike the balance between accuracy and making the answer suitably useful to OP and I'm paranoid about getting tripped up on technicalities :)
guy
@Guy I think you have already done a good job with the technicalities. Since you are sensitive about communicating well with the OP (which is great!), consider offering a simple example to illustrate--maybe just a joint distribution with binary marginals.
whuber
1

E(X|Y) is the expected value of X given the values of Y. E(X|Y=y) is the expected value of X given that the value of Y is y.

Generally, P(X|Y) is the probability of values of X given values of Y, but you can be more precise and say P(X=x|Y=y), i.e. the probability of the value x among all X's given the y-th value of Y. The difference is that in the first case it is about "values of" and in the second you consider a certain value.

You could find the diagram below helpful.

[Diagram of Bayes' theorem, from Wikipedia]

Tim
source
This answer discusses probability, while the question asks about expectation. What is the connection?
whuber