Why

15

I suppose that

P(A|B)=P(A|B,C)P(C)+P(A|B,¬C)P(¬C)

is correct, whereas

P(A|B)=P(A|B,C)+P(A|B,¬C)

is incorrect.

However, I have an "intuition" for the latter: it considers the probability P(A|B) by splitting it into two cases (C or not C). Why is this intuition wrong?

zell
44
Here is a simple example to test your equations. Flip two independent, fair coins. Let A be the event that the first comes up heads, B the event that the second comes up heads, and C the event that both come up heads. Is either of the equations you wrote correct?
A. Rex
44
The law of total probability says that if you want to express an unconditional probability as a sum of conditional probabilities, you must weight by the conditioning event: for example, P(A)=P(A|B)P(B)+P(A|¬B)P(¬B)
AdamO

Answers:

25

Suppose, as an easy counterexample, that the probability P(A) of A is 1, regardless of the value of C. Then, if we take the incorrect equation, we get:

P(A|B)=P(A|B,C)+P(A|B,¬C)=1+1=2

That obviously cannot be correct: a probability cannot be greater than 1. This helps build the intuition that you should assign each of the two cases a weight proportional to how likely that case is, which results in the first (correct) equation.

That gets you close to your first equation, but the weights are not quite right. See A. Rex's comment for the correct weights.
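
The two-coin example from A. Rex's comment on the question can be checked by brute force. The following is only a minimal sketch: the helper `P` and the variable names are my own illustrative choices, and the sample space is assumed to be the four equally likely outcomes of two fair coins. It compares P(A|B) with the unweighted sum, the sum weighted by P(C) and P(¬C), and the sum weighted by P(C|B) and P(¬C|B).

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: two independent fair coins, each outcome has probability 1/4.
outcomes = list(product("HT", repeat=2))
prob = {w: Fraction(1, 4) for w in outcomes}

def P(event, given=lambda w: True):
    """Conditional probability P(event | given), computed by enumeration."""
    denom = sum(prob[w] for w in outcomes if given(w))
    numer = sum(prob[w] for w in outcomes if given(w) and event(w))
    return numer / denom

A = lambda w: w[0] == "H"            # first coin comes up heads
B = lambda w: w[1] == "H"            # second coin comes up heads
C = lambda w: A(w) and B(w)          # both coins come up heads
notC = lambda w: not C(w)
B_and_C = lambda w: B(w) and C(w)
B_and_notC = lambda w: B(w) and notC(w)

lhs = P(A, B)
# Second equation in the question: no weights at all.
unweighted = P(A, B_and_C) + P(A, B_and_notC)
# First equation in the question: weights P(C) and P(not C).
weighted_by_PC = P(A, B_and_C) * P(C) + P(A, B_and_notC) * P(notC)
# Weights also conditioned on B: P(C|B) and P(not C|B).
weighted_by_PC_given_B = P(A, B_and_C) * P(C, B) + P(A, B_and_notC) * P(notC, B)

print("P(A|B)                     =", lhs)
print("unweighted sum             =", unweighted)
print("weighted by P(C), P(¬C)     =", weighted_by_PC)
print("weighted by P(C|B), P(¬C|B) =", weighted_by_PC_given_B)
```

In this example only the last expression agrees with P(A|B), which is the point of the weights discussed in the comments.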

Dennis Soemers
1
Should the weights in the "first (correct) equation" be P(C) and P(¬C), or should they be P(C|B) and P(¬C|B)?
A. Rex
@A.Rex That's a good point; for full correctness, I think it should be P(C|B) and P(¬C|B). Everything (just one term) on the left-hand side of the equation assumes that B is given, so without additional assumptions (such as assuming that B and C are independent of each other), the same should be the case on the right-hand side.
Just think of A|B as being 200% sure to happen.
Mark L. Stone
@MarkL.Stone Does that mean it always happens twice? ;)
Reinstate Monica
9

Dennis's answer has a great counterexample, refuting the incorrect equation. This answer seeks to explain why the following equation is correct:

P(A|B)=P(A|C,B)P(C|B)+P(A|¬C,B)P(¬C|B).

As every term is conditioned on B, we can replace the entire probability space by B and drop the B term. This gives us:

P(A)=P(A|C)P(C)+P(A|¬C)P(¬C).

Then you are asking why this equation has the P(C) and P(¬C) terms in it.

The reason is that P(A|C)P(C) is the portion of A in C and P(A|¬C)P(¬C) is the portion of A in ¬C, and the two add up to A. See diagram. On the other hand, P(A|C) is the proportion of C containing A and P(A|¬C) is the proportion of ¬C containing A: these are proportions of different regions, so they have no common denominator, and adding them is meaningless.

[diagram]

Reinstate Monica
2
Not "every term is conditioned on B". In particular, P(C) and P(¬C) are not, so we can't just drop B. Moreover, this might suggest that the equation is wrong!
A. Rex
@A.Rex Technically you're right; I should have said that every term involving A is conditioned on B (I made a simple substitution A|B → A). I will correct the answer.
Reinstate Monica
55
My objection wasn't a technicality. Your diagram correctly proves that P(A)=P(A|C)P(C)+P(A|¬C)P(¬C), which after conditioning on B becomes P(A|B)=P(A|B,C)P(C|B)+P(A|B,¬C)P(¬C|B); note that the probabilities of C and ¬C are also conditioned on B. This is not the first equation given in the OP, which is good news, because the first equation given in the OP is not correct.
A. Rex
@A.Rex You are right once again: C must also be conditioned on B, as the proportion of the probability space contained in C might not be the same as the proportion of B contained in C. This point escaped me. I will revise again.
Reinstate Monica
7

I know you've already received two great answers to your question, but I just wanted to point out how you can turn the idea behind your intuition into the correct equation.

First, remember that P(X|Y)=P(X∩Y)/P(Y) and equivalently P(X∩Y)=P(X|Y)P(Y).

To avoid making mistakes, we will use the first equation in the previous paragraph to eliminate all conditional probabilities, then keep rewriting expressions involving intersections and unions of events, then use the second equation in the previous paragraph to re-introduce the conditionals at the end. Thus, we start with:

P(A|B)=P(A∩B)/P(B)

We will keep rewriting the right-hand side until we get the desired equation.

The casework in your intuition expands the event A into (A∩C)∪(A∩¬C), resulting in

P(A|B)=P(((A∩C)∪(A∩¬C))∩B)/P(B)

As with sets, the intersection distributes over the union:

P(A|B)=P((A∩B∩C)∪(A∩B∩¬C))/P(B)

Since the two events being unioned in the numerator are mutually exclusive (since C and ¬C cannot both happen), we can use the sum rule:

P(A|B)=P(A∩B∩C)/P(B)+P(A∩B∩¬C)/P(B)

We now see that P(A|B)=P(A∩C|B)+P(A∩¬C|B); thus, you can use the sum rule on the event of interest (the "left" side of the conditional bar) if you keep the given event (the "right" side) the same. This can be used as a general rule for other equality proofs as well.

We re-introduce the desired conditionals using the second equation in the second paragraph:

P(A∩(B∩C))=P(A|B∩C)P(B∩C)
and similarly for ¬C.

We plug this into our equation for P(AB) as:

P(A|B)=P(A|B∩C)P(B∩C)/P(B)+P(A|B∩¬C)P(B∩¬C)/P(B)

Noting that P(B∩C)/P(B)=P(C|B) (and similarly for ¬C), we finally get

P(A|B)=P(A|B∩C)P(C|B)+P(A|B∩¬C)P(¬C|B)

Which is the correct equation (albeit with slightly different notation), including the fix A. Rex pointed out.

Note that P(A∩C|B) turned into P(A|B∩C)P(C|B). This mirrors the equation P(A∩C)=P(A|C)P(C) by adding the B condition to not only P(A∩C) and P(A|C), but also P(C) as well. I think if you are to use familiar rules on conditioned probabilities, you need to add the condition to all probabilities in the rule. And if there's any doubt whether that idea works for a particular situation, you can always expand out the conditionals to check, as I did for this answer.
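
If you would rather check the identities numerically than expand them by hand, something like the following sketch may help. Everything in it is illustrative: the joint distribution over (A, B, C) is a made-up one with random positive integer weights, and `P`, `cond`, and `both` are hypothetical helper names. Exact fractions are used so that the equalities hold exactly rather than up to rounding.

```python
from fractions import Fraction
from itertools import product
import random

random.seed(0)

# Hypothetical joint distribution: each atom (a, b, c) gets a positive integer weight.
atoms = list(product([False, True], repeat=3))
weights = {w: random.randint(1, 9) for w in atoms}
total = sum(weights.values())

def P(event):
    """Probability of an event, given as a predicate on atoms."""
    return Fraction(sum(weights[w] for w in atoms if event(w)), total)

def cond(event, given):
    """P(event | given) = P(event ∩ given) / P(given)."""
    return P(lambda w: event(w) and given(w)) / P(given)

def both(e, f):
    """Intersection of two events."""
    return lambda w: e(w) and f(w)

A = lambda w: w[0]
B = lambda w: w[1]
C = lambda w: w[2]
notC = lambda w: not w[2]

lhs = cond(A, B)
# Sum rule applied on the left of the bar: P(A∩C|B) + P(A∩¬C|B).
split_left = cond(both(A, C), B) + cond(both(A, notC), B)
# Weights conditioned on B: P(A|B∩C)P(C|B) + P(A|B∩¬C)P(¬C|B).
weighted = cond(A, both(B, C)) * cond(C, B) + cond(A, both(B, notC)) * cond(notC, B)
# The unweighted sum from the question: P(A|B∩C) + P(A|B∩¬C).
naive = cond(A, both(B, C)) + cond(A, both(B, notC))

print(lhs == split_left, lhs == weighted, lhs == naive)
```

Because every atom has positive weight here, the unweighted sum is strictly larger than P(A|B), while the other two expressions match it exactly.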

YawarRaza7349
2
+1. I think you extracted the equation that OP tried to intuit: P(A|B)=P(A∩C|B)+P(A∩¬C|B).
A. Rex
Thanks! That was the main point I wanted to make, but couldn't figure out a high-level explanation why the intersection goes on the left rather than the right, so I used formulas instead. Also, I just noticed you were the one who pointed out the mistake in OP's formula, so I credited you for that. (I probably wouldn't have noticed either, lol.)
YawarRaza7349
2

Probabilities are ratios; the probability of A given B is how often A happens within the space of B. For instance, P(rain|March) is the number of rainy days in March divided by the number of total days in March. When dealing with fractions, it makes sense to split up numerators. For instance,

P(rain or snow|March)
= (number of rainy or snowy days in March)/(total number of days in March)
= (number of rainy days in March)/(total number of days in March) + (number of snowy days in March)/(total number of days in March)
= P(rain|March) + P(snow|March)

This of course assumes that "snow" and "rain" are mutually exclusive. It does not, however, make sense to split up denominators. So if you have P(rain|February or March), that is equal to

(number of rainy days in February and March)/(total number of days in February and March).

But that is not equal to

(number of rainy days in February)/(total number of days in February) + (number of rainy days in March)/(total number of days in March).

If you're having trouble seeing that, you can try out some numbers. Suppose there are 10 rainy days in February and 8 in March. Then we have

(number of rainy days in February and March)/(total number of days in February and March) = (10+8)/(28+31) = 30.5%

and

(number of rainy days in February)/(total number of days in February) + (number of rainy days in March)/(total number of days in March) = (10/28)+(8/31) = 35.7%+25.8% = 61.5%

The first number, 30.5%, is the average of 35.7% and 25.8% (with the second number weighted slightly more because there are more days in March). When you say P(A|B)=P(A|B,C)+P(A|B,¬C) you're saying that (x1+x2)/(y1+y2) = x1/y1 + x2/y2, which is false.
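
If you want to try out the numbers yourself, here is a small sketch using the day counts from the example above (10 rainy days out of 28 in February, 8 out of 31 in March). The variable names are just illustrative. It computes the pooled ratio, the sum of the per-month ratios, and the per-month ratios averaged with weights proportional to the number of days in each month.

```python
from fractions import Fraction

# Counts taken from the example in the text.
rainy = {"February": 10, "March": 8}
days = {"February": 28, "March": 31}
total_days = sum(days.values())

# P(rain | February or March): one ratio over the combined denominator.
pooled = Fraction(sum(rainy.values()), total_days)

# P(rain | February) and P(rain | March): ratios over different denominators.
per_month = {m: Fraction(rainy[m], days[m]) for m in rainy}

# Adding ratios with different denominators (the incorrect approach).
sum_of_ratios = sum(per_month.values())

# Weighting each ratio by its month's share of the days (the correct approach).
weighted_average = sum(per_month[m] * Fraction(days[m], total_days) for m in rainy)

print(f"pooled ratio:     {float(pooled):.1%}")
print(f"sum of ratios:    {float(sum_of_ratios):.1%}")
print(f"weighted average: {float(weighted_average):.1%}")
```

The weighted average reproduces the pooled ratio exactly, which is the same weighting by P(C|B) and P(¬C|B) that the correct equation uses.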

Acccumulation
1

If I go to Spain, I can get sunburnt.

P(sunburnt|Spain)=0.2
This tells me nothing about getting sunburnt if not going to Spain, let's say
P(sunburnt|¬Spain)=0.1
This year I'm going to Spain, so
P(sunburnt)=0.2
Letting B=Ω, that is, P(B)=1, your intuition would imply
P(A)=P(A|C)+P(A|¬C)
which, by the previous argument, isn't necessarily true.
sheriff