In my probability class, the term "sums of random variables" is used constantly. However, I'm stuck on what exactly that means.
Are we talking about the sum of a bunch of realizations of a random variable? If so, doesn't that add up to a single number? How does a sum of realizations of random variables lead us to a distribution, or to a cdf / pdf / function of any kind? And if it's not about realizations of random variables, what exactly is being added?
Answers:
A physical, intuitive model of a random variable is to write down the name of every member of a population on one or more slips of paper--"tickets"--and put those tickets into a box. The process of thoroughly mixing the contents of the box, followed by blindly pulling out one ticket--exactly as in a lottery--models randomness. Non-uniform probabilities are modeled by introducing variable numbers of tickets in the box: more tickets for the more probable members, fewer for the less probable.
A random variable is a number associated with each member of the population. (Therefore, for consistency, every ticket for a given member has to have the same number written on it.) Multiple random variables are modeled by reserving spaces on the tickets for more than one number. We usually give those spaces names like X, Y, and Z. The sum of those random variables is the usual sum: reserve a new space on every ticket for the sum, read off the values of X, Y, etc. on each ticket, and write their sum in that new space. This is a consistent way of writing numbers on the tickets, so it's another random variable.
This figure portrays a box representing a population Ω = {α, β, γ} and three random variables X, Y, and X+Y. It contains six tickets: the three for α (blue) give it a probability of 3/6, the two for β (yellow) give it a probability of 2/6, and the one for γ (green) gives it a probability of 1/6. In order to display what is written on the tickets, they are shown before being mixed.
The beauty of this approach is that all the paradoxical parts of the question turn out to be correct:
the sum of random variables is indeed a single, definite number (for each member of the population),
yet it also leads to a distribution (given by the frequencies with which the sum appears in the box), and
it still effectively models a random process (because the tickets are still blindly drawn from the box).
In this fashion the sum can simultaneously have a definite value (given by the rules of addition as applied to numbers on each of the tickets) while the realization--which will be a ticket drawn from the box--does not have a value until it is carried out.
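The ticket-box model described above can be sketched in a few lines of Python. This is only an illustration: the ticket counts (three for α, two for β, one for γ) come from the description of the figure, but the values of X and Y written on the tickets are hypothetical, since the figure's actual numbers are not reproduced in the text.

```python
import random
from collections import Counter
from fractions import Fraction

# Ticket counts per member, as described for the figure.
counts = {"alpha": 3, "beta": 2, "gamma": 1}
# HYPOTHETICAL (X, Y) values per member; the figure's real numbers are unknown.
values = {"alpha": (1, 2), "beta": (1, 0), "gamma": (3, 0)}

# Fill the box: every ticket for a member carries the same X, Y and X+Y.
box = []
for member, n_tickets in counts.items():
    x, y = values[member]
    for _ in range(n_tickets):
        box.append({"member": member, "X": x, "Y": y, "X+Y": x + y})

def distribution(tickets, field):
    """Relative frequency of each value written in `field` across the box."""
    tally = Counter(ticket[field] for ticket in tickets)
    return {v: Fraction(c, len(tickets)) for v, c in sorted(tally.items())}

# The distribution of the sum is read off the box, before anything is drawn.
dist_sum = distribution(box, "X+Y")
print(dist_sum)

# A realization is one blind draw; only now does the sum take a single value.
drawn = random.choice(box)
print(drawn["member"], drawn["X+Y"])
```

With these made-up values, α and γ happen to share the same sum, so their tickets pool together in the distribution of X+Y even though they are different members of the population.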
This physical model of drawing tickets from a box is adopted in the theoretical literature and made rigorous with the definitions of sample space (the population), sigma algebras (with their associated probability measures), and random variables as measurable functions defined on the sample space.
This account of random variables is elaborated, with realistic examples, at "What is meant by a random variable?".
There is no secret behind this phrase; it is as simple as you might think: if X and Y are two random variables, their sum is X + Y, and this sum is a random variable as well. If X_1, X_2, X_3, ..., X_n are n random variables, their sum is X_1 + X_2 + X_3 + ... + X_n, and this sum is also a random variable (and a realization of this sum is a single number, namely a sum of n realizations).
Why do you talk so much about sums of random variables in class? One reason is the (amazing) central limit theorem: if we sum many independent random variables, then we can "predict" the distribution of this sum (almost) independently of the distribution of the single variables in the sum! The distribution of the sum tends toward a normal distribution, and this is the likely reason why we observe the normal distribution so often in the real world.
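A small simulation can make the central limit theorem claim more concrete. The sketch below is an illustration under assumptions of my own choosing: the summands are independent and uniform on [0, 1), there are n = 30 of them, and the seed and sample size are arbitrary. It compares the simulated sums with what a normal distribution with mean n/2 and variance n/12 would predict.

```python
import random
import statistics

random.seed(0)  # arbitrary fixed seed so the run is repeatable

def sum_of_uniforms(n):
    """One realization of X_1 + ... + X_n, each X_i uniform on [0, 1)."""
    return sum(random.random() for _ in range(n))

n = 30  # number of summands (an arbitrary choice)
sums = [sum_of_uniforms(n) for _ in range(20000)]

mean = statistics.fmean(sums)
sd = statistics.pstdev(sums)

# For a normal distribution, roughly 68% of the mass lies within one
# standard deviation of the mean; check how close the simulated sums come.
within_one_sd = sum(abs(s - mean) <= sd for s in sums) / len(sums)

print("mean:", mean, "(theory: n/2 =", n / 2, ")")
print("sd:", sd, "(theory: sqrt(n/12) =", (n / 12) ** 0.5, ")")
print("share within one sd:", within_one_sd)
```

Each entry of `sums` is a single number (one realization of the sum); the distribution only shows up when many such realizations are collected.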
An r.v. (random variable) is a relation between the occurrence of an event and a real number. Say, if it's raining the value of X is 1, and if it's not then 0. You can have another r.v. Y equal to 10 when it's cold, and 100 when it's hot. So, if it's raining and cold, then X=1, Y=10, and X+Y=11.
The values of X+Y are 10 (not raining, cold), 11 (raining, cold), 100 (not raining, hot), and 101 (raining, hot). If you figure out the probabilities of the events, then you'll get the PMF of this new r.v. X+Y.
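As a rough sketch of that last step, here is how the PMF of X+Y could be tabulated once probabilities are attached to the events. The numbers are assumptions for illustration only: P(raining) = 3/10, P(cold) = 6/10, and rain is taken to be independent of temperature, none of which comes from the example itself.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# ASSUMED probabilities, chosen only for illustration.
p_rain = {True: Fraction(3, 10), False: Fraction(7, 10)}
p_cold = {True: Fraction(6, 10), False: Fraction(4, 10)}

def X(raining):
    """1 if it is raining, 0 otherwise."""
    return 1 if raining else 0

def Y(cold):
    """10 if it is cold, 100 if it is hot."""
    return 10 if cold else 100

# Walk over every (raining, cold) combination and accumulate probability
# on the value that X + Y takes there. Multiplying the two probabilities
# is where the independence assumption enters.
pmf = defaultdict(Fraction)
for raining, cold in product([True, False], repeat=2):
    pmf[X(raining) + Y(cold)] += p_rain[raining] * p_cold[cold]
pmf = dict(sorted(pmf.items()))

for value, prob in pmf.items():
    print(value, prob)
```

Without independence, the product `p_rain[raining] * p_cold[cold]` would have to be replaced by a joint probability for each weather combination; the rest of the tabulation stays the same.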
None of these answers gives a mathematically rigorous way to think about the sum of random variables. Note that X and Y need not be defined on the same outcome domain, and even if they are, X+Y cannot be understood as summing up two functions. Rather, they should first be extended to the domain Ω1×Ω2. For example, let X and Y be identical functions on Ω = {Head, Tail}, where X(Head) = Y(Head) = 1 and X(Tail) = Y(Tail) = 0. The domain of X+Y should be {(Head, Tail), (Tail, Head), (Head, Head), (Tail, Tail)}. Now X and Y are functions on this product space whose values are determined solely by the 1st and 2nd coordinate, respectively. The sum can now be understood as a summation of functions in the usual sense. Note also that the σ-field and the probability measure should be defined anew. Saying that X and Y are independent is one way to specify the product measure.
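The product-space construction in the coin example can be written out explicitly. This is a minimal sketch that assumes a fair coin (the example does not state the probabilities) and uses independence to define the product measure; the helper names `indicator`, `measure`, and `S` are mine.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

omega = ["Head", "Tail"]
# ASSUMED fair coin; the probabilities are not given in the example.
p = {"Head": Fraction(1, 2), "Tail": Fraction(1, 2)}

def indicator(outcome):
    """The original function on omega: 1 for Head, 0 for Tail."""
    return 1 if outcome == "Head" else 0

# Product space omega x omega with the product measure (independence).
product_space = list(product(omega, omega))
measure = {(a, b): p[a] * p[b] for a, b in product_space}

# X and Y extended to the product space: each reads only one coordinate.
def X(w):
    return indicator(w[0])

def Y(w):
    return indicator(w[1])

# On the common domain, the sum is ordinary pointwise addition of functions.
def S(w):
    return X(w) + Y(w)

# Push the product measure forward through S to get the PMF of the sum.
pmf_sum = defaultdict(Fraction)
for w in product_space:
    pmf_sum[S(w)] += measure[w]
pmf_sum = dict(sorted(pmf_sum.items()))

print(pmf_sum)
```

Choosing a different measure on the same four pairs (for instance, putting all the mass on (Head, Head) and (Tail, Tail)) would model dependent X and Y and give a different PMF for the sum.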