Constructing a discrete r.v. having as support all the rationals in $[0,1]$


This is the constructivist sequel of this question.

If we cannot have a discrete uniform random variable having as support all the rationals in the interval $[0,1]$, then the next best thing is:

Construct a random variable $Q$ that has this support, $Q\in\mathbb{Q}\cap[0,1]$, and that follows some distribution. And the craftsman in me requires that this random variable is constructed from existing distributions, rather than created by abstractly defining what we desire to obtain.

So I came up with the following:

Let $X$ be a discrete random variable following the Geometric distribution, Variant II, with parameter $0<p<1$, namely

$$X\in\{0,1,2,\dots\},\qquad P(X=k)=(1-p)^k p,\qquad F_X(k)=1-(1-p)^{k+1}$$

Let also $Y$ be a discrete random variable following the Geometric distribution, Variant I, with the same parameter $p$, namely

$$Y\in\{1,2,\dots\},\qquad P(Y=k)=(1-p)^{k-1}p,\qquad F_Y(k)=1-(1-p)^k$$

$X$ and $Y$ are independent. Define now the random variable

$$Q=\frac{X}{Y}$$

and consider the conditional distribution

$$P(Q\le q\mid\{X\le Y\})$$

In loose words, "conditional $Q$ is the ratio of $X$ over $Y$ conditional on $X$ being smaller than or equal to $Y$". The support of this conditional distribution is $\{0,1,1/2,1/3,\dots,1/k,1/(k+1),\dots,2/3,2/4,\dots\}=\mathbb{Q}\cap[0,1]$.
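The construction can be explored by simulation. The following is a minimal sketch, not part of the original post: it draws $X$ and $Y$ with R's `rgeom` (which counts failures before the first success, so its support starts at 0 and matches Variant II; adding 1 gives Variant I) and keeps only the pairs with $X\le Y$. The helper name `simulate_Q` is hypothetical.

```r
# Hypothetical helper: draw n values of Q = X/Y conditional on X <= Y
# by rejection sampling.
simulate_Q <- function(n, p) {
  stopifnot(n >= 1, p > 0, p < 1)
  out <- numeric(0)
  while (length(out) < n) {
    m <- 2 * n
    x <- rgeom(m, p)        # Variant II: support {0, 1, 2, ...}
    y <- rgeom(m, p) + 1    # Variant I:  support {1, 2, 3, ...}
    keep <- x <= y          # conditioning event {X <= Y}
    out <- c(out, x[keep] / y[keep])
  }
  out[seq_len(n)]
}

set.seed(1)
q <- simulate_Q(1e5, p = 0.3)
range(q)                                  # stays inside [0, 1]
head(sort(table(q), decreasing = TRUE))   # most frequent simulated values
```

Equivalent ratios such as 1/2 and 2/4 produce the same floating-point value, so `table` groups them into a single support point.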

The "question" is: Can anyone please provide the associated conditional probability mass function?

A comment asked "should it be closed-form"? Since what constitutes a closed form nowadays is not so clear-cut, let me put it this way: we are looking for a functional form into which we can input a rational number from $[0,1]$ and obtain the probability (for some specified value of the parameter $p$, of course), leading to an indicative graph of the pmf. And then vary $p$ to see how the graph changes.
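One way to read this requirement, sketched here under the assumption that a numerically truncated infinite series is acceptable in place of a closed form: for $a/b$ in lowest terms, the event $\{X/Y=a/b\}$ is the disjoint union of the events $\{X=na,\,Y=nb\}$ over $n=1,2,\dots$, so by independence the conditional mass is $\sum_n P(X=na)P(Y=nb)\,/\,P(X\le Y)$. The helper name `cond_pmf` and the truncation point `n_max` are illustrative choices, not part of the original post.

```r
# Hypothetical helper: P(Q = a/b | X <= Y) for integers 0 <= a <= b, b >= 1.
# Both infinite sums are truncated at n_max terms (an assumption; the
# neglected tail is tiny unless p is very close to 0).
cond_pmf <- function(a, b, p, n_max = 10000) {
  stopifnot(b >= 1, a >= 0, a <= b, p > 0, p < 1)
  gcd <- function(u, v) if (v == 0) u else gcd(v, u %% v)
  g <- gcd(a, b)
  a <- a / g                     # reduce a/b to lowest terms
  b <- b / g
  n <- seq_len(n_max)
  # P(X <= Y) = sum over y of P(Y = y) * P(X <= y)
  p_cond <- sum(dgeom(n - 1, p) * pgeom(n, p))
  # P(X = n a, Y = n b) summed over n; for a = 0 this is X = 0 with any Y
  joint <- sum(dgeom(n * a, p) * dgeom(n * b - 1, p))
  joint / p_cond
}

cond_pmf(1, 2, p = 0.5)   # mass at 1/2
cond_pmf(2, 4, p = 0.5)   # same support point, same mass
```

Evaluating `cond_pmf` over all reduced fractions with small denominators, for a few values of $p$, gives the kind of indicative plot described above.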

If it helps, then we can make open one or both bounds of the support, although these variants will deprive us of the ability to definitely graph the upper and/or lower values of the pmf. Also, if we make the upper bound open, then we should consider the conditioning event $\{X<Y\}$.

Alternatively, I also welcome other r.v.'s that have this (these) support(s), as long as they come together with their pmf.

I used the Geometric distribution because it has two readily available variants, one of which does not include zero in the support (so that division by zero is avoided). Obviously, one can use other discrete r.v.'s, using some truncation.

I most certainly will put a bounty on this question, but the system does not immediately permit this.

Alecos Papadopoulos
Do you mean $Q=\frac{X}{Y}\mathbf{1}_{\{X\le Y\}}$? (Defining a random variable conditionally on something doesn't make sense; one could only define its distribution this way.)
Stéphane Laurent
Your Q is countable: you know there exists a 1-1 correspondence between N = {1, 2, ...} and Q. If you could find such a correspondence, the solution would be to choose any distribution over N and use it to pick the corresponding element of Q.
Adrian
Anyway, you have to calculate $\Pr(X/Y=p/q)$ for every irreducible fraction $p/q$, and this is $\Pr(X=p,X=2p,\ldots)\times\Pr(Y=q,Y=2q,\ldots)$.
Stéphane Laurent
Does the requirement to provide the pmf mean that a closed form is required? Or is, e.g., @StéphaneLaurent's infinite sum sufficient to meet the condition?
Juho Kokkala
Let $f:\mathbb{N}\to\mathbb{Q}\cap[0,1]$ be a one-to-one correspondence and $Y$ the RV in your post. Then $\Pr[Q=q]=\Pr[Y=f^{-1}(q)]$.
Adrian

Answers:

19

Consider the discrete distribution $F$ with support on the set $\{(p,q)\mid q\ge p\ge 1\}\subset\mathbb{N}^2$ with probability masses

$$F(p,q)=\frac{3}{2^{1+p+q}}.$$

This is easily summed (all series involved are geometric) to demonstrate it really is a distribution (the total probability is unity).
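Spelled out for $F(p,q)=3/2^{1+p+q}$ on $q\ge p\ge 1$, the summation runs as follows: the inner sum over $q$ is geometric, and so is the remaining sum over $p$.

```latex
\sum_{p=1}^{\infty}\sum_{q=p}^{\infty}\frac{3}{2^{1+p+q}}
  = \frac{3}{2}\sum_{p=1}^{\infty}2^{-p}\sum_{q=p}^{\infty}2^{-q}
  = \frac{3}{2}\sum_{p=1}^{\infty}2^{-p}\cdot 2^{1-p}
  = 3\sum_{p=1}^{\infty}4^{-p}
  = 3\cdot\frac{1}{3}
  = 1 .
```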

For any nonzero rational number x let a/b=x be its representation in lowest terms: that is, b>0 and gcd(a,b)=1.

$F$ induces a discrete distribution $G$ on $[0,1]\cap\mathbb{Q}$ via the rules

$$G(x)=G\left(\frac{a}{b}\right)=\sum_{n=1}^{\infty}F(an,bn)=\frac{3}{2^{1+a+b}-2}.$$

(and G(0)=0). Every rational number in (0,1] has nonzero probability. (If you must include 0 among the values with positive probability, just take some of the probability away from another number--like 1--and assign it to 0.)
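The closed form for $G$ follows from one more geometric series, with ratio $2^{-(a+b)}$, applied to $F(an,bn)=3/2^{1+(a+b)n}$:

```latex
G\!\left(\frac{a}{b}\right)
  = \sum_{n=1}^{\infty}\frac{3}{2^{1+(a+b)n}}
  = \frac{3}{2}\sum_{n=1}^{\infty}\left(2^{-(a+b)}\right)^{n}
  = \frac{3}{2}\cdot\frac{2^{-(a+b)}}{1-2^{-(a+b)}}
  = \frac{3}{2\left(2^{a+b}-1\right)}
  = \frac{3}{2^{1+a+b}-2} .
```

For example, $a=b=1$ gives $G(1)=3/(8-2)=1/2$.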

To understand this construction, look at this depiction of F:

[Figure of F]

F gives probability masses at all points $(p,q)$ with positive integral coordinates. Values of F are represented by the colored areas of circular symbols. The lines have slopes p/q for all possible combinations of coordinates p and q appearing in the plot. They are colored in the same way the circular symbols are: according to their slopes. Thus, slope (which clearly ranges from 0 through 1) and color correspond to the argument of G and the values of G are obtained by summing the areas of all circles lying on each line. For instance, G(1) is obtained by summing the areas of all the (red) circles along the main diagonal of slope 1, given by $F(1,1)+F(2,2)+F(3,3)+\cdots=3/8+3/32+3/128+\cdots=1/2$.

Figure

This figure shows an approximation to G achieved by limiting $q\le 100$: it plots its values at 3044 rational numbers ranging from 1/100 through 1. The largest probability masses are $\frac{1}{2},\frac{3}{14},\frac{1}{10},\frac{3}{62},\frac{3}{62},\frac{1}{42},\ldots$.
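These numbers can be reproduced with a short enumeration. The sketch below is not the code behind the figure; it assumes the closed form $G(a/b)=3/(2^{1+a+b}-2)$ for $a/b$ in lowest terms, and the names `G`, `gcd`, `grid` and `q_max` are illustrative.

```r
# Mass of the reduced fraction a/b under G (closed form assumed from the answer).
G <- function(a, b) 3 / (2^(1 + a + b) - 2)

gcd <- function(u, v) if (v == 0) u else gcd(v, u %% v)

# All pairs 1 <= a <= b <= q_max, keeping only fractions in lowest terms.
q_max <- 100
grid <- expand.grid(a = seq_len(q_max), b = seq_len(q_max))
grid <- grid[grid$a <= grid$b, ]
grid <- grid[mapply(gcd, grid$a, grid$b) == 1, ]
grid$mass <- G(grid$a, grid$b)

nrow(grid)                           # number of distinct rationals with b <= q_max
sum(grid$mass)                       # essentially 1: the omitted tail is tiny
head(grid[order(-grid$mass), ], 6)   # the six largest masses
```

Plotting `grid$mass` against `grid$a / grid$b` gives a picture of the same kind as the figure.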

Here is the full CDF of G (accurate to the resolution of the image). The six numbers just listed give the sizes of the visible jumps, but every part of the CDF consists of jumps, without exception:

Figure 2

whuber
Thanks! I am in the process of understanding the construction. Just two questions: a) F is bivariate, but in the expression linking it to G it appears as univariate. Am I missing something? and b) Since G is univariate, I guess all the dots in the impressively looking first graph represent a different value on the horizontal axis (although of course this cannot be faithfully represented in such a scale), am I right?
Alecos Papadopoulos
I was just completing a figure that might address your comment, Alecos, and have added it to the answer. Note that I could have begun with any discrete distribution F and constructed G in the same way; this particular distribution was chosen to make the calculations easy.
whuber
Gets better and better. As for my first question in the previous comment, should it be $F(\frac{a}{b},n)$ instead of $F(\frac{a}{b}n)$? I.e. that $p=a/b$ and $q=n$?
Alecos Papadopoulos
This is a better answer than mine! I noticed two little things: I think your F(p, q) sums to 4 as written. Also in the equation below "F induces a discrete distribution G" you should have F(n a, n b) no?
Adrian
@Adrian, Alecos Thanks for catching those typos: the $-1$ should be a $1$ and the notation for F obviously is incorrect. I'll fix them right away.
whuber
8

I'll put my comments together and post them as an answer just for clarity. I expect you won't be very satisfied, however, as all I do is reduce your problem to another problem.

My notation:

$Q$ is a RV whose support is $\mathbb{Q}\cap[0,1]$ -- my $Q$ is not the same as the $Q$ the OP constructs from his $\frac{X}{Y}$. We'll define this $Q$ using $Y$ and $f$, which I introduce below.

$Y$ is any RV whose support is $\mathbb{N}\equiv\{1,2,\ldots\}$ -- the $Y$ given by the OP would work, for example.

$f$ is any one-to-one correspondence $f:\mathbb{N}\to\mathbb{Q}\cap[0,1]$ and $f^{-1}$ is its inverse. We know these exist.

Now I claim I can reduce your problem to just finding an f and its f1:

Just let $Q=f(Y)$ and you are done. The PMF of $Q$ is $\Pr[Q=q]=\Pr[Y=f^{-1}(q)]$.

Edit:

Here is a function g that plays the role of f, despite not being a one-to-one correspondence (because of duplicates):

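## Enumerate a/b with 0 <= a <= b row by row (b = 1, 2, 3, ...) and return the y-th entry.
## Equivalent fractions such as 1/2 and 2/4 both appear, so g is onto but not one-to-one.
g <- function(y) {
    y <- as.integer(y)
    stopifnot(y >= 1)
    b <- 0
    a <- 0
    for (unused_index in seq_len(y)) {
        if (a >= b) {
            ## end of the current row: move on to the next denominator
            b <- b+1
            a <- 0
        } else {
            a <- a+1
        }
    }
    return(sprintf("q = %s / %s", a, b))
    ## return(a / b)
}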
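One possible duplicate-free variant, sketched here as an assumption rather than as the answerer's method, is to enumerate only fractions in lowest terms: 0/1, 1/1, 1/2, 1/3, 2/3, 1/4, 3/4, and so on. The names `f`, `f_inv` and `pmf_Q` are hypothetical, and `f_inv` is computed by counting rather than by a closed formula. The last helper takes $Y$ to be Geometric (Variant I) with parameter `p`, which is only one admissible choice.

```r
gcd <- function(u, v) if (v == 0) u else gcd(v, u %% v)

# Hypothetical f: the y-th element of 0/1, 1/1, 1/2, 1/3, 2/3, 1/4, 3/4, ...
f <- function(y) {
  stopifnot(y >= 1)
  if (y <= 2) return(c(a = y - 1, b = 1))
  count <- 2
  b <- 1
  repeat {
    b <- b + 1
    for (a in seq_len(b - 1)) {
      if (gcd(a, b) == 1) {
        count <- count + 1
        if (count == y) return(c(a = a, b = b))
      }
    }
  }
}

# Hypothetical f_inv: position of a/b (given in lowest terms) in that enumeration.
f_inv <- function(a, b) {
  stopifnot(b >= 1, a >= 0, a <= b, gcd(a, b) == 1)
  if (b == 1) return(a + 1)
  count <- 2
  for (d in seq_len(b)[-1]) {            # denominators 2, ..., b
    top <- if (d < b) d - 1 else a       # last row stops at numerator a
    for (k in seq_len(top)) if (gcd(k, d) == 1) count <- count + 1
  }
  count
}

# Pr[Q = a/b] = Pr[Y = f_inv(a, b)], with Y ~ Geometric (Variant I), parameter p.
pmf_Q <- function(a, b, p) dgeom(f_inv(a, b) - 1, p)

f(7)                  # a = 3, b = 4
pmf_Q(3, 4, p = 0.5)  # Pr[Y = 7]
```

The resulting masses depend on the enumeration order, so a different ordering of the rationals gives a different distribution on the same support.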
Adrian
(+1) No, I consider your approach an excellent example of how one can think and use the abstract approach in order to arrive at very applicable results and algorithms. The main point, as I now understand it, is that one can obtain the desired construction by using as a functional form the pmf of any discrete distribution having support $\mathbb{N}\equiv\{1,2,\ldots\}$. Of course it remains to find $f$ and $f^{-1}$. Since you have a better understanding of this approach than I do, is the phrase "we know these exist" a polite way to say "but we have no idea what they look like"? :)
Alecos Papadopoulos
See jcu.edu/math/vignettes/infinity.htm: you could use a similar "diagonal pattern". The difficult part is getting an expression for $f^{-1}$. I'm not sure how to do that, but you could ask on math.stackexchange.com (or do some more googling first).
Adrian
In the link you provided it says at some point: "Note that it is not necessary to find a formula for the correspondence; all that is necessary is the certainty that such a correspondence exists. There are many other instances in mathematics that are like this--where the point is to show that something has to happen or that something exists, rather than to actually exhibit a formula." Well, the point in my question is to actually exhibit a formula : I called this question "constructivist" for a reason.
Alecos Papadopoulos
I think I can provide an algorithm that would work -- I'll think about it a bit more.
Adrian
I posted something -- lets you simulate Q, but doesn't solve the PMF issue.
Adrian