What is the distribution of

17

I have four independent uniformly distributed variables $a,b,c,d$, each on $[0,1]$. I want to calculate the distribution of $(a-d)^2+4bc$. I calculated the distribution of $u_2=4bc$ to be

$f_2(u_2)=-\frac{1}{4}\ln\frac{u_2}{4}$

(hence $u_2\in(0,4]$), and that of $u_1=(a-d)^2$ to be

$f_1(u_1)=\frac{1-\sqrt{u_1}}{\sqrt{u_1}}.$

Now, the distribution of the sum $u_1+u_2$ is ($u_1,\, u_2$ are also independent)

$f_{u_1+u_2}(x)=\int_{-\infty}^{+\infty}f_1(x-y)f_2(y)dy=-\frac{1}{4}\int_0^4\frac{1-\sqrt{x-y}}{\sqrt{x-y}}\cdot\ln\frac{y}{4}dy,$

because $y\in(0,4]$. Here we must have $x>y$, so the integral equals

$f_{u_1+u_2}(x)=-\frac{1}{4}\int_0^{x}\frac{1-\sqrt{x-y}}{\sqrt{x-y}}\cdot\ln\frac{y}{4}dy.$

Now I plug it into Mathematica and get

$f_{u_1+u_2}(x)=\frac{1}{4}\left[-x+x\ln\frac{x}{4}-2\sqrt{x}\left(-2+\ln x\right)\right].$

I generated four independent sets $a,b,c,d$ of $10^6$ numbers each and drew a histogram of $(a-d)^2+4bc$:

[Figure: histogram of $(a-d)^2+4bc$]

and plotted a graph of $f_{u_1+u_2}(x)$:

[Figure: plot of $f_{u_1+u_2}(x)$]

In general, the graph resembles the histogram, but on the interval $(0,5)$ most of it is negative (the root is at 2.27034). And the integral of the positive part is $\approx 0.77$.

Where is the mistake? Or what am I missing?

EDIT: I scaled the histogram to show the PDF.

[Figure: histogram rescaled to a PDF]

EDIT 2: I think I know where the problem in my reasoning is: in the integration limits. Because $y\in (0,4]$ and $x-y\in(0,1]$, I cannot simply use $\int_0^x$. The plot shows the region I have to integrate over:

[Figure: region of integration]

This means that I have $\int_0^x$ for $x\in(0,1]$ (which is why part of my $f$ was correct), $\int_{x-1}^x$ for $x\in(1,4]$ and $\int_{x-1}^4$ for $x\in (4,5]$. Unfortunately, Mathematica fails to compute the latter two integrals (well, it does compute the second, but there is an imaginary unit in the output that spoils everything...).

EDIT 3: It turns out that Mathematica CAN compute the last three integrals with the following code:

(1/4)*Integrate[((1-Sqrt[u1-u2])*Log[4/u2])/Sqrt[u1-u2],{u2,0,u1}, Assumptions ->0 <= u2 <= u1 && u1 > 0]

(1/4)*Integrate[((1-Sqrt[u1-u2])*Log[4/u2])/Sqrt[u1-u2],{u2,u1-1,u1}, Assumptions -> 1 <= u2 <= 3 && u1 > 0]

(1/4)*Integrate[((1-Sqrt[u1-u2])*Log[4/u2])/Sqrt[u1-u2],{u2,u1-1,4}, Assumptions -> 4 <= u2 <= 4 && u1 > 0]

which gives a correct answer :)
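Written out, the case split described in EDIT 2 can be sketched as follows. This is only a restatement of the integration limits under the constraints $0<y\le 4$ and $0<x-y\le 1$, with $x$ denoting the value of the sum. The shorthand $k(x,y)$ for the integrand is introduced here for brevity and does not appear in the post.

```latex
k(x,y) = \frac{1-\sqrt{x-y}}{\sqrt{x-y}}\,\ln\frac{y}{4},
\qquad
f_{u_1+u_2}(x) = -\frac{1}{4}\int_{\max(0,\,x-1)}^{\min(x,\,4)} k(x,y)\,dy
=
\begin{cases}
-\dfrac{1}{4}\displaystyle\int_{0}^{x} k(x,y)\,dy, & 0 < x \le 1,\\[2ex]
-\dfrac{1}{4}\displaystyle\int_{x-1}^{x} k(x,y)\,dy, & 1 < x \le 4,\\[2ex]
-\dfrac{1}{4}\displaystyle\int_{x-1}^{4} k(x,y)\,dy, & 4 < x \le 5.
\end{cases}
```

The first case coincides with the original single integral, which is consistent with the remark that part of the first result was correct.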

distributions random-variable pdf uniform mathematica corey979

2

I like that you tried to check the reasonableness of your answer by simulation. Your problem is that you know you have made a mistake, but cannot see exactly where. Have you considered that you can check each stage of your method to troubleshoot where the error lies? For example, does the error reside in your $f_1(u_1)$? Well, you can check your calculated PDF against the simulated results, just as you did for your final answer. Same for $f_2$. If $f_1$ and $f_2$ are both correct, then you made the error when combining them. Such step-by-step checking lets you pinpoint where you went wrong!

Silverfish
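The stage-by-stage check suggested in this comment can be sketched in a few lines. The following is a minimal Python sketch using only the standard library, not the procedure anyone in the thread actually ran. It assumes the question's candidate densities $f_1(u)=1/\sqrt{u}-1$ on $(0,1]$ and $f_2(u)=-\frac14\ln\frac u4$ on $(0,4]$, integrates them by hand to the CDFs $2\sqrt{u}-u$ and $\frac u4\left(1-\ln\frac u4\right)$, and compares those with empirical frequencies. The function names, test points, sample size, and seed are illustrative choices.

```python
import math
import random

def cdf_u1(u):
    """CDF implied by f1(u) = 1/sqrt(u) - 1 on (0, 1]: 2*sqrt(u) - u."""
    return 2.0 * math.sqrt(u) - u

def cdf_u2(u):
    """CDF implied by f2(u) = -(1/4) ln(u/4) on (0, 4]: (u/4) * (1 - ln(u/4))."""
    return (u / 4.0) * (1.0 - math.log(u / 4.0))

def empirical_cdf(samples, t):
    """Fraction of samples that are <= t."""
    return sum(1 for s in samples if s <= t) / len(samples)

def max_cdf_gap(samples, cdf, points):
    """Largest absolute gap between empirical and candidate CDF at the given points."""
    return max(abs(empirical_cdf(samples, t) - cdf(t)) for t in points)

rng = random.Random(0)   # fixed seed (an arbitrary choice) for reproducibility
n = 200_000              # sample size (an arbitrary choice)

# Simulate each stage separately: u1 = (a - d)^2 and u2 = 4 b c.
u1 = [(rng.random() - rng.random()) ** 2 for _ in range(n)]
u2 = [4.0 * rng.random() * rng.random() for _ in range(n)]

gap1 = max_cdf_gap(u1, cdf_u1, [0.05, 0.25, 0.5, 0.75])
gap2 = max_cdf_gap(u2, cdf_u2, [0.5, 1.0, 2.0, 3.0])
print(f"largest CDF gap for u1: {gap1:.4f}")
print(f"largest CDF gap for u2: {gap2:.4f}")
```

Small gaps for both stages would point to the combination step, rather than $f_1$ or $f_2$, as the place to look for the error.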

I threw out my first attempt and recalculated it from scratch. I think that $f_1$ and $f_2$ are correct, although I had to manually multiply my initial $f_1$ by 2 to normalize it to unity. But that only changes the height and does not explain why I get a negative $f$.

corey979

When generating such histograms to compare with algebraically calculated quantities, scale the histogram so that it is a valid density (and superimpose them if you can). Do a similar check on your $f_1$ and $f_2$ to make sure you have those right; if they are right (I did not see any good reason to suspect them yet, but it is best to double check), then the problem must lie later.

Glen_b -Reinstate Monica

19

Often it helps to use cumulative distribution functions.

First,

$F(x) = \Pr((a-d)^2 \le x) = \Pr(|a-d| \le \sqrt{x}) = 1 - (1-\sqrt{x})^2 = 2\sqrt{x} - x.$
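The third equality is an area computation on the unit square, which can be spelled out as follows. The symbol $t=\sqrt{x}$ is introduced only for this sketch. Since $(a,d)$ is uniform on $[0,1]^2$, probabilities are areas, and the complement of $|a-d|\le t$ consists of two right triangles with legs of length $1-t$.

```latex
\Pr(|a-d| > t)
  = \underbrace{\tfrac{1}{2}(1-t)^2}_{a-d>t} + \underbrace{\tfrac{1}{2}(1-t)^2}_{d-a>t}
  = (1-t)^2, \qquad 0 \le t \le 1,
\\[1ex]
F(x) = 1-(1-\sqrt{x})^2 = 2\sqrt{x}-x,
\qquad
F'(x) = \frac{1}{\sqrt{x}} - 1 = \frac{1-\sqrt{x}}{\sqrt{x}}, \qquad 0 < x \le 1.
```

The derivative agrees with the density $f_1$ stated in the question.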

Next,

$G(y) = \Pr(4 b c \le y) = \Pr(b c \le \frac{y}{4}) = \int_0^{y/4} dt + \int_{y/4}^1\frac{y\,dt}{4t} = \frac{y}{4}\left(1 - \log\left(\frac{y}{4}\right)\right).$

Let $\delta$ range between the smallest ( $0$ ) and largest ( $5$ ) possible values of $(a-d)^2 + 4 b c$ . Writing $x=(a-d)^2$ with CDF $F$ and $y=4 b c$ with PDF $g = G^\prime$ , we need to compute

$H(\delta) = \Pr((a-d)^2 + 4 b c \le \delta) = \Pr(x\le \delta-y) = \int_0^4 F(\delta-y)g(y)dy.$
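To see where the pieces come from, the integrand can be written out explicitly. The following is a sketch using only $F$ and $G$ as defined above, with $F$ extended by $0$ below its support and by $1$ above it. It is one way to organize the integral, not necessarily how Mathematica evaluates it.

```latex
g(y) = G'(y) = \frac{1}{4}\left(1-\log\frac{y}{4}\right) - \frac{y}{4}\cdot\frac{1}{y}
     = -\frac{1}{4}\log\frac{y}{4}, \qquad 0 < y \le 4,
\\[1ex]
F(\delta-y) =
\begin{cases}
1, & y \le \delta-1,\\
2\sqrt{\delta-y}-(\delta-y), & \delta-1 < y \le \delta,\\
0, & y > \delta,
\end{cases}
\\[1ex]
H(\delta) = G\bigl(\max(0,\delta-1)\bigr)
  + \int_{\max(0,\,\delta-1)}^{\min(\delta,\,4)}
    \Bigl(2\sqrt{\delta-y}-(\delta-y)\Bigr)\, g(y)\,dy .
```

The limits $\max(0,\delta-1)$ and $\min(\delta,4)$ change form at $\delta=1$ and $\delta=4$, which is why the result splits into three intervals.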

We can expect this to be nasty--the uniform distribution PDF is discontinuous and thus ought to produce breaks in the definition of $H$ --so it is somewhat amazing that Mathematica obtains a closed form (which I will not reproduce here). Differentiating it with respect to $\delta$ gives the desired density. It is defined piecewise within three intervals. In $0 \lt \delta \lt 1$ ,

$H^\prime(\delta) = h(\delta) = \frac{1}{8} \left(8 \sqrt{\delta }+\delta (-(2+\log (16)))+2 \left(\delta -2 \sqrt{\delta }\right) \log (\delta )\right).$

In $1 \lt \delta \lt 4$ ,

$h(\delta) = \frac{1}{4} \left(-(\delta +1) \log (\delta -1)+\delta \log (\delta )-4 \sqrt{\delta } \coth ^{-1}\left(\sqrt{\delta }\right)+3+\log (4)\right).$

And in $4 \lt \delta \lt 5$ ,

$h(\delta) = \frac{1}{4}\left(\delta -4 \sqrt{\delta -4}+(\delta +1) \log \left(\frac{4}{\delta -1}\right)+4 \sqrt{\delta } \tanh ^{-1}\left(\frac{\sqrt{(\delta -4) \delta }-\sqrt{\delta }}{\delta -\sqrt{\delta -4}}\right)-1\right).$

This figure overlays a plot of $h$ on a histogram of $10^6$ iid realizations of $(a-d)^2 + 4bc$ . The two are almost indistinguishable, suggesting the correctness of the formula for $h$ .
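A similar check can be run without Mathematica. The following is a minimal Python sketch (standard library only) that transcribes the three pieces of $h$ quoted above, integrates them with a midpoint rule, and compares the mass below $\delta = 1$ with a simulation. The helper names, step count, sample size, and seed are assumptions made for this illustration, and $\coth^{-1}(z)$ is evaluated as $\tanh^{-1}(1/z)$ for $z>1$.

```python
import math
import random

def h(delta):
    """Piecewise density of (a-d)^2 + 4bc as quoted above; zero outside (0, 5)."""
    if delta <= 0.0 or delta >= 5.0:
        return 0.0
    r = math.sqrt(delta)
    if delta <= 1.0:
        return (8 * r - delta * (2 + math.log(16))
                + 2 * (delta - 2 * r) * math.log(delta)) / 8
    if delta <= 4.0:
        acoth = math.atanh(1.0 / r)  # arccoth(r) = artanh(1/r) for r > 1
        return (-(delta + 1) * math.log(delta - 1) + delta * math.log(delta)
                - 4 * r * acoth + 3 + math.log(4)) / 4
    s = math.sqrt(delta - 4.0)
    arg = (math.sqrt((delta - 4.0) * delta) - r) / (delta - s)
    return (delta - 4 * s + (delta + 1) * math.log(4 / (delta - 1))
            + 4 * r * math.atanh(arg) - 1) / 4

def midpoint(f, lo, hi, steps=20_000):
    """Midpoint-rule integral of f over [lo, hi]; never evaluates the endpoints."""
    width = (hi - lo) / steps
    return width * sum(f(lo + (k + 0.5) * width) for k in range(steps))

# Integrate piece by piece so the breakpoints 1 and 4 fall on cell edges.
mass_below_1 = midpoint(h, 0.0, 1.0)
total = mass_below_1 + midpoint(h, 1.0, 4.0) + midpoint(h, 4.0, 5.0)

# Monte Carlo estimate of Pr((a-d)^2 + 4bc <= 1).
rng = random.Random(1)   # fixed seed (an arbitrary choice) for reproducibility
n = 200_000              # sample size (an arbitrary choice)
z = [(rng.random() - rng.random()) ** 2 + 4.0 * rng.random() * rng.random()
     for _ in range(n)]
frac_below_1 = sum(1 for v in z if v <= 1.0) / n

print(f"integral of h over (0, 5): {total:.4f}")
print(f"mass below 1, formula vs simulation: {mass_below_1:.4f} vs {frac_below_1:.4f}")
```

If the transcription is right, the total should be close to 1 and the two estimates of the mass below 1 should agree to within simulation error.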

The following is a nearly mindless, brute-force Mathematica solution. It automates practically everything about the calculation. For instance, it will even compute the range of the resulting variable:

ClearAll[ a, b, c, d, ff, gg, hh, g, h, x, y, z, zMin, zMax, assumptions];
assumptions = 0 <= a <= 1 && 0 <= b <= 1 && 0 <= c <= 1 && 0 <= d <= 1; 
zMax = First@Maximize[{(a - d)^2 + 4 b c, assumptions}, {a, b, c, d}];
zMin = First@Minimize[{(a - d)^2 + 4 b c, assumptions}, {a, b, c, d}];

Here is all the integration and differentiation. (Be patient; computing $H$ takes a couple of minutes.)

ff[x_] := Evaluate@FullSimplify@Integrate[Boole[(a - d)^2 <= x], {a, 0, 1}, {d, 0, 1}];
gg[y_] := Evaluate@FullSimplify@Integrate[Boole[4 b c <= y], {b, 0, 1}, {c, 0, 1}];
g[y_]  := Evaluate@FullSimplify@D[gg[y], y];
hh[z_] := Evaluate@FullSimplify@Integrate[ff[-y + z] g[y], {y, 0, 4}, 
          Assumptions -> zMin <= z <= zMax];
h[z_]  :=  Evaluate@FullSimplify@D[hh[z], z];

Finally, a simulation and comparison to the graph of $h$ :

x = RandomReal[{0, 1}, {4, 10^6}];
x = (x[[1, All]] - x[[4, All]])^2 + 4 x[[2, All]] x[[3, All]];
Show[Histogram[x, {.1}, "PDF"], 
 Plot[h[z], {z, zMin, zMax}, Exclusions -> {1, 4}], 
 AxesLabel -> {"\[Delta]", "Density"}, BaseStyle -> Medium, 
 Ticks -> {{{0, "0"}, {1, "1"}, {4, "4"}, {5, "5"}}, Automatic}]

whuber

8

(+1), especially for reminding people that, instead say of density convolutions, "Often it helps to use cumulative distribution functions" -especially when they have such a simple form as here. And you were damn quick, also.

Alecos Papadopoulos

That looks like a neat solution that I'd love to accept - right after I understand it. I'm more a calculus man than a probabilist; at this moment I have three questions: i) how did you use the CDF to get $F(x)$ and $G(y)$, ii) why are there $F$ and $g$ under the integral for $H$, and iii) how do you know from its form that the result will be piecewise?

corey979

(1) $F$ and $G$ are the CDFs. They are computed from the definition of a CDF, as indicated by the first equalities following their first appearances. The details should be apparent in the code I have inserted. (2) This is the convolution formula for a sum (more fully explained in a similar calculation at stats.stackexchange.com/a/144237). (3) I inserted a link to another thread about properties of uniform distributions.

whuber

7

Like the OP and whuber, I would use independence to break this up into simpler problems:

Let $X = (a-d)^2$ . Then the pdf of $X$ , say $f(x)$ is:

Let $Y = 4 b c$ . Then the pdf of $Y$ , say $g(y)$ is:
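The two formulas referred to here are not reproduced in the text. Under the derivations given earlier in the thread, the densities would be as follows; this is a reconstruction from those results, and the form displayed in the original answer may differ.

```latex
f(x) = \frac{1}{\sqrt{x}} - 1, \qquad 0 < x < 1,
\\[1ex]
g(y) = -\frac{1}{4}\log\frac{y}{4} = \frac{1}{4}\log\frac{4}{y}, \qquad 0 < y < 4.
```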

The problem reduces to now finding the pdf of $X + Y$ . There may be many ways of doing this, but the simplest for me is to use a function called TransformSum from the current developmental version of mathStatica. Unfortunately, this is not available in a public release at the present time, but here is the input:

TransformSum[{f,g}, z]

which returns the pdf of $Z = X + Y$ as the piecewise function:

Here is a plot of the pdf just derived, say $h(z)$ :

Quick Monte Carlo check

The following diagram compares an empirical Monte Carlo approximation of the pdf (squiggly blue) to the theoretical pdf derived above (red dashed). Looks fine.

wolfies
