Why are all known distributions unimodal?

13

I don't know of any multimodal distribution.

Why are all known distributions unimodal? Is there any "famous" distribution that has more than one mode?

Of course, mixtures of distributions are often multimodal, but I would like to know whether there are any "non-mixture" distributions that have more than one mode.

distributions mode Miroslav Sabo

55

You are talking about "standard" distributions rather than "known" distributions.

Stéphane Laurent

12

How about the Beta with $\alpha=\beta=0.5$?

amoeba says Reinstate Monica

1

If you don't mind bounded bimodal distributions, Wikipedia also mentions the U-quadratic and arcsine distributions. However, I think these are just special cases of the beta distribution ... Wikipedia also mentions some examples of natural occurrences of multimodal distributions.

Nick Stauner

12

@StéphaneLaurent: I like "brand-name distributions", because it conveys that having been named does not in itself imply any special status for a distribution. "Known" distributions makes it sound as if the rest may be out there somewhere waiting to be discovered, like the Loch Ness monster or dark matter.

Scortchi - Reinstate Monica

55

Excellent @Scortchi, excellent vocabulary! Many non-mathematical scientists I have met are under the impression that there is no such thing as a distribution without a name. Perhaps there is a deeper, related philosophical fact behind that: the confusion of a name with the thing denoted by that name (as Russell said, "The word 'dog' does not look like a dog").

Stéphane Laurent

17

The first part of the question is answered in the comments to the question: plenty of "brand name" distributions are multimodal, such as any Beta$(a,b)$ distribution with $a\lt 1$ and $b\lt 1$. Let us turn, then, to the second part of the question.
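As a small illustration of the Beta example, the following sketch evaluates the Beta$(0.5, 0.5)$ density at three points using only the standard library. The helper name `beta_pdf` and the evaluation points are choices made here for illustration. The density is lowest in the middle and grows toward both endpoints, which is the sense in which it has two modes.

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in (0, 1), from the textbook formula."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1.0) * (1.0 - x) ** (b - 1.0)

# Evaluate near the left endpoint, in the middle, and near the right endpoint.
near_left = beta_pdf(0.01, 0.5, 0.5)
middle = beta_pdf(0.5, 0.5, 0.5)
near_right = beta_pdf(0.99, 0.5, 0.5)

print(near_left, middle, near_right)
```

Note that both peaks sit at the boundary of the support, not in its interior.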

All discrete distributions are clearly mixtures (of atoms, which are unimodal).

I will show that most continuous distributions are also mixtures of unimodal distributions. The intuition behind this is simple: we can "sand off" the bumps from a bumpy graph of a PDF, one by one, until the graph is horizontal. The bumps become the components of the mixture, each of which is obviously unimodal.

Consequently, except perhaps for some unusual distributions whose PDFs are highly discontinuous, the answer to the question is "none": all multimodal distributions that are absolutely continuous, discrete, or a combination of those two are mixtures of unimodal distributions.

Consider continuous distributions $F$ whose PDFs $f$ are continuous (these are the "absolutely continuous" distributions). (Continuity is not much of a limitation; it can be relaxed further by a more careful analysis, assuming merely that the points of discontinuity are discrete.)

To cope with "plateaus" of constant values that may occur, define a "mode" to be an interval $m = [x_l, x_u]$ (which could be a single point, where $x_l=x_u$) such that

$f$ has a constant value on $m,$ say $y$.
$f$ is not constant on any interval strictly containing $m$.
There exists a positive number $\epsilon$ such that the maximum value of $f$ attained on $[x_l-\epsilon, x_u+\epsilon]$ is equal to $y$.

Let $m = [x_l, x_u]$ be any mode of $f$. Because $f$ is continuous, there are intervals $[x_l^\prime, x_u^\prime]$ containing $m$ for which $f$ is non-decreasing on $[x_l^\prime, x_l]$ (which is a proper interval, not just a point) and non-increasing on $[x_u, x_u^\prime]$ (which is also a proper interval). Let $x_l^\prime$ be the infimum of all such values and $x_u^\prime$ the supremum of all such values.

This construction has defined a "hump" on the graph of $f$ extending from $x_l^\prime$ to $x_u^\prime$. Let $y$ be the larger of $f(x_l^\prime)$ and $f(x_u^\prime)$. By construction, the set of points $x$ in $[x_l^\prime, x_u^\prime]$ for which $f(x)\ge y$ is a proper interval $m^\prime$ that strictly contains $m$ (because it contains either the whole of $[x_l^\prime, x_l]$ or $[x_u, x_u^\prime]$).

Figure

In this illustration of a multimodal PDF, a mode $m=[0,0]$ is identified by a red dot on the horizontal axis. The horizontal extent of the red portion of the fill is the interval $m^\prime$ : it is the base of the hump determined by the mode $m$ . The base of that hump is at height $y\approx 0.16$ . The original PDF is the sum of the red fill and the blue fill. Notice that the blue fill only has one mode near $2$ ; the original mode at $[0,0]$ has been removed.

Writing $|m^\prime|$ for the length of $m^\prime$ , define

$p_m = {\Pr}_F(m^\prime) - y|m^\prime|$

and

$f_m(x) = \frac{f(x) - y}{p_m}$

when $x \in m^\prime$ and $f_m(x)=0$ otherwise. (This makes $f_m$ a continuous function, incidentally.) The numerator is the amount by which $f$ rises above $y$ and the denominator $p_m$ is the area between the graph of $f$ and $y$ . Thus $f_m$ is non-negative and has total area $1$ : it is the PDF of a probability distribution. By construction it has a unique mode $m$ .

Also by construction, the function

$f_m^\prime(x) = \frac{f(x) - p_mf_m(x)}{1 - p_m}$

is a PDF provided $p_m\lt 1$. (If $p_m=1$, nothing is left of $f,$ which was therefore already unimodal.) It has no mode within $m^\prime$ (where it is constant, which is why the previous careful definition of a mode as an interval was necessary). Furthermore,

$f(x) = p_m f_m(x) + (1-p_m)f_m^\prime(x)$

is a mixture of the unimodal PDF $f_m$ and the PDF $f_m^\prime$ .
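One step of this construction can be sketched numerically on a grid. Everything specific below is an assumption made for illustration: the two-component normal mixture standing in for $f$ (it is not the PDF in the figure), the grid, and the names `f_m`, `f_rest` and `is_unimodal`. Integrals are approximated by Riemann sums, so the checks hold only up to discretization error.

```python
import math

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def f(x):
    # Hypothetical bimodal PDF chosen for illustration only.
    return 0.6 * normal_pdf(x, 0.0, 1.0) + 0.4 * normal_pdf(x, 4.0, 1.0)

h = 0.01                                   # grid spacing
xs = [-8.0 + h * i for i in range(2001)]   # grid covering [-8, 12]
fs = [f(x) for x in xs]
n = len(fs)

# Pick a mode m: here, the grid point where f is largest.
k = max(range(n), key=lambda i: fs[i])

# Extend left while f is non-decreasing and right while f is non-increasing.
lo = k
while lo > 0 and fs[lo - 1] <= fs[lo]:
    lo -= 1
hi = k
while hi < n - 1 and fs[hi + 1] <= fs[hi]:
    hi += 1

# y is the larger endpoint height; m' is where f >= y within [lo, hi].
y = max(fs[lo], fs[hi])
base = [i for i in range(lo, hi + 1) if fs[i] >= y]
a, b = base[0], base[-1]

# Weight p_m: area between the graph of f and the level y over m'.
p_m = sum(fs[i] - y for i in range(a, b + 1)) * h

# Unimodal component f_m and the remainder f_m' (called f_rest here).
f_m = [(fs[i] - y) / p_m if a <= i <= b else 0.0 for i in range(n)]
f_rest = [(fs[i] - p_m * f_m[i]) / (1.0 - p_m) for i in range(n)]

area_f_m = sum(f_m) * h
area_rest = sum(f_rest) * h
max_identity_error = max(
    abs(p_m * f_m[i] + (1.0 - p_m) * f_rest[i] - fs[i]) for i in range(n)
)

def is_unimodal(values, tol=1e-12):
    """True if values rise (weakly) to their maximum and then fall (weakly)."""
    peak = max(range(len(values)), key=lambda i: values[i])
    rising = all(values[i] <= values[i + 1] + tol for i in range(peak))
    falling = all(values[i] >= values[i + 1] - tol
                  for i in range(peak, len(values) - 1))
    return rising and falling

print(p_m, area_f_m, area_rest, max_identity_error)
print(is_unimodal(fs), is_unimodal(f_m), is_unimodal(f_rest))
```

In this example the remainder is already unimodal after a single step, because the illustrative $f$ has only two humps; a PDF with more humps would need the step repeated.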

Iterate this procedure with $f_m^\prime$ (which as a linear combination of continuous functions is still a continuous function, enabling us to proceed as before), producing a sequence of modes $m=m_1, m_2, \ldots$ ; corresponding sequences of weights $p_1=p_m, p_2=p_{m_2}, \ldots$ ; and PDFs $f_1=f_m, f_2=f_{m_2}, \ldots.$ The limiting result exists because (a) the interval where $f_i$ is flattened includes a proper interval that had not been flattened in the preceding $i-1$ operations and (b) the real numbers cannot be decomposed into more than a countable number of such intervals. The limit cannot have any modes and therefore is constant, which must be zero (for otherwise its integral would diverge). Consequently, $f$ has been expressed (perhaps not uniquely, because the order in which modes were selected will matter) as a mixture

$f(x) = \sum_i p_i f_i(x)$

of unimodal distributions, QED.

whuber

7

By unimodal, I think the OP plainly means that there is just one interior mode (i.e. excluding corner solutions). The question is thus really asking ...

$\text{why is it that brand name distributions do NOT have more than one interior mode?}$

i.e. why do most brand name distributions look something like this:

... plus or minus some skewness or some discontinuities? When the question is posed thus, the Beta distribution would not be a valid counter example.

It appears the OP's conjecture has some validity: most common brand name distributions do not allow for more than one interior mode. There may be theoretical reasons for this. For example, any distribution that is a member of the Pearson family (which includes the Beta) will necessarily be (interior) unimodal, as a consequence of the parent differential eqn that defines the entire family. And the Pearson family nests most of the best-known brand names.
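For reference, here is a sketch of why the Pearson family behaves this way, using one common way of writing the Pearson differential equation (the symbols $a, c_0, c_1, c_2$ are notation assumed here):

$\frac{f^\prime(x)}{f(x)} = -\frac{x + a}{c_0 + c_1 x + c_2 x^2}$

Wherever $f(x)\gt 0$ and the denominator is nonzero, $f^\prime(x)=0$ forces $x=-a$. The density therefore has at most one interior stationary point, and hence at most one interior mode.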

Nevertheless, here are some brand name counter examples ...

Counter example

One brand-name counter-example is the $\text{Sinc}^2$ distribution with pdf:

$f(x)=\frac{\sin ^2(x)}{\pi x^2}$

defined on the real line. Here is a plot of the $\text{Sinc}^2$ pdf:
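A quick numerical check, offered as a sketch and not as a substitute for the plot, locates the interior local maxima of this pdf on $[-10, 10]$ and confirms that the evaluated density is never negative. The function name `sinc2_pdf`, the grid, and the interval are choices made here for illustration.

```python
import math

def sinc2_pdf(x):
    """pdf sin(x)^2 / (pi x^2), with the limiting value 1/pi at x = 0."""
    if x == 0.0:
        return 1.0 / math.pi
    return math.sin(x) ** 2 / (math.pi * x * x)

# Evenly spaced grid on [-10, 10] with step 0.001.
xs = [i / 1000.0 for i in range(-10000, 10001)]
fs = [sinc2_pdf(x) for x in xs]

# Interior grid points that are higher than their left neighbour and
# at least as high as their right neighbour.
modes = [xs[i] for i in range(1, len(xs) - 1)
         if fs[i] > fs[i - 1] and fs[i] >= fs[i + 1]]

print(modes)
print(min(fs) >= 0.0)
```

The local maxima lie at $x=0$ and at the nonzero solutions of $\tan x = x$, so extending the interval adds further (ever smaller) interior modes.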

We could also perhaps add the family of cardioid distributions and distributions related to this class ... with pdf plots such as:

The family of reflected brand name distributions would also perhaps be possible brand name contenders (though, these might be thought of as a 'cheat solution' ... but they are still brand names) such as the Reflected Weibull shown here:

wolfies

1

My, that plot of $\text{Sinc}^2$ sure looks like it has some negative values! (Could that be a plotting artifact?) ... And the cardioid distributions look like they have only one interior mode each.

whuber

1

Hi @whuber ... must agree re the plotting artefact (I will take that up on Mathematica SE!). Re cardioid family: the idea is that one can extend the domain of such families as one pleases, and like a sine wave, it keeps on giving :)

wolfies

1

(+1) It is a strange artifact: your last plot (of reflected distributions) does not seem to exhibit it. You might trace the generation of plot points in the $\text{Sinc}^2$ plot to see where they lie; I suspect the slight negative values might be an overshooting of a spline of a small number of points.

whuber

I think it is just because the plotted line is thicker than the axis line, so appears to 'overshoot' the axis when close to zero. If the line is plotted thinner, the artefact disappears.

wolfies

But there is no such artifact in your bottom figure, which also has lines thicker than the axis.

whuber

3

That you mightn't think of any doesn't mean there aren't any.

I can name "known" distributions that aren't unimodal.

For example, a Beta distribution with $\alpha$ and $\beta$ both $<1$ .

http://en.wikipedia.org/wiki/Beta_distribution

also see

http://en.wikipedia.org/wiki/U-quadratic_distribution

(This isn't a special case of the beta distribution, in spite of the comment that says it is. The two families have some overlap, however.)

Mixture distributions are certainly known, and many of those are multimodal.

Glen_b -Reinstate Monica

The U-quadratic is a truncated Beta distribution.

becko

1

The alpha-skew-normal distribution (Elal-Olivero 2010) has a PDF:

$\frac{\left(1-\alpha\frac{x-\mu}{\sigma}\right)^2+1}{2+\alpha^2} \varphi\left(\frac{x-\mu}{\sigma}\right),$

where $\varphi$ is the PDF of a standard Gaussian.

For $|\alpha|>1.34$ the distribution is bimodal. Example plot for $\mu=1,\sigma=0.5,\alpha=2$ :
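The bimodality can be checked numerically with a short sketch that counts interior local maxima on a grid. The names `asn_pdf` and `count_modes`, the grid, and the comparison value $\alpha=1$ are choices made here for illustration. The code also divides by $\sigma$ (an addition to the formula as written above) so that the density integrates to one in $x$; this rescaling does not move the modes.

```python
import math

def asn_pdf(x, alpha, mu=0.0, sigma=1.0):
    """Alpha-skew-normal density in x, including the 1/sigma scale factor."""
    z = (x - mu) / sigma
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return ((1.0 - alpha * z) ** 2 + 1.0) / (2.0 + alpha ** 2) * phi / sigma

def count_modes(alpha, mu=1.0, sigma=0.5):
    """Count interior local maxima on a grid over [-4, 6] with step 0.001."""
    xs = [i / 1000.0 for i in range(-4000, 6001)]
    fs = [asn_pdf(x, alpha, mu, sigma) for x in xs]
    return sum(1 for i in range(1, len(fs) - 1)
               if fs[i] > fs[i - 1] and fs[i] >= fs[i + 1])

modes_alpha_1 = count_modes(1.0)   # below the 1.34 threshold
modes_alpha_2 = count_modes(2.0)   # above the 1.34 threshold

print(modes_alpha_1, modes_alpha_2)
```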

corey979
