Explanation for non-integer degrees of freedom in the t-test with unequal variances

15

The SPSS t-test procedure reports 2 analyses when comparing 2 independent means: one with equal variances assumed and one with equal variances not assumed. The degrees of freedom (df) when equal variances are assumed are always integer values (and equal n-2). The df when equal variances are not assumed are non-integer (e.g., 11.467) and nowhere near n-2. I am seeking an explanation of the logic and method used to calculate these non-integer df.

Joel W.
3
A University of Florida PowerPoint presentation contains a good account of how this approximation to the sampling distribution of the Student t statistic is derived for the case of unequal variances.
whuber
Is Welch's t-test always more accurate? Is there a downside to using the Welch approach?
Joel W.
If the Welch and the original t-test yield dramatically different p-values, which should I go with? What if the p-value for the difference in variances is only .06, but the p-values of the two t-tests are .000 and .121? (This occurred when one group of 2 had no variance and the other group of 25 had a variance of 70,000.)
Joel W.
2
Don't choose between them on the basis of the $p$-value. Unless you have a good reason (before even seeing the data) to assume equal variance, simply don't make that assumption.
Glen_b -Reinstate Monica
1
All of these questions are about when to use the Welch test. That question has been posted at stats.stackexchange.com/questions/116610/…
Joel W.

Answers:

11

The Welch-Satterthwaite d.f. can be shown to be a scaled weighted harmonic mean of the two degrees of freedom, with weights proportional to the squared estimated variances of the corresponding sample means.

The original expression reads:

$$\nu_W = \frac{\left(\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}\right)^2}{\frac{s_1^4}{n_1^2\nu_1}+\frac{s_2^4}{n_2^2\nu_2}}$$
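The expression above can be evaluated directly from summary statistics. The following is a minimal sketch, not SPSS's actual code: the function name `welch_satterthwaite_df` is made up for illustration, and it assumes $\nu_i = n_i - 1$ (the usual per-group d.f., which the post does not state explicitly). The example numbers are the ones quoted in a comment further down the page (group sizes 22 and 104, standard deviations 10.5 and 8.1).

```python
def welch_satterthwaite_df(s1, n1, s2, n2):
    """Welch-Satterthwaite d.f. from sample SDs (s1, s2) and sizes (n1, n2)."""
    v1 = s1**2 / n1          # estimated variance of the first sample mean
    v2 = s2**2 / n2          # estimated variance of the second sample mean
    nu1, nu2 = n1 - 1, n2 - 1  # assumption: per-group d.f. are n_i - 1
    return (v1 + v2) ** 2 / (v1**2 / nu1 + v2**2 / nu2)


# Summary statistics quoted in the comments (SDs are rounded there).
df = welch_satterthwaite_df(10.5, 22, 8.1, 104)
print(f"Welch-Satterthwaite d.f.: {df:.3f}")
```

With these rounded standard deviations the result is about 26.5, in the neighbourhood of the 26.608 that SPSS reported; the small gap is plausibly just rounding in the quoted SDs.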

Note that $r_i = s_i^2/n_i$ is the estimated variance of the $i^\text{th}$ sample mean, or the square of the $i$-th standard error of the mean. Let $r = r_1/r_2$ (the ratio of the estimated variances of the sample means), so

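$$\nu_W = \frac{(r_1+r_2)^2}{\frac{r_1^2}{\nu_1}+\frac{r_2^2}{\nu_2}} = \frac{(r_1+r_2)^2}{r_1^2+r_2^2}\cdot\frac{r_1^2+r_2^2}{\frac{r_1^2}{\nu_1}+\frac{r_2^2}{\nu_2}} = \frac{(r+1)^2}{r^2+1}\cdot\frac{r_1^2+r_2^2}{\frac{r_1^2}{\nu_1}+\frac{r_2^2}{\nu_2}}$$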

The first factor is $1+\operatorname{sech}(\log(r))$, which increases from 1 at $r=0$ to 2 at $r=1$ and then decreases to 1 at $r=\infty$; it is symmetric in $\log r$.

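The second factor is a weighted harmonic mean:

$$H(\underline{x}) = \frac{\sum_{i=1}^n w_i}{\sum_{i=1}^n \frac{w_i}{x_i}}$$

of the two d.f. ($x_i = \nu_i$), with weights $w_i = r_i^2$.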

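Which is to say, when $r_1/r_2$ is very large, $\nu_W$ converges to $\nu_1$. When $r_1/r_2$ is very close to 0, it converges to $\nu_2$. When $r_1=r_2$ you get twice the harmonic mean of the two d.f., and when $s_1^2=s_2^2$ with equal sample sizes you get the usual equal-variance t-test d.f., $\nu_1+\nu_2$, which is also the maximum possible value for $\nu_W$ (attained in general when $r_1/r_2=\nu_1/\nu_2$).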
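The factorisation and the limiting cases can be checked numerically. This is only an illustrative sketch: the function names are hypothetical, the d.f. values 21 and 103 are arbitrary choices, and $r_2$ is fixed at 1 so that $r_1$ plays the role of the ratio $r$.

```python
import math


def welch_df_direct(r1, r2, nu1, nu2):
    """nu_W computed straight from the estimated variances of the means."""
    return (r1 + r2) ** 2 / (r1**2 / nu1 + r2**2 / nu2)


def welch_df_factored(r1, r2, nu1, nu2):
    """nu_W as (1 + sech(log r)) times a weighted harmonic mean of the d.f."""
    r = r1 / r2
    first = 1 + 1 / math.cosh(math.log(r))        # lies between 1 and 2
    w1, w2 = r1**2, r2**2                         # weights w_i = r_i^2
    harmonic = (w1 + w2) / (w1 / nu1 + w2 / nu2)  # weighted harmonic mean
    return first * harmonic


nu1, nu2 = 21, 103  # arbitrary illustrative d.f.
for r1 in (1e-4, nu1 / nu2, 1.0, 1e4):
    direct = welch_df_direct(r1, 1.0, nu1, nu2)
    factored = welch_df_factored(r1, 1.0, nu1, nu2)
    print(f"r = {r1:10.4g}  direct = {direct:8.3f}  factored = {factored:8.3f}")
```

The two columns agree for every ratio tried. For a tiny ratio the value sits near $\nu_2$, for a huge ratio near $\nu_1$, and at $r=\nu_1/\nu_2$ it reaches $\nu_1+\nu_2$.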

--

With an equal-variance t-test, if the assumptions hold, the square of the denominator is a constant times a chi-square random variate.

The square of the denominator of the Welch t-test isn't (a constant times) a chi-square; however, it's often not too bad an approximation. A relevant discussion can be found here.

A more textbook-style derivation can be found here.

Glen_b -Reinstate Monica
1
Great insight about the harmonic mean, which is more appropriate than the arithmetic mean for averaging ratios.
Felipe G. Nievinski
10

What you are referring to is the Welch-Satterthwaite correction to the degrees of freedom. The t-test with the WS correction applied is often called Welch's t-test. (Incidentally, this has nothing to do with SPSS; all statistical software can conduct Welch's t-test, it just doesn't usually report both versions side by side by default, so you wouldn't necessarily be prompted to think about the issue.) The equation for the correction is very ugly, but can be seen on the Wikipedia page; unless you are very math savvy or a glutton for punishment, I don't recommend trying to work through it to understand the idea. From a loose conceptual standpoint, however, the idea is relatively straightforward: the regular t-test assumes the variances are equal in the two groups. If they're not, then the test should not benefit from that assumption. Since the power of the t-test can be seen as a function of the residual degrees of freedom, one way to adjust for this is to 'shrink' the df somewhat. The appropriate df must be somewhere between the full df and the df of the smaller group. (As @Glen_b notes below, it depends on the relative sizes of $s_1^2/n_1$ vs $s_2^2/n_2$; if the larger $n$ is associated with a sufficiently smaller variance, the combined df can be lower than the larger of the two df.) The WS correction finds the right proportion of the way from the former to the latter to adjust the df. Then the test statistic is assessed against a t-distribution with that df.
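To make the contrast concrete, here is a rough sketch in plain Python that computes the test statistic and df both ways. The two samples are made up purely for illustration, and the helper names `pooled_t` and `welch_t` are hypothetical. No p-value is computed, since the standard library has no t-distribution; in practice a routine such as SciPy's `scipy.stats.ttest_ind` with `equal_var=False` would do the whole Welch test.

```python
import math
from statistics import mean, variance


def pooled_t(x, y):
    """Equal-variance t statistic and its integer df (n1 + n2 - 2)."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(x, y):
    """Welch t statistic and the Welch-Satterthwaite df."""
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x) / n1, variance(y) / n2  # variances of the two means
    t = (mean(x) - mean(y)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Made-up data: a small, noisy group and a larger, tight group.
a = [12.1, 15.3, 9.8, 20.4, 5.2]
b = [10.0, 10.4, 9.7, 10.1, 10.3, 9.9, 10.2, 10.0, 9.8, 10.1]

t_pooled, df_pooled = pooled_t(a, b)
t_welch, df_welch = welch_t(a, b)
print(f"pooled: t = {t_pooled:.3f}, df = {df_pooled}")
print(f"Welch:  t = {t_welch:.3f}, df = {df_welch:.3f}")
```

Because the small group carries almost all of the variance here, the Welch df lands just above that group's own df (4) rather than near the pooled value of 13.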

gung - Reinstate Monica
For one t-test, SPSS reports the df as 26.608 but the n's for the two groups are 22 and 104. Are you sure about "The appropriate df must be somewhere between the full df and the df of the larger group"? (The standard deviations are 10.5 and 8.1 for the smaller and larger groups, respectively.)
Joel W.
2
It depends on the relative sizes of $s_1^2/n_1$ vs $s_2^2/n_2$. If the larger $n$ is associated with a sufficiently smaller variance, the combined d.f. can be lower than the larger of the two d.f. Note that the Welch t-test is only approximate, since the squared denominator is not actually a (scaled) chi-square random variate. However, in practice it does quite well.
Glen_b -Reinstate Monica
I think I'll expand on the relationship between the relative sizes of the $s_i^2/n_i$ and the Welch d.f. in an answer (since it won't fit in a comment).
Glen_b -Reinstate Monica
1
@Glen_b, I'm sure that will be of great value here.
gung - Reinstate Monica