Comparing the variance of paired observations

16

I have $N$ paired observations $(X_i, Y_i)$ drawn from a common unknown distribution that has finite first and second moments and is symmetric around the mean.

Let $\sigma_X$ be the standard deviation of $X$ (unconditional on $Y$), and likewise $\sigma_Y$ for $Y$. I would like to test the hypotheses

$$H_0: \sigma_X = \sigma_Y$$

$$H_1: \sigma_X \neq \sigma_Y$$

Does anyone know of such a test? As a first analysis I can assume that the distribution is normal, although the general case is more interesting. I am looking for a closed-form solution. Bootstrap is always a last resort.

gappy
3
I'm not sure why the information that the observations are paired is important to the hypothesis being tested; could you explain?
russellpierce
1
@drknexus It is important because the dependence makes the calibration of the Fisher test difficult.
robin girard

Answers:

4

You could use the fact that the sample variance, suitably scaled, follows a chi-squared distribution. Under your null hypothesis, your test statistic would be the difference of two chi-squared random variates scaled by the same unknown true variance. I do not know whether the difference of two chi-squared random variates is an identifiable distribution, but the above may help you to some extent.
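To spell out the fact being used: for a Gaussian sample of size $n$,

$$\frac{(n-1)\,\hat\sigma_X^2}{\sigma_X^2} \sim \chi^2_{n-1},$$

so under $H_0: \sigma_X = \sigma_Y = \sigma$ the statistic $(n-1)(\hat\sigma_X^2 - \hat\sigma_Y^2)/\sigma^2$ is the difference of two $\chi^2_{n-1}$ variates, which here are dependent because of the pairing.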


3
@svadali it is more usual to use a ratio here, since the distribution of a ratio of chi-squared variates is tabulated (Fisher's F). However, the problematic part of the question (i.e. the dependence between X and Y) is still there whichever you use. It is not straightforward to build a test from two dependent chi-squared variates... I tried to give an answer with a solution on that point (see below).
robin girard
7

If you want to go down the non-parametric route you could always try the squared ranks test.

For the unpaired case, the assumptions for this test (taken from here) are:

  1. Both samples are random samples from their respective populations.
  2. In addition to independence within each sample there is mutual independence between the two samples.
  3. The measurement scale is at least interval.

These lecture notes describe the unpaired case in detail.

For the paired case you will have to change this procedure slightly. Midway down this page should give you an idea of where to start.
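For the unpaired case, here is a minimal base-R sketch of the squared ranks test as I read it from those references (my own assembly; the normal approximation below treats the statistic as a sum of squared ranks drawn without replacement, which avoids having to quote the closed-form moments):

```r
# Squared ranks test of equal variances (unpaired case, normal approximation)
squared_ranks_test <- function(x, y) {
  u <- abs(x - mean(x))              # absolute deviations from the sample means
  v <- abs(y - mean(y))
  r2 <- rank(c(u, v))^2              # squared ranks of the combined sample
  n <- length(x); N <- length(r2)
  T_stat <- sum(r2[seq_len(n)])      # sum of squared ranks of the first sample
  # Under H0, T is a sum of n values drawn without replacement from r2
  mu   <- n * mean(r2)
  sig2 <- n * (N - n) / (N - 1) * mean((r2 - mean(r2))^2)
  z <- (T_stat - mu) / sqrt(sig2)
  2 * pnorm(-abs(z))                 # two-sided p-value
}
```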

csgillespie
6

The most naive approach I can think of is to regress $Y_i$ on $X_i$ as $Y_i \approx \hat{m} X_i + \hat{b}$, then perform a t-test on the hypothesis $m = 1$. See t-test for regression slope.
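A base-R sketch of that check (placeholder data; note the t-test is against slope $1$, not the slope $0$ that summary.lm reports):

```r
# t-test of H0: m = 1 in the regression Y = m * X + b
set.seed(1)
x <- rnorm(100); y <- x + rnorm(100)           # placeholder paired samples
fit <- lm(y ~ x)
b1  <- coef(fit)["x"]                          # estimated slope m-hat
se  <- coef(summary(fit))["x", "Std. Error"]
t_stat <- (b1 - 1) / se
p_val  <- 2 * pt(-abs(t_stat), df = df.residual(fit))
```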

A less naive approach is the Morgan-Pitman test. Let $U_i = X_i - Y_i$ and $V_i = X_i + Y_i$; since $\mathrm{Cov}(U, V) = \mathrm{Var}(X) - \mathrm{Var}(Y)$, testing equality of variances reduces to a test that the Pearson correlation coefficient of $U_i$ and $V_i$ is zero. (One can do this simply using the Fisher r-to-z transform, which gives confidence intervals around the sample Pearson coefficient, or via a bootstrap.)
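Continuing with the x and y above, the Morgan-Pitman test is a few lines in base R (cor.test's default t-based test of zero Pearson correlation plays the role of the Fisher r-to-z step):

```r
# Morgan-Pitman: Cov(X - Y, X + Y) = Var(X) - Var(Y), so equal variances
# is equivalent to zero correlation between U and V
u <- x - y
v <- x + y
cor.test(u, v)    # t-test of zero Pearson correlation between U and V
```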

If you are using R, and don't want to have to code everything yourself, I would use bootdpci from Wilcox' Robust Stats package, WRS. (see Wilcox' page.)

shabbychef
4

If you can assume bivariate normality, then you can develop a likelihood-ratio test comparing the two possible covariance-matrix structures. The unconstrained ($H_a$) maximum-likelihood estimate is well known - it is just the sample covariance matrix; the constrained one ($H_0$) can be derived by writing out the likelihood (and will probably be some sort of "pooled" estimate).
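Writing out that likelihood, here is a sketch of how it can work out (my own derivation, so worth double-checking): the constrained MLE pools the two variances, $\hat\sigma^2 = (S_{11}+S_{22})/2$, keeps $\hat\sigma^2\hat\rho = S_{12}$, and the likelihood ratio reduces to a comparison of determinants:

```r
# Likelihood-ratio test of equal variances for bivariate normal pairs
lrt_equal_var <- function(x, y) {
  n <- length(x)
  S <- cov(cbind(x, y)) * (n - 1) / n      # ML estimate of the covariance matrix
  det_a <- S[1, 1] * S[2, 2] - S[1, 2]^2   # unconstrained: det of S
  s2    <- (S[1, 1] + S[2, 2]) / 2         # pooled variance under H0
  det_0 <- s2^2 - S[1, 2]^2                # constrained: det of pooled matrix
  lrt   <- n * (log(det_0) - log(det_a))   # -2 log likelihood ratio
  pchisq(lrt, df = 1, lower.tail = FALSE)  # asymptotic chi-squared with 1 df
}
```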

If you don't want to derive the formulas, you can use SAS or R to fit a repeated measures model with unstructured and compound symmetry covariance structures and compare the likelihoods.

Aniko
3

The difficulty clearly comes from the fact that $X$ and $Y$ are correlated (I assume $(X, Y)$ is jointly Gaussian, as Aniko does), so you can't take a difference (as in @svadali's answer) or a ratio (as in the standard Fisher-Snedecor F-test): those would follow dependent $\chi^2$ distributions, and since you don't know what this dependence is, it is difficult to derive the distribution under $H_0$.

My answer relies on Equation (1) below. Because the difference in variances can be factorized as a difference of eigenvalues times a term in the rotation angle, the test of equality can be split into two tests. I show that it is possible to use the Fisher-Snedecor test together with a test on the slope, such as the one suggested by @shabbychef, thanks to a simple property of 2D Gaussian vectors.

Fisher-Snedecor test: If for $i = 1, 2$, $(Z^i_1, \dots, Z^i_{n_i})$ are iid Gaussian random variables with empirical unbiased variance $\hat{\lambda}_i^2$ and true variance $\lambda_i^2$, then it is possible to test whether $\lambda_1 = \lambda_2$ using the fact that, under the null,

$$R = \frac{\hat{\lambda}_1^2}{\hat{\lambda}_2^2}$$

follows a Fisher-Snedecor distribution $F(n_1 - 1, n_2 - 1)$.
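(In base R this is the one-line var.test; z1 and z2 below are placeholders for the two independent samples:)

```r
# Fisher-Snedecor F test of equal variances, independent samples
set.seed(1)
z1 <- rnorm(50); z2 <- rnorm(60)   # placeholder independent Gaussian samples
var.test(z1, z2)                   # ratio of unbiased variances vs. F(49, 59)
```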

A simple property of 2D Gaussian vectors: Let us denote by

$$R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

the rotation matrix of angle $\theta$. It is clear that there exist $\lambda_1, \lambda_2 > 0$ and $\epsilon_1, \epsilon_2$, two independent Gaussian $N(0, \lambda_i^2)$ variables, such that

$$\begin{bmatrix} X \\ Y \end{bmatrix} = R(\theta) \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \end{bmatrix}$$

and we have

$$\mathrm{Var}(X) - \mathrm{Var}(Y) = (\lambda_1^2 - \lambda_2^2)(\cos^2\theta - \sin^2\theta). \tag{1}$$

Hence, testing whether $\mathrm{Var}(X) = \mathrm{Var}(Y)$ can be done by testing whether $\lambda_1^2 = \lambda_2^2$ or $\theta = \pi/4 \pmod{\pi/2}$.

Conclusion (answer to the question): Testing for $\lambda_1^2 = \lambda_2^2$ is easily done with PCA (to decorrelate) and the Fisher-Snedecor test. Testing $\theta = \pi/4 \pmod{\pi/2}$ is done by testing whether $|\beta_1| = 1$ in the linear regression $Y = \beta_1 X + \sigma\epsilon$ (I assume $Y$ and $X$ are centered).

Testing whether ($\lambda_1^2 = \lambda_2^2$ or $\theta = \pi/4 \pmod{\pi/2}$) at level $\alpha$ is done by testing whether $\lambda_1^2 = \lambda_2^2$ at level $\alpha/3$ or whether $|\beta_1| = 1$ at level $\alpha/3$.
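A rough base-R assembly of this recipe (my own illustrative sketch: the decision rule at the end is my reading of the preceding paragraph, and ordering the PCA scores by variance makes the F step only approximate):

```r
# Two-part test of Var(X) = Var(Y) for centered, jointly Gaussian pairs
set.seed(1)
x <- rnorm(200); y <- 0.5 * x + rnorm(200)     # placeholder paired samples
alpha <- 0.05

# Part 1: PCA to decorrelate, then Fisher-Snedecor test on the scores
pc <- prcomp(cbind(x, y))
p1 <- var.test(pc$x[, 1], pc$x[, 2])$p.value

# Part 2: test |beta1| = 1 in the centered regression y = beta1 * x + noise
fit <- lm(y ~ x - 1)
b1  <- coef(fit)[1]
se  <- coef(summary(fit))[1, "Std. Error"]
p2  <- 2 * pt(-abs((b1 - sign(b1)) / se), df = df.residual(fit))

# Reject Var(X) = Var(Y) only when both component nulls are rejected
reject <- (p1 < alpha / 3) && (p2 < alpha / 3)
```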

robin girard