Conditional expectation of R-squared

Consider the simple linear model:

$\pmb{y}=X'\pmb{\beta}+\epsilon$

where $\epsilon_i\sim\mathrm{i.i.d.}\;\mathcal{N}(0,\sigma^2)$, $X\in\mathbb{R}^{n\times p}$ with $p\geq2$, and $X$ contains a column of constants.

My question is: given $\mathrm{E}(X'X)$, $\beta$ and $\sigma$, is there a formula for a nontrivial upper bound on $\mathrm{E}(R^2)$*? (assuming the model was estimated by OLS).

* I assumed, when writing this, that obtaining $E(R^2)$ itself would not be possible.

EDIT1

Using the solution derived by Stéphane Laurent (see below), we can get a nontrivial upper bound on $E(R^2)$. Some numerical simulations (below) show that this bound is quite tight.

Stéphane Laurent derived the following: $R^2\sim\mathrm{B}(p-1,n-p,\lambda)$ where $\mathrm{B}(p-1,n-p,\lambda)$ is a noncentral Beta distribution with noncentrality parameter $\lambda$, with

$\lambda=\frac{||X'\beta-\mathrm{E}(X)'\beta1_n||^2}{\sigma^2}$

So

$\mathrm{E}(R^2)=\mathrm{E}\left(\frac{\chi^2_{p-1}(\lambda)}{\chi^2_{p-1}(\lambda)+\chi^2_{n-p}}\right)\leq\frac{\mathrm{E}\left(\chi^2_{p-1}(\lambda)\right)}{\mathrm{E}\left(\chi^2_{p-1}(\lambda)\right)+\mathrm{E}\left(\chi^2_{n-p}\right)}$

where $\chi^2_{k}(\lambda)$ is a noncentral $\chi^2$ with noncentrality parameter $\lambda$ and $k$ degrees of freedom. So a nontrivial upper bound for $\mathrm{E}(R^2)$ is

$\frac{\lambda+p-1}{\lambda+n-1}$

It is very tight (much tighter than I had expected would be possible):

for instance, using:

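    rho <- 0.75                       # common correlation between regressors
    p   <- 10                         # number of columns of X (constant included)
    n   <- 25 * p                     # sample size
    Su  <- matrix(rho, p - 1, p - 1)  # presumably the covariance of the p - 1 non-constant regressors
    diag(Su) <- 1
    su  <- 1                          # presumably the error standard deviation sigma
    set.seed(123)
    bet <- runif(p)                   # coefficient vector beta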

the mean of $R^2$ over 1000 simulations is 0.960819. The theoretical upper bound above gives 0.9609081. The bound seems to be equally precise across many values of $R^2$. Truly astounding!
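The post does not show how the design matrix or the responses were generated, so the following is only a sketch of one way such a check could be run. It assumes that the non-constant regressors are drawn once as $\mathcal{N}(0,\mathtt{Su})$ and then held fixed, and that $\lambda$ is computed from the centered vector $X\beta$; the names `Z`, `X`, `mu`, `lambda`, `bound` and `r2` are introduced here for illustration, and the numbers it prints need not match those quoted above.

```r
# Setup copied from the question
rho <- 0.75; p <- 10; n <- 25 * p
Su <- matrix(rho, p - 1, p - 1); diag(Su) <- 1
su <- 1
set.seed(123)
bet <- runif(p)

# Assumption: regressors drawn once as N(0, Su), then treated as fixed
Z <- matrix(rnorm(n * (p - 1)), n, p - 1) %*% chol(Su)
X <- cbind(1, Z)                       # first column is the constant
mu <- drop(X %*% bet)                  # mean vector X beta

# Noncentrality parameter and the upper bound (lambda + p - 1) / (lambda + n - 1)
lambda <- sum((mu - mean(mu))^2) / su^2
bound <- (lambda + p - 1) / (lambda + n - 1)

# Simulate y, fit by OLS, and record R^2, 1000 times
r2 <- replicate(1000, {
  y <- mu + su * rnorm(n)
  rss <- sum(lm.fit(X, y)$residuals^2)
  1 - rss / sum((y - mean(y))^2)
})

c(mean_R2 = mean(r2), bound = bound)
```

Using `lm.fit` rather than `lm` keeps the loop fast, since only the residuals are needed.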

EDIT2:

After further investigation, it appears that the quality of the upper bound as an approximation to $E(R^2)$ improves as $\lambda+p$ increases (and, all else equal, $\lambda$ increases with $n$).
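One way to look at this without simulation noise is to compute $E(R^2)$ numerically. The sketch below assumes the standard Poisson-mixture representation of the noncentral $\chi^2$: with $J\sim\mathrm{Poisson}(\lambda/2)$, the conditional mean of $R^2$ given $J$ is $(p-1+2J)/(n-1+2J)$. The function names `expected_r2` and `upper_bound`, the truncation rule, and the chosen values of $\lambda$ are all illustrative assumptions rather than anything from the thread.

```r
# E(R^2) as a Poisson-weighted average of central Beta means,
# truncated where the remaining Poisson mass is negligible.
expected_r2 <- function(lambda, n, p) {
  half <- lambda / 2
  jmax <- ceiling(half + 12 * sqrt(half + 1) + 50)  # heuristic truncation point
  j <- 0:jmax
  sum(dpois(j, half) * (p - 1 + 2 * j) / (n - 1 + 2 * j))
}

# The upper bound from the question
upper_bound <- function(lambda, n, p) (lambda + p - 1) / (lambda + n - 1)

n <- 250; p <- 10
lambdas <- c(100, 1000, 10000)
gap <- sapply(lambdas, function(l) upper_bound(l, n, p) - expected_r2(l, n, p))
data.frame(lambda = lambdas, gap = gap)
```

For these particular values the gap between the bound and $E(R^2)$ shrinks as $\lambda$ grows; at $\lambda=0$ the two coincide at $(p-1)/(n-1)$.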

linear-model expected-value user603

$R^2$ has a Beta distribution with parameters depending only on $n$ and $p$. No?

Stéphane Laurent

Oooppss sorry, my previous claim is true only under the hypothesis of the "null model" (intercept only). Otherwise the distribution of $R^2$ should be something like a noncentral Beta distribution, with a noncentrality parameter involving the unknown parameters.

Stéphane Laurent

@StéphaneLaurent: thanks. Would you know more about the relationship between the unknown parameters and the parameters of the Beta? I'm stuck, so any pointers would be welcome...

user603

Is it absolutely necessary to deal with $E[R^2]$? Perhaps there is a simple exact formula for $E[R^2/(1-R^2)]$.

Stéphane Laurent

With the notations of my answer, $R^2/(1-R^2) = k F$ for some scalar $k$, and the first moment of the noncentral $F$ distribution is simple.

Stéphane Laurent
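To spell out what that comment suggests, here is a sketch under two assumptions: the noncentral $\chi^2$ convention in which $\mathrm{E}\,\chi^2_k(\lambda)=k+\lambda$, and $n-p>2$ so that the mean exists. With $F\sim F_{p-1,\,n-p}(\lambda)$ the scalar is $k=(p-1)/(n-p)$, and the standard mean of the noncentral $F$ gives

$\mathrm{E}\left[\frac{R^2}{1-R^2}\right]=\frac{p-1}{n-p}\,\mathrm{E}[F]=\frac{p-1}{n-p}\cdot\frac{(n-p)(p-1+\lambda)}{(p-1)(n-p-2)}=\frac{\lambda+p-1}{n-p-2}$

The same value follows directly from the independence of numerator and denominator, since $\mathrm{E}\left[\chi^2_{p-1}(\lambda)\right]=p-1+\lambda$ and $\mathrm{E}\left[1/\chi^2_{n-p}\right]=1/(n-p-2)$.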

Answers:

Any linear model can be written $\boxed{Y=\mu+\sigma G}$ where $G$ has the standard normal distribution on $\mathbb{R}^n$ and $\mu$ is assumed to belong to a linear subspace $W$ of $\mathbb{R}^n$. In your case $W=\text{Im}(X)$.

Let $[1] \subset W$ be the one-dimensional linear subspace generated by the vector $(1,1,\ldots,1)$. Taking $U=[1]$ below, the $R^2$ is closely related to the classical Fisher statistic

$F = \frac{{\Vert P_Z Y\Vert}^2/(m-\ell)}{{\Vert P_W^\perp Y\Vert}^2/(n-m)},$

for the hypothesis test of $H_0\colon\{\mu \in U\}$, where $U\subset W$ is a linear subspace, $Z=U^\perp \cap W$ denotes the orthogonal complement of $U$ in $W$, $m=\dim(W)$ and $\ell=\dim(U)$ (so $m=p$ and $\ell=1$ in your situation).

Indeed,

$\dfrac{{\Vert P_Z Y\Vert}^2}{{\Vert P_W^\perp Y\Vert}^2} = \frac{R^2}{1-R^2}$

because the definition of $R^2$ is

$R^2 = \frac{{\Vert P_Z Y\Vert}^2}{{\Vert P_U^\perp Y\Vert}^2}=1 - \frac{{\Vert P^\perp_W Y\Vert}^2}{{\Vert P_U^\perp Y\Vert}^2}.$
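These projection identities can be checked numerically. The sketch below uses a small made-up design and coefficient vector (all values are illustrative assumptions), builds the projectors explicitly with a hypothetical helper `proj`, and compares the result with the $R^2$ reported by `lm`.

```r
set.seed(1)
n <- 50; p <- 4
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))  # made-up design with a constant column
Y <- drop(X %*% c(1, 0.5, -0.3, 0.2)) + rnorm(n)     # made-up coefficients, sigma = 1

# Orthogonal projector onto the column space of A
proj <- function(A) A %*% solve(crossprod(A), t(A))

P_W <- proj(X)                 # projector onto W = Im(X)
P_U <- proj(matrix(1, n, 1))   # projector onto U = [1]
P_Z <- P_W - P_U               # projector onto Z, the orthogonal complement of U in W

num <- sum((P_Z %*% Y)^2)      # ||P_Z Y||^2
res <- sum((Y - P_W %*% Y)^2)  # ||P_W^perp Y||^2
tot <- sum((Y - P_U %*% Y)^2)  # ||P_U^perp Y||^2

r2_proj <- num / tot
r2_lm <- summary(lm(Y ~ X[, -1]))$r.squared  # lm adds its own intercept
c(projection = r2_proj, lm = r2_lm, ratio = num / res, odds = r2_proj / (1 - r2_proj))
```

The model is fitted as `Y ~ X[, -1]` so that `lm` supplies the intercept itself and therefore reports the usual centered $R^2$.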

Obviously $\boxed{P_Z Y = P_Z \mu + \sigma P_Z G}$ and $\boxed{P_W^\perp Y = \sigma P_W^\perp G}$ .

When $H_0\colon\{\mu \in U\}$ is true then $P_Z \mu = 0$ and therefore

$F = \frac{{\Vert P_Z G\Vert}^2/(m-\ell)}{{\Vert P_W^\perp G\Vert}^2/(n-m)} \sim F_{m-\ell,n-m}$

has the Fisher $F_{m-\ell,n-m}$ distribution. Consequently, from the classical relation between the Fisher distribution and the Beta distribution, $R^2 \sim {\cal B}(m-\ell, n-m)$.
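A quick simulation can illustrate the null case. The sketch below assumes that ${\cal B}(m-\ell, n-m)$ is indexed by degrees of freedom, so that in the usual shape parameterization (the one used by R's `dbeta`) the shapes are $(m-\ell)/2$ and $(n-m)/2$. The design, the intercept value 2 and the number of replications are arbitrary choices made for illustration.

```r
set.seed(42)
n <- 30; p <- 4
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))  # made-up fixed design

# R^2 from OLS fits when the true model is intercept-only (H0 holds)
r2 <- replicate(5000, {
  y <- 2 + rnorm(n)
  rss <- sum(lm.fit(X, y)$residuals^2)
  1 - rss / sum((y - mean(y))^2)
})

# Moments of Beta(a, b) with a = (p - 1) / 2 and b = (n - p) / 2
a <- (p - 1) / 2; b <- (n - p) / 2
beta_mean <- a / (a + b)
beta_var <- a * b / ((a + b)^2 * (a + b + 1))

rbind(simulated = c(mean = mean(r2), var = var(r2)),
      beta      = c(mean = beta_mean, var = beta_var))
```

The mean alone would not distinguish the two parameterizations, since $a/(a+b)$ is unchanged when both shapes are doubled; the variance is what separates them.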

In the general situation we have to deal with $P_Z Y = P_Z \mu + \sigma P_Z G$ when $P_Z\mu \neq 0$ . In this general case one has ${\Vert P_Z Y\Vert}^2 \sim \sigma^2\chi^2_{m-\ell}(\lambda)$ , the noncentral $\chi^2$ distribution with $m-\ell$ degrees of freedom and noncentrality parameter $\boxed{\lambda=\frac{{\Vert P_Z \mu\Vert}^2}{\sigma^2}}$ , and then $\boxed{F \sim F_{m-\ell,n-m}(\lambda)}$ (noncentral Fisher distribution). This is the classical result used to compute power of $F$ -tests.

The classical relation between the Fisher distribution and the Beta distribution holds in the noncentral situation too. Finally $R^2$ has the noncentral beta distribution with "shape parameters" $m-\ell$ and $n-m$ and noncentrality parameter $\lambda$ . I think the moments are available in the literature but they are possibly highly complicated.
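For the first moment at least, a fairly compact expression seems to be available. The following is a sketch, assuming the convention in which $\mathrm{E}\,\chi^2_k(\lambda)=k+\lambda$ and the standard Poisson-mixture representation of the noncentral $\chi^2$ (given $J\sim\mathrm{Poisson}(\lambda/2)$, a $\chi^2_k(\lambda)$ variable is a central $\chi^2_{k+2J}$). The symbol $J$ is introduced here only for illustration:

$\mathrm{E}(R^2)=\mathrm{E}\left(\frac{m-\ell+2J}{n-\ell+2J}\right)=\sum_{j\geq0}e^{-\lambda/2}\,\frac{(\lambda/2)^j}{j!}\,\frac{m-\ell+2j}{n-\ell+2j}$

Since $j\mapsto\frac{m-\ell+2j}{n-\ell+2j}$ is concave for $j\geq0$, Jensen's inequality then gives $\mathrm{E}(R^2)\leq\frac{\lambda+m-\ell}{\lambda+n-\ell}$, which for $m=p$ and $\ell=1$ is the bound $\frac{\lambda+p-1}{\lambda+n-1}$ quoted in the question.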

Finally let us write down $P_Z\mu$ . Note that $P_Z = P_W - P_U$ . One has $P_U \mu = \bar\mu 1$ when $U=[1]$ , and $P_W \mu = \mu$ . Hence $P_Z \mu =\mu - \bar\mu 1$ where here $\mu=X\beta$ for the unknown parameters vector $\beta$ .

Stéphane Laurent

$P_Z x$ is the orthogonal projection of $x$ on the linear subspace $Z$. And $P^\perp$ denotes projection on the orthogonal complement.

Stéphane Laurent

Beware of $Px \neq \Vert P x \Vert^2$. I'm going to edit my post to write the formulas.

Stéphane Laurent

Done - do you see any simplification ?

Stéphane Laurent

$\bar \mu = \frac{1}{n} \sum \mu_i$

Stéphane Laurent

Type I, obviously: type II are distributed on $(0, \infty)$. Actually $R^2/(1-R^2)$ has the type II distribution. I have done the last corrections for today.

Stéphane Laurent