Por ejemplo, en R
, la MASS::mvrnorm()
función es útil para generar datos para demostrar varias cosas en las estadísticas. Toma un Sigma
argumento obligatorio que es una matriz simétrica que especifica la matriz de covarianza de las variables. ¿Cómo crearía un n × n simétrico matrix with arbitrary entries?
21
Respuestas:
Create ann×n matrix A with arbitrary values
and then useΣ=ATA as your covariance matrix.
For example
fuente
Sigma <- A + t(A)
.I like to have control over the objects I create, even when they might be arbitrary.
Consider, then, that all possiblen×n covariance matrices Σ can be expressed in the form
whereP is an orthogonal matrix and σ1≥σ2≥⋯≥σn≥0 .
Geometrically this describes a covariance structure with a range of principal components of sizesσi . These components point in the directions of the rows of P . See the figures at Making sense of principal component analysis, eigenvectors & eigenvalues for examples with n=3 . Setting the σi will set the magnitudes of the covariances and their relative sizes, thereby determining any desired ellipsoidal shape. The rows of P orient the axes of the shape as you prefer.
One algebraic and computing benefit of this approach is that whenσn>0 , Σ is readily inverted (which is a common operation on covariance matrices):
Don't care about the directions, but only about the ranges of sizes of the theσi ? That's fine: you can easily generate a random orthogonal matrix. Just wrap n2 iid standard Normal values into a square matrix and then orthogonalize it. It will almost surely work (provided n isn't huge). The QR decomposition will do that, as in this code
This works because then -variate multinormal distribution so generated is "elliptical": it is invariant under all rotations and reflections (through the origin). Thus, all orthogonal matrices are generated uniformly, as argued at How to generate uniformly distributed points on the surface of the 3-d unit sphere?.
A quick way to obtainΣ from P and the σi , once you have specified or created them, uses σ=(σ1,…,σ5)=(5,4,3,2,1) :
crossprod
and exploitsR
's re-use of arrays in arithmetic operations, as in this example withAs a check, the Singular Value decomposition should return bothσ and P′ . You may inspect it with the command
The inverse ofσ into a division:
Sigma
of course is obtained merely by changing the multiplication byYou may verify this by viewingn×n identity matrix. A generalized inverse (essential for regression calculations) is obtained by replacing any σi≠0 by 1/σi , exactly as above, but keeping any zeros among the σi as they were.
zapsmall(Sigma %*% Tau)
, which should be thefuente
svd(Sigma)
will be reordered -- that confused me for a minute.You can simulate random positive definite matrices from the Wishart distribution using the function "rWishart" from the widely used package "stats".
fuente
There is a package specifically for that,
clusterGeneration
(written among other by Harry Joe, a big name in that field).There are two main functions:
genPositiveDefMat
generate a covariance matrix, 4 different methodsrcorrmatrix
: generate a correlation matrixQuick example:
Created on 2019-10-27 by the reprex package (v0.3.0)
Finally, note that an alternative approach is to do a first try from scratch, then use
Matrix::nearPD()
to make your matrix positive-definite.fuente