svcm is an R-package intended for (relatively) easy specification of models combining structural equation models with sparse relationship matrices. The structural equations model involve unknown parameters that describe dependence across variables. The relationship matrix may for example encode genetic relatedness or different forms of shared environments and describe dependence across individuals. The interface is inspired by OpenMx, but uses the Matrix package for computation.
Consider a factor model for individual
The genetic components
Across variables and individuals, the covariance matrix implied by this model can be described as
Below is a simulation of data from the model.
# Genetic relationships
str1 <- matrix(c(1, 1/4, 1/4, 1/4, 1, 1/2, 1/4, 1/2, 1), 3, 3)
str2 <- matrix(c(1, 1/2, 1/8, 1/8, 1/2, 1, 1/8, 1/8, 1/8, 1/8, 1, 1/2, 1/8, 1/8, 1/2, 1), 4, 4)
# Genetic relationship matrix
A <- Matrix::bdiag(Matrix::bdiag(replicate(100, str1, simplify = FALSE)),
Matrix::bdiag(replicate(100, str2, simplify = FALSE)))
L <- chol(A)
N <- nrow(A)
a <- sqrt(2) * t(L) %*% rnorm(N)
e <- rnorm(N, 0, sqrt(2))
eta <- a + e
l <- c(1, 0.5, 0.5, 0.8)
b = c(2, 2, 4, 4)
del <- matrix(rnorm(N * 4), N, 4)
x <- rep(1, N)
Y <- x %*% t(b) + eta %*% t(l) + del
There are several ways to fit this model using svcm, but it will
likely be somewhat similar to the example below. pm()
is used to
define matrices of parameters, svc()
is used to define structured variance components, the terms
defining the covariance matrix. Notice that the first argument is an
expression defining the model implied covariance structure across
variables as a function of parameters. The second argument R =
is used
to supply a relationship matrix describing the dependence across
individuals. mc()
is used to define the mean structure. The first
argument is an expression involving parameters and the second argument
X =
gives a model matrix of covariates. There is no limit to how many
pm
, svc
and mc
objects that can be supplied, they will be added
together to form the total mean and covariance structure.
library(svcm)
# Environmental relationship matrix
I <- Matrix::Diagonal(N)
m1 <- svcm(
#Data
Y,
# Parameters
pm(nrow = 4, ncol = 1, labels = paste0("l", 1:4), free = c(F, T, T, T), values = c(1, 0.5, 0.5, 0.5), name = "l"),
pm(nrow = 1, ncol = 1, labels = "Sa1", free = T, values = 1, name = "Sa"),
pm(nrow = 1, ncol = 1, labels = "Se1", free = T, values = 1, name = "Se"),
pm(nrow = 4, ncol = 4, labels = sapply(1:4, function(x) paste0("th", 1:4, x)), free = diag(T, 4), values = diag(4), name = "TH"),
pm(nrow = 4, ncol = 1, labels = paste0("b", 1:4), free = T, values = 0, name = "b"),
# Variance components
svc(l %*% Sa %*% t(l), R = A),
svc(l %*% Se %*% t(l) + TH, R = I),
# Mean components
mc(b, X = x)
)
After the model is defined, it can be fitted using fit_svcm()
.
m1 <- fit_svcm(m1, se = TRUE, control = list(trace = 10))
iter: objective: l2 l3 l4 Sa1 Se1 th11 th22 th33 th44 b1 b2 b3 b4
0: 21278.520: 0.500000 0.500000 0.500000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 0.00000 0.00000 0.00000 0.00000
10: 10162.318: 0.792366 0.812250 1.20572 2.33724 1.36836 2.00308 0.911362 0.980087 0.601323 -0.140794 0.348083 2.43695 1.49357
20: 9464.0026: 0.536531 0.528660 0.815410 2.76205 1.57875 1.23264 0.967970 0.906137 0.848656 1.46795 1.64950 3.74448 3.51519
30: 9438.4227: 0.509054 0.499686 0.802022 2.51326 1.58941 1.16397 0.953687 0.945138 0.866413 1.82941 1.85522 3.94124 3.81643
40: 9436.5278: 0.505799 0.502774 0.796290 2.24901 1.80736 1.12947 0.950205 0.943716 0.886813 1.88930 1.88995 3.97082 3.86238
50: 9436.0623: 0.504551 0.504001 0.797636 2.09106 1.93504 1.12729 0.952978 0.940317 0.886875 1.89121 1.89125 3.97327 3.86504
60: 9435.7612: 0.506152 0.503231 0.797753 1.91021 2.07479 1.12302 0.953801 0.943535 0.889764 1.89201 1.89158 3.97282 3.86487
70: 9435.6783: 0.505112 0.502485 0.798160 1.80524 2.18601 1.12564 0.954498 0.941676 0.889545 1.89308 1.89183 3.97412 3.86565
80: 9435.6745: 0.505145 0.502728 0.798128 1.77579 2.20829 1.12502 0.954498 0.941549 0.890174 1.89324 1.89149 3.97358 3.86560
Computing standard errors.
After the models is fitted, it can be inspected with summary()
.
summary(m1)
Log likelihood Deviance AIC BIC
-4717.837 9435.674 9461.674 9538.860
Fitted parameters:
Estimate Std. Error z value Pr(>|z|)
l2 0.5051 0.0234 21.63 <0.001
l3 0.5027 0.0233 21.54 <0.001
l4 0.7981 0.0290 27.57 <0.001
Sa1 1.7758 0.5175 3.43 <0.001
Se1 2.2083 0.4832 4.57 <0.001
th11 1.1250 0.1099 10.23 <0.001
th22 0.9545 0.0586 16.30 <0.001
th33 0.9415 0.0579 16.26 <0.001
th44 0.8902 0.0761 11.70 <0.001
b1 1.8932 0.0954 19.84 <0.001
b2 1.8915 0.0572 33.04 <0.001
b3 3.9736 0.0569 69.81 <0.001
b4 3.8656 0.0778 49.71 <0.001
svcm can fit models where the covariance matrix can be described as
The terms in the sum are specified by adding objects returned from
svc()
to the model. pm()
and
can be specified flexibly using general functions available in R.
The means are described with a similar structure
For the means, the terms in the sum are specified by adding objects
returned from mc()
to the model.
The data is described with the normal distribution