How to define a correlation structure for relational data in glmmTMB?

Question

I have relational data, i.e. observations for pairs of objects. More specifically these are migration rates between plant populations, which I would like to explain by a predictor. The migration rates are often zero, but sometimes positive, thus I would like to model the migration rates as a function of the predictor with a zero-inflated, binomial model using glmmTMB:

library(glmmTMB)
#Create data
d <- data.frame(P1 = c('F01', 'F01', 'F01', 'F01', 'F01', 'F01', 'F01', 'F01', 'F04', 'F04', 'F04', 'F04', 'F04', 'F04', 'F04', 'F06', 'F06', 'F06', 'F06', 'F06', 'F06', 'F07', 'F07', 'F07', 'F07', 'F07', 'F08', 'F08', 'F08', 'F08', 'F10', 'F10', 'F10', 'F45', 'F45', 'F48'), #population 1 in the pair
               P2 = c('F04', 'F06', 'F07', 'F08', 'F10', 'F45', 'F48', 'F51', 'F06', 'F07', 'F08', 'F10', 'F45', 'F48', 'F51', 'F07', 'F08', 'F10', 'F45', 'F48', 'F51', 'F08', 'F10', 'F45', 'F48', 'F51', 'F10', 'F45', 'F48', 'F51', 'F45', 'F48', 'F51', 'F48', 'F51', 'F51'), #population 2 in the pair
               m = c(0.012008, 0, 0, 0, 0, 0, 0.001813, 0, 0.007568, 0.005158, 0, 0, 0.003051, 0.008608, 0.008016, 0, 0.002192, 0.001471, 0, 0, 0, 0.003279, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.001856, 0, 0, 0, 0.001138), #migration rate
               pred = c(-3.43148, -0.225262, -0.771442, -0.082734, -0.473787, -0.510893, -1.608012, 0.071566, -0.043279, -0.174624, -0.06861, -0.178651, 0.302419, -1.45274, 0.142605, -0.988881, 0.170282, 0.485709, 0.309536, -0.43994, 1.372506, -0.45068, 0.032214, 0.510928, -0.700932, 1.169033, 0.336879, 0.591839, -0.463147, 2.471162, 0.389965, -0.372983, 1.385998, 0.894272, 1.053415, 0.747749), #a predictor
               N.pair= c(412, 496, 354, 325, 521, 226, 875, 221, 520, 378, 349, 545, 250, 899, 245, 462, 433, 629, 334, 983, 329, 291, 487, 192, 841, 187, 458, 163, 812, 158, 359, 1008, 354, 713, 59, 708)) #the total number of individuals in the population pair, of which a proportion m are migrants
    
    #Fit zero-inflated, binomial model
    fit1 <- glmmTMB(m ~ pred, zi=~pred, data=d, family=binomial, weights=N.pair)
    summary(fit1)

This model ignores the fact that the pairs are not independent of each other. Pairs sharing a common population might behave more similar than other pairs. Therefore, a correlation structure must be defined. In the nlme package, I can do this with the function corMLPE of the package with the same name:

library(nlme)
library(corMLPE)
fit2 <- gls(m ~ pred, data=d, correlation=corMLPE(form=~P1+P2))
summary(fit2)

However, this model ignores that m represents zero-inflated proportion data.

Is it possible to define a correlation structure for relational data in glmmTMB similar to that created with corMLPE? I have read the vignette of Kasper Kristensen and Maeve McGillycuddy, but still I don’t understand whether and how it’s possible to define an appropriate correlation structure. Does anyone know how to do it?

Hi @Tobias Naaf, welcome to CV! I don't fully understand what relationships you're expecting, but I suspect adding + (1|P1 + P2) to your model formula will roughly do what you want. The vignette you linked has the rather complex full range of correlation structures. I am guessing you just want a random intercept term. The above example is a simple additive RI term for population P1, and a separate/independent one for P2. You can modify as you wish. — Alex J
– Alex J, Commented Nov 18 at 4:13
See the vignette cran.r-project.org/web/packages/glmmTMB/vignettes/glmmTMB.pdf for more information — Alex J
– Alex J, Commented Nov 18 at 4:14
Thank you, @Alex J. I think adding + (1|P1 + P2) to the formula will not help here. This would mean to add a single random intercept term for the interaction between P1 and P2. Thus, a random intercept is estimated for each combination of the levels in P1 and P2. Given that there is only one element for each of such combinations in the sample, such a model is not fitable. — Tobias Naaf
– Tobias Naaf, Commented Nov 18 at 9:01
the term I suggested has P1 and P2 as independent random effects on the intercept - it's not an interaction term. I still don't know if it is right thing for your experiment from a theory point though. — Alex J
– Alex J, Commented Nov 18 at 10:36

Ben Bolker · Accepted Answer · 2025-11-22 23:40:06Z

This is feasible, by setting up the random effects model matrix to match the Clarke model. A similar example (for multi-membership models) is here.

Here are the results from the gls fit above:

 printCoefmat(summary(fit2)$tTable, digits = 3)
                Value Std.Error t-value p-value    
(Intercept)  0.001560  0.000988    1.58 0.12375    
pred        -0.001829  0.000493   -3.71 0.00073 ***

And from the model fitted below:

 coef(summary(fit3, ddf = "kenward-roger"))
$cond
                Estimate   Std. Error   t value       ddf     Pr(>|t|)
(Intercept)  0.001559944 0.0009883464  1.578338  5.765884 0.1675588814
pred        -0.001828988 0.0004993087 -3.663040 29.125652 0.0009857646

The standard errors (and Z/t-statistics) are not quite the same; also, the degrees-of-freedom calculations are slightly different (I believe gls uses df=34)

The estimated correlation is the same:

s = sigma(fit3)
tau = sqrt(c(VarCorr(fit3)$cond[[1]]))
tau^2/(2*tau^2 + s^2)  ## 0.2638

library(glmmTMB)
library(Matrix)
## make a fake factor variable that has all of the required factor levels:
##  the assignment to particular observations doesn't matter right now
d$P12 <- factor(rep(union(d$P1, d$P2), length.out = nrow(d)))

## set up the model
gt0 <- glmmTMB(m~pred+(1|P12), data=d, doFit = FALSE, REML = TRUE)
gt1 <- fitTMB(gt0, doOptim = FALSE)

## construct the random-effects model matrix (W -> Z)
W <- matrix(0, nrow = nrow(d), ncol = length(levels(d$P12)))
colnames(W) <- levels(d$P12)
for (i in 1:nrow(d)) {
  W[i, d$P1[i]] <- 1
  W[i, d$P2[i]] <- 1
}
W <- as(Matrix(W), "TsparseMatrix")

## checking
image(W)
image(tcrossprod(W))

## correlation matrix from fitted model above (some hacking needed)
cc <- corMatrix(fit2$modelStruct$corStruct)
cc2 <- new("dgCMatrix", i = cc@i, x = cc@x, p = cc@p, Dim = cc@Dim)
image(cc2)

## now hack data and parameters

gt1$env$data$Z <- W
gt1$env$parameters$b <- rep(1, ncol(W))
## rebuild TMB object
gt1 <- with(gt1$env,
                TMB::MakeADFun(data,
                               parameters,
                               map = map,
                               random = random,
                               silent = silent,
                               DLL = "glmmTMB"))

## fit
fit <- with(gt1, nlminb(par, objective = fn, gr = gr))
fit3 <- finalizeTMB(gt0, gt1, fit)
```

Thank you, Ben, for this nice demonstration. In particular, it was good to see that tcrossprod(W) corresponds to the correlation matrix constructed with corMLPE. I guess, I can use the same procedure with a zero-inflated, binomial model when I replace not only gt1$env$data$Zby W but also gt1$env$data$Zzi (assuming the same random effect structure in the conditional and zero-inflated model part)? — Tobias Naaf
– Tobias Naaf, Commented Nov 25 at 11:30
Yes, that seems right. (And also replace $env$parameters$bzi) — Ben Bolker
– Ben Bolker, Commented Nov 25 at 14:29

Stack Exchange Network

How to define a correlation structure for relational data in glmmTMB?

1 Answer 1

Your Answer

Hot Network Questions

How to define a correlation structure for relational data in glmmTMB?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Related

Hot Network Questions