3
$\begingroup$

I have relational data, i.e. observations for pairs of objects. More specifically these are migration rates between plant populations, which I would like to explain by a predictor. The migration rates are often zero, but sometimes positive, thus I would like to model the migration rates as a function of the predictor with a zero-inflated, binomial model using glmmTMB:

library(glmmTMB)
#Create data
d <- data.frame(P1 = c('F01', 'F01', 'F01', 'F01', 'F01', 'F01', 'F01', 'F01', 'F04', 'F04', 'F04', 'F04', 'F04', 'F04', 'F04', 'F06', 'F06', 'F06', 'F06', 'F06', 'F06', 'F07', 'F07', 'F07', 'F07', 'F07', 'F08', 'F08', 'F08', 'F08', 'F10', 'F10', 'F10', 'F45', 'F45', 'F48'), #population 1 in the pair
               P2 = c('F04', 'F06', 'F07', 'F08', 'F10', 'F45', 'F48', 'F51', 'F06', 'F07', 'F08', 'F10', 'F45', 'F48', 'F51', 'F07', 'F08', 'F10', 'F45', 'F48', 'F51', 'F08', 'F10', 'F45', 'F48', 'F51', 'F10', 'F45', 'F48', 'F51', 'F45', 'F48', 'F51', 'F48', 'F51', 'F51'), #population 2 in the pair
               m = c(0.012008, 0, 0, 0, 0, 0, 0.001813, 0, 0.007568, 0.005158, 0, 0, 0.003051, 0.008608, 0.008016, 0, 0.002192, 0.001471, 0, 0, 0, 0.003279, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.001856, 0, 0, 0, 0.001138), #migration rate
               pred = c(-3.43148, -0.225262, -0.771442, -0.082734, -0.473787, -0.510893, -1.608012, 0.071566, -0.043279, -0.174624, -0.06861, -0.178651, 0.302419, -1.45274, 0.142605, -0.988881, 0.170282, 0.485709, 0.309536, -0.43994, 1.372506, -0.45068, 0.032214, 0.510928, -0.700932, 1.169033, 0.336879, 0.591839, -0.463147, 2.471162, 0.389965, -0.372983, 1.385998, 0.894272, 1.053415, 0.747749), #a predictor
               N.pair= c(412, 496, 354, 325, 521, 226, 875, 221, 520, 378, 349, 545, 250, 899, 245, 462, 433, 629, 334, 983, 329, 291, 487, 192, 841, 187, 458, 163, 812, 158, 359, 1008, 354, 713, 59, 708)) #the total number of individuals in the population pair, of which a proportion m are migrants
    
    #Fit zero-inflated, binomial model
    fit1 <- glmmTMB(m ~ pred, zi=~pred, data=d, family=binomial, weights=N.pair)
    summary(fit1)

This model ignores the fact that the pairs are not independent of each other. Pairs sharing a common population might behave more similar than other pairs. Therefore, a correlation structure must be defined. In the nlme package, I can do this with the function corMLPE of the package with the same name:

library(nlme)
library(corMLPE)
fit2 <- gls(m ~ pred, data=d, correlation=corMLPE(form=~P1+P2))
summary(fit2)

However, this model ignores that m represents zero-inflated proportion data.

Is it possible to define a correlation structure for relational data in glmmTMB similar to that created with corMLPE? I have read the vignette of Kasper Kristensen and Maeve McGillycuddy, but still I don’t understand whether and how it’s possible to define an appropriate correlation structure. Does anyone know how to do it?

$\endgroup$
4
  • $\begingroup$ Hi @Tobias Naaf, welcome to CV! I don't fully understand what relationships you're expecting, but I suspect adding + (1|P1 + P2) to your model formula will roughly do what you want. The vignette you linked has the rather complex full range of correlation structures. I am guessing you just want a random intercept term. The above example is a simple additive RI term for population P1, and a separate/independent one for P2. You can modify as you wish. $\endgroup$ Commented Nov 18 at 4:13
  • $\begingroup$ See the vignette cran.r-project.org/web/packages/glmmTMB/vignettes/glmmTMB.pdf for more information $\endgroup$ Commented Nov 18 at 4:14
  • $\begingroup$ Thank you, @Alex J. I think adding + (1|P1 + P2) to the formula will not help here. This would mean to add a single random intercept term for the interaction between P1 and P2. Thus, a random intercept is estimated for each combination of the levels in P1 and P2. Given that there is only one element for each of such combinations in the sample, such a model is not fitable. $\endgroup$ Commented Nov 18 at 9:01
  • $\begingroup$ the term I suggested has P1 and P2 as independent random effects on the intercept - it's not an interaction term. I still don't know if it is right thing for your experiment from a theory point though. $\endgroup$ Commented Nov 18 at 10:36

1 Answer 1

1
$\begingroup$

This is feasible, by setting up the random effects model matrix to match the Clarke model. A similar example (for multi-membership models) is here.

Here are the results from the gls fit above:

 printCoefmat(summary(fit2)$tTable, digits = 3)
                Value Std.Error t-value p-value    
(Intercept)  0.001560  0.000988    1.58 0.12375    
pred        -0.001829  0.000493   -3.71 0.00073 ***

And from the model fitted below:

 coef(summary(fit3, ddf = "kenward-roger"))
$cond
                Estimate   Std. Error   t value       ddf     Pr(>|t|)
(Intercept)  0.001559944 0.0009883464  1.578338  5.765884 0.1675588814
pred        -0.001828988 0.0004993087 -3.663040 29.125652 0.0009857646

The standard errors (and Z/t-statistics) are not quite the same; also, the degrees-of-freedom calculations are slightly different (I believe gls uses df=34)

The estimated correlation is the same:

s = sigma(fit3)
tau = sqrt(c(VarCorr(fit3)$cond[[1]]))
tau^2/(2*tau^2 + s^2)  ## 0.2638
library(glmmTMB)
library(Matrix)
## make a fake factor variable that has all of the required factor levels:
##  the assignment to particular observations doesn't matter right now
d$P12 <- factor(rep(union(d$P1, d$P2), length.out = nrow(d)))

## set up the model
gt0 <- glmmTMB(m~pred+(1|P12), data=d, doFit = FALSE, REML = TRUE)
gt1 <- fitTMB(gt0, doOptim = FALSE)

## construct the random-effects model matrix (W -> Z)
W <- matrix(0, nrow = nrow(d), ncol = length(levels(d$P12)))
colnames(W) <- levels(d$P12)
for (i in 1:nrow(d)) {
  W[i, d$P1[i]] <- 1
  W[i, d$P2[i]] <- 1
}
W <- as(Matrix(W), "TsparseMatrix")

## checking
image(W)
image(tcrossprod(W))

## correlation matrix from fitted model above (some hacking needed)
cc <- corMatrix(fit2$modelStruct$corStruct)
cc2 <- new("dgCMatrix", i = cc@i, x = cc@x, p = cc@p, Dim = cc@Dim)
image(cc2)

## now hack data and parameters

gt1$env$data$Z <- W
gt1$env$parameters$b <- rep(1, ncol(W))
## rebuild TMB object
gt1 <- with(gt1$env,
                TMB::MakeADFun(data,
                               parameters,
                               map = map,
                               random = random,
                               silent = silent,
                               DLL = "glmmTMB"))

## fit
fit <- with(gt1, nlminb(par, objective = fn, gr = gr))
fit3 <- finalizeTMB(gt0, gt1, fit)
```
$\endgroup$
2
  • $\begingroup$ Thank you, Ben, for this nice demonstration. In particular, it was good to see that tcrossprod(W) corresponds to the correlation matrix constructed with corMLPE. I guess, I can use the same procedure with a zero-inflated, binomial model when I replace not only gt1$env$data$Zby W but also gt1$env$data$Zzi (assuming the same random effect structure in the conditional and zero-inflated model part)? $\endgroup$ Commented Nov 25 at 11:30
  • $\begingroup$ Yes, that seems right. (And also replace $env$parameters$bzi) $\endgroup$ Commented Nov 25 at 14:29

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.