I have relational data, i.e. observations for pairs of objects. More specifically, these are migration rates between plant populations, which I would like to explain by a predictor. The migration rates are often zero but sometimes positive, so I would like to model them as a function of the predictor with a zero-inflated binomial model using glmmTMB:
library(glmmTMB)
#Create data
d <- data.frame(P1 = c('F01', 'F01', 'F01', 'F01', 'F01', 'F01', 'F01', 'F01', 'F04', 'F04', 'F04', 'F04', 'F04', 'F04', 'F04', 'F06', 'F06', 'F06', 'F06', 'F06', 'F06', 'F07', 'F07', 'F07', 'F07', 'F07', 'F08', 'F08', 'F08', 'F08', 'F10', 'F10', 'F10', 'F45', 'F45', 'F48'), #population 1 in the pair
P2 = c('F04', 'F06', 'F07', 'F08', 'F10', 'F45', 'F48', 'F51', 'F06', 'F07', 'F08', 'F10', 'F45', 'F48', 'F51', 'F07', 'F08', 'F10', 'F45', 'F48', 'F51', 'F08', 'F10', 'F45', 'F48', 'F51', 'F10', 'F45', 'F48', 'F51', 'F45', 'F48', 'F51', 'F48', 'F51', 'F51'), #population 2 in the pair
m = c(0.012008, 0, 0, 0, 0, 0, 0.001813, 0, 0.007568, 0.005158, 0, 0, 0.003051, 0.008608, 0.008016, 0, 0.002192, 0.001471, 0, 0, 0, 0.003279, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.001856, 0, 0, 0, 0.001138), #migration rate
pred = c(-3.43148, -0.225262, -0.771442, -0.082734, -0.473787, -0.510893, -1.608012, 0.071566, -0.043279, -0.174624, -0.06861, -0.178651, 0.302419, -1.45274, 0.142605, -0.988881, 0.170282, 0.485709, 0.309536, -0.43994, 1.372506, -0.45068, 0.032214, 0.510928, -0.700932, 1.169033, 0.336879, 0.591839, -0.463147, 2.471162, 0.389965, -0.372983, 1.385998, 0.894272, 1.053415, 0.747749), #a predictor
N.pair= c(412, 496, 354, 325, 521, 226, 875, 221, 520, 378, 349, 545, 250, 899, 245, 462, 433, 629, 334, 983, 329, 291, 487, 192, 841, 187, 458, 163, 812, 158, 359, 1008, 354, 713, 59, 708)) #the total number of individuals in the population pair, of which a proportion m are migrants
#Fit zero-inflated, binomial model
fit1 <- glmmTMB(m ~ pred, ziformula = ~pred, data = d, family = binomial, weights = N.pair)
summary(fit1)
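As a rough sanity check on the response, the base-R sketch below recomputes the share of zero rates and the migrant counts implied by m * N.pair. The names zero_share, k and frac_part are made up for illustration. Binomial likelihoods are defined for whole-number counts, so it can be worth seeing how far the implied counts are from integers before relying on the weighted fit.

```r
# Response and pair sizes, copied from the data frame d above
m <- c(0.012008, 0, 0, 0, 0, 0, 0.001813, 0, 0.007568, 0.005158, 0, 0,
       0.003051, 0.008608, 0.008016, 0, 0.002192, 0.001471, 0, 0, 0,
       0.003279, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.001856, 0, 0, 0, 0.001138)
N.pair <- c(412, 496, 354, 325, 521, 226, 875, 221, 520, 378, 349, 545,
            250, 899, 245, 462, 433, 629, 334, 983, 329, 291, 487, 192,
            841, 187, 458, 163, 812, 158, 359, 1008, 354, 713, 59, 708)

# Share of pairs with no observed migration
zero_share <- mean(m == 0)

# Migrant counts implied by rate times pair size
k <- m * N.pair

# Distance of each implied count from the nearest whole number
frac_part <- abs(k - round(k))

print(zero_share)
print(round(k[m > 0], 3))
print(max(frac_part))
```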
This model ignores the fact that the pairs are not independent of each other. Pairs sharing a common population might behave more similarly than other pairs. Therefore, a correlation structure must be defined. With the nlme package, I can do this using the function corMLPE from the package of the same name:
library(nlme)
library(corMLPE)
fit2 <- gls(m ~ pred, data=d, correlation=corMLPE(form=~P1+P2))
summary(fit2)
However, this model ignores that m represents zero-inflated proportion data.
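To make the dependence among pairs concrete, here is a small base-R sketch of which pairs share a population. As far as I understand it, this is the pattern corMLPE is meant to capture: two pairs may be correlated when they have a population in common and are treated as uncorrelated otherwise. The names pops, Z, S and n_sharing are illustrative only and do not come from any package.

```r
# Population labels as they appear in P1 and P2 above
pops <- c("F01", "F04", "F06", "F07", "F08", "F10", "F45", "F48", "F51")

# All unordered pairs, in the same order as the rows of d
pairs <- t(combn(pops, 2))
P1 <- pairs[, 1]
P2 <- pairs[, 2]

# Incidence matrix: one row per pair, one column per population,
# 1 if the population is a member of the pair, 0 otherwise
Z <- sapply(pops, function(p) as.integer(P1 == p | P2 == p))

# Number of populations shared by each pair of rows:
# 2 on the diagonal, 1 if two pairs share a population, 0 otherwise
S <- Z %*% t(Z)

# For each pair, how many other pairs share a population with it
n_sharing <- rowSums(S == 1)
print(table(n_sharing))
```

Every row of Z has exactly two ones, which is what distinguishes this structure from an ordinary single grouping factor.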
Is it possible to define a correlation structure for relational data in glmmTMB similar to that created with corMLPE? I have read the vignette by Kasper Kristensen and Maeve McGillycuddy, but I still don’t understand whether and how it’s possible to define an appropriate correlation structure. Does anyone know how to do it?
Adding + (1|P1 + P2) to your model formula will roughly do what you want. The vignette you linked covers the rather complex full range of correlation structures. I am guessing you just want a random intercept term. The above example is a simple additive RI term for population P1, and a separate/independent one for P2. You can modify as you wish.
Adding + (1|P1 + P2) to the formula will not help here. This would add a single random intercept term for the interaction between P1 and P2, so a random intercept is estimated for each combination of the levels in P1 and P2. Given that there is only one observation for each such combination in the sample, such a model cannot be fitted.
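The point that each combination of P1 and P2 occurs only once can be checked directly in base R. The sketch below rebuilds the pair labels from the question and counts observations per combination. It also tabulates how often each population appears in each column, which matters for any term that treats P1 and P2 as separate grouping factors. The names pops and n_per_combo are illustrative.

```r
# Population labels as they appear in P1 and P2 in the question
pops <- c("F01", "F04", "F06", "F07", "F08", "F10", "F45", "F48", "F51")

# All unordered pairs, in the same order as the rows of d
pairs <- t(combn(pops, 2))
P1 <- pairs[, 1]
P2 <- pairs[, 2]

# Observations per P1-by-P2 combination that actually occurs
n_per_combo <- table(paste(P1, P2, sep = ":"))
print(length(n_per_combo))
print(max(n_per_combo))

# How often each population appears in each column
count_P1 <- table(factor(P1, levels = pops))
count_P2 <- table(factor(P2, levels = pops))
print(rbind(P1 = count_P1, P2 = count_P2))
```

In this data set F01 never appears in P2 and F51 never appears in P1, so the two columns do not even share the same set of levels.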