
Here is a basic state space model (state equation $x_t$, observation equation $y_t$ and initial distribution of state):

$$x_t = x_{t-1} + w_t, \quad w_t \sim N(0, \sigma^2_w)$$ $$y_t = x_t + v_t, \quad v_t \sim N(0, \sigma^2_v)$$ $$x_0 \sim N(m_0, \sigma^2_0)$$
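To make the setup concrete, this basic model can be simulated in a few lines of R (a quick sketch; the parameter values are arbitrary choices for illustration):

```r
# Simulate the basic state space (local level) model above.
# All parameter values below are arbitrary illustrative choices.
set.seed(1)
T_len <- 50          # number of time steps
m0 <- 0; s0 <- 1     # initial state mean and sd
sw <- 0.3            # state noise sd
sv <- 0.4            # observation noise sd

x <- numeric(T_len + 1)
x[1] <- rnorm(1, mean = m0, sd = s0)                # x_0
for (t in 2:(T_len + 1)) {
  x[t] <- x[t - 1] + rnorm(1, mean = 0, sd = sw)    # random walk state
}
y <- x + rnorm(T_len + 1, mean = 0, sd = sv)        # noisy observations y_t
```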

I want to add random effects to this model:

$$y_{jt} = \mu + \alpha_j + x_{jt} + v_{jt}, \quad v_{jt} \sim N(0, \sigma^2_v)$$ $$x_{jt} = x_{j,t-1} + w_{jt}, \quad w_{jt} \sim N(0, \sigma^2_w)$$ $$\alpha_j \sim N(0, \sigma^2_\alpha)$$ $$x_{j0} \sim N(m_0, \sigma^2_0)$$

I am having difficulty finding references that show whether this is possible and how to do it, so I am trying to derive it myself using a general approach, treating it as if it were a linear mixed-effects regression.

For parameters $\theta = (\mu, \sigma^2_w, \sigma^2_v, \sigma^2_\alpha, m_0, \sigma^2_0)$, the full density should be:

$$p(\mathbf{y}, \mathbf{x}, \boldsymbol{\alpha} | \theta) = p(\mathbf{y}|\mathbf{x}, \boldsymbol{\alpha}, \theta) \cdot p(\mathbf{x}|\theta) \cdot p(\boldsymbol{\alpha}|\theta)$$

For the observation equation, since $y_{jt} | x_{jt}, \alpha_j \sim N(\mu + \alpha_j + x_{jt}, \sigma^2_v)$ the likelihood is:

$$p(\mathbf{y}|\mathbf{x}, \boldsymbol{\alpha}, \theta) = (2\pi \sigma^2_v)^{-JT/2} \exp\left\{-\frac{1}{2\sigma^2_v}\sum_{j=1}^{J}\sum_{t=1}^{T}(y_{jt} - \mu - \alpha_j - x_{jt})^2\right\}$$

For the state equation, since $x_{jt} | x_{j,t-1} \sim N(x_{j,t-1}, \sigma^2_w)$ and $x_{j0} \sim N(m_0, \sigma^2_0)$:

$$p(\mathbf{x}|\theta) = (2\pi \sigma^2_0)^{-J/2} \exp\left\{-\frac{1}{2\sigma^2_0}\sum_{j=1}^{J}(x_{j0} - m_0)^2\right\} \times$$ $$(2\pi \sigma^2_w)^{-JT/2} \exp\left\{-\frac{1}{2\sigma^2_w}\sum_{j=1}^{J}\sum_{t=1}^{T}(x_{jt} - x_{j,t-1})^2\right\}$$

The random effects likelihood is:

$$p(\boldsymbol{\alpha}|\theta) = (2\pi\sigma^2_\alpha)^{-J/2} \exp\left\{-\frac{1}{2\sigma^2_\alpha}\sum_{j=1}^{J}\alpha_j^2\right\}$$

Combining everything together (there are $2JT + 2J = 2J(T+1)$ Gaussian factors in total, so the power of $2\pi$ is $-J(T+1)$):

$$p(\mathbf{y}, \mathbf{x}, \boldsymbol{\alpha} | \theta) = (2\pi)^{-J(T+1)} (\sigma^2_v)^{-JT/2} (\sigma^2_w)^{-JT/2} (\sigma^2_0)^{-J/2} (\sigma^2_\alpha)^{-J/2} \times$$ $$\exp\left\{-\frac{1}{2}\left[\frac{1}{\sigma^2_v}\sum_{j=1}^{J}\sum_{t=1}^{T}(y_{jt} - \mu - \alpha_j - x_{jt})^2 + \frac{1}{\sigma^2_w}\sum_{j=1}^{J}\sum_{t=1}^{T}(x_{jt} - x_{j,t-1})^2\right.\right.$$ $$\left.\left.+ \frac{1}{\sigma^2_0}\sum_{j=1}^{J}(x_{j0} - m_0)^2 + \frac{1}{\sigma^2_\alpha}\sum_{j=1}^{J}\alpha_j^2\right]\right\}$$
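As a sanity check on the algebra, the complete-data log density can be evaluated term by term with `dnorm` (a sketch; `loglik_complete` is a name I've made up, with `y` a $J \times T$ matrix and `x` a $J \times (T+1)$ matrix whose first column holds the $x_{j0}$):

```r
# Complete-data log density log p(y, x, alpha | theta), following the
# derivation above. y: J x T (t = 1..T); x: J x (T+1) (column 1 is x_{j0}).
loglik_complete <- function(y, x, alpha, mu, sv, sw, s0, sa, m0) {
  Tn <- ncol(y)
  ll <- sum(dnorm(y, mean = mu + alpha + x[, -1], sd = sv, log = TRUE))       # observation terms
  ll <- ll + sum(dnorm(x[, 1], mean = m0, sd = s0, log = TRUE))               # initial states x_{j0}
  ll <- ll + sum(dnorm(x[, -1], mean = x[, -(Tn + 1)], sd = sw, log = TRUE))  # state transitions
  ll + sum(dnorm(alpha, mean = 0, sd = sa, log = TRUE))                       # random effects alpha_j
}
```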

The marginal likelihood requires integration (I haven't worked everything out yet, but I think it might be normal):

$$p(\mathbf{y}|\theta) = \int \int p(\mathbf{y}, \mathbf{x}, \boldsymbol{\alpha} | \theta) \, d\mathbf{x} \, d\boldsymbol{\alpha}$$
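If I'm right that it is normal, the per-subject marginal can be written down directly: $E[y_{jt}] = \mu + m_0$ and $\mathrm{Cov}(y_{js}, y_{jt}) = \sigma_\alpha^2 + \sigma_0^2 + \min(s,t)\,\sigma_w^2 + \sigma_v^2\,\mathbf{1}\{s=t\}$, with subjects independent given $\theta$. A sketch in base R (`marginal_loglik_j` is a made-up name for illustration):

```r
# Log marginal likelihood p(y_j | theta) for a single subject j, using
# the Gaussian covariance structure implied by the model:
#   E[y_jt] = mu + m0
#   Cov(y_js, y_jt) = sa^2 + s0^2 + min(s, t) * sw^2 + sv^2 * 1{s = t}
marginal_loglik_j <- function(yj, mu, m0, sv, sw, s0, sa) {
  Tn <- length(yj)
  tt <- 1:Tn
  Sigma <- sa^2 + s0^2 + sw^2 * outer(tt, tt, pmin) + diag(sv^2, Tn)
  r <- yj - (mu + m0)
  L <- chol(Sigma)                          # Sigma = t(L) %*% L
  z <- backsolve(L, r, transpose = TRUE)    # whitened residuals
  -0.5 * (Tn * log(2 * pi) + 2 * sum(log(diag(L))) + sum(z^2))
}
```

The full log marginal likelihood would then be the sum of `marginal_loglik_j` over $j = 1, \dots, J$.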

From here, the plan is to estimate the posterior distributions of all parameters using a Bayesian approach. After placing priors on all parameters, I would use this likelihood with some version of MCMC to estimate the posteriors of all parameters (completely bypassing the Kalman filter).

Is this a statistically valid estimation process for adding random effects to a model like this?

  • I am not an expert in this area, but your formulation reminds me of a recent(ish) development in the structural equation modeling (SEM) world, called dynamic SEM (DSEM). DSEM allows you to do time series analysis when $N > 1$ by incorporating random effects. See tandfonline.com/doi/full/10.1080/00273171.2018.1446819. It is implemented in the Mplus software. Commented Oct 19 at 18:21

1 Answer


Your maths looks correct to me. It's always possible to add random effects to a model; the questions to ask are why you are doing it, and whether these effects are estimable from the data you have available. Note that by adding random effects in this way, you're changing the data requirement from a single time series $(y_t)$ to a matrix of values $(y_{jt})$. That's not a problem in itself, but it's important to bear in mind.

As I said, what you've done seems correct to me so far. In practice, some kind of software would be used to carry out the model-fitting computations. For example, the NIMBLE R package could be used as follows: I generated some simulated data and then used NIMBLE to fit the model.

library(nimble)
library(MCMCvis)

# Generate simulated data
set.seed(1)

J  <- 10    # number of subjects
N  <- 10    # number of time steps after time 0
sa <- 1     # sd of random effects alpha_j
m0 <- 0.2   # initial state mean
s0 <- 1.5   # initial state sd
sw <- 0.3   # state noise sd
mu <- 0     # global intercept
sv <- 0.4   # observation noise sd

alpha <- rnorm(J, mean = 0, sd = sa)
x <- array(0, dim = c(J, N + 1))
y <- array(0, dim = c(J, N + 1))
x[, 1] <- rnorm(J, mean = m0, sd = s0)                  # initial states x_{j0}
for (i in 1:N) {
  x[, i + 1] <- x[, i] + rnorm(J, mean = 0, sd = sw)    # random walk states
}
for (i in 1:(N + 1)) {
  y[, i] <- x[, i] + mu + alpha + rnorm(J, mean = 0, sd = sv)   # observations
}


# NIMBLE Model code and set-up

code <- nimbleCode({
  
  for(i in 1:(N+1)) {
    for(j in 1:J) {
      y[j,i] ~ dnorm(x[j,i]+mu+alpha[j],sd=sv)
    }
  }
  
  for(j in 1:J) {
    x[j,1] ~ dnorm(m0,sd=s0)
  }
  
  for(i in 2:(N+1)) {
    for(j in 1:J) {
      x[j,i] ~ dnorm(x[j,i-1],sd=sw)
    }
  }
  
  for(j in 1:J) {
    alpha[j] ~ dnorm(0,sd=sa)
  }
  
  sa ~ dunif(min=0,max=5)
  s0 ~ dunif(min=0,max=5)
  sw ~ dunif(min=0,max=5)
  sv ~ dunif(min=0,max=5)
  mu ~ dunif(min=-5,max=5)
  m0 ~ dunif(min=-5,max=5)
})

data <- list(y=y)
consts <- list(N=N,J=J)

inits <- list(mu=0,m0=0,sa=1,sv=1,sw=1,s0=1)

model<-nimbleModel(code=code,name="model",data=data,inits=inits,constants=consts)

mConf <- configureMCMC(model, print = TRUE,monitors=c("mu","m0","sa","sv","sw","s0"))
mMCMC <- buildMCMC(mConf)
Cm <- compileNimble(model)
CmMCMC <- compileNimble(mMCMC, project = Cm)  
samples <- runMCMC(CmMCMC)

s1=MCMCsummary(samples)

However, there is a problem with the model as it stands. You'll see that the expected value of $(y_{jt})$ is equal to $m_0 + \mu$, which means that we cannot disentangle the effects of $\mu$ and $m_0$ if we only observe $(y_{jt})$. So it's probably better to replace $m_0$ with $0$ and remove it as a parameter to be estimated. This involves minor modifications to the code above.

I tried fitting this model, but the results weren't great. Another issue is that the marginal variance of $y_{jt}$ is $\sigma_\alpha^2 + \sigma_0^2 + t\sigma_w^2 + \sigma_v^2$, while the covariance between $y_{js}$ and $y_{jt}$ for $s \neq t$ is $\sigma_\alpha^2 + \sigma_0^2 + \min(s,t)\,\sigma_w^2$: the parameters $\sigma_\alpha^2$ and $\sigma_0^2$ enter only through their sum, so they cannot be disentangled. I tried setting $x_{j0} = 0$ and removing $\sigma_0$ from the model, so that there are just four parameters. Again, small modifications to the code are needed. This approach seems to work better, as all the parameters of the resulting model are now identifiable.
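For concreteness, the reduced model might be coded like this (an untested sketch of the small modifications mentioned above, with $m_0$ fixed at $0$ and the initial states pinned at $x_{j0} = 0$, reusing the `y`, `N` and `J` objects from the earlier code):

```r
library(nimble)

# Reduced model: m0 fixed at 0 and x_{j0} fixed at 0, leaving four
# parameters (mu, sw, sv, sa). A sketch, not a tested drop-in.
code2 <- nimbleCode({
  for(i in 1:(N+1)) {
    for(j in 1:J) {
      y[j,i] ~ dnorm(x[j,i] + mu + alpha[j], sd = sv)
    }
  }
  for(j in 1:J) {
    x[j,1] <- 0                    # initial state pinned at zero
  }
  for(i in 2:(N+1)) {
    for(j in 1:J) {
      x[j,i] ~ dnorm(x[j,i-1], sd = sw)
    }
  }
  for(j in 1:J) {
    alpha[j] ~ dnorm(0, sd = sa)
  }
  sa ~ dunif(min = 0, max = 5)
  sw ~ dunif(min = 0, max = 5)
  sv ~ dunif(min = 0, max = 5)
  mu ~ dunif(min = -5, max = 5)
})
```

The monitors and initial values then drop `"m0"` and `"s0"` accordingly.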

