Stochastic process based on mixed effects regression

Question

Suppose I have $n$ items.
Each item has some baseline properties that are constant. For example, the $i^{th}$ item has properties (i.e. covariates) $w_i$ and $z_i$ that are constant. Each item has a property $X_t$ which follows a stochastic process.
For the $i^{th}$ item, $X_t$ is defined as:

$$ X_{t+1} = \begin{cases} X_t + \epsilon_t & \text{if } X_t > 0 \\ 0 & \text{if } X_t \leq 0 \end{cases} $$

$$ \epsilon_t \sim N(\mu_t, \sigma^2) $$

$$ \mu_t = -kX_t $$

I showed some simulations as to how this stochastic process will look - I purposefully wanted this stochastic process to decay towards 0 as time goes on:
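A minimal sketch of how such paths could be simulated, following the definition above literally (a value that drops to or below 0 is followed by 0 from then on). The function name `simulate_path` and the values of `x0`, `k`, `sigma` and the number of steps are illustrative assumptions, not taken from the original simulations:

```python
import numpy as np

def simulate_path(x0, k, sigma, n_steps, rng):
    """Simulate X_{t+1} = X_t + eps_t with eps_t ~ N(-k * X_t, sigma^2),
    and X_{t+1} = 0 whenever X_t <= 0."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for t in range(n_steps):
        if x[t] > 0:
            # rng.normal takes the standard deviation, not the variance
            x[t + 1] = x[t] + rng.normal(-k * x[t], sigma)
        else:
            x[t + 1] = 0.0
    return x

rng = np.random.default_rng(0)
# Five illustrative paths; all parameter values are assumptions
paths = np.array([simulate_path(10.0, 0.1, 0.5, 100, rng) for _ in range(5)])
```

Each row of `paths` is one trajectory, so the rows can be plotted against the time index to inspect the decay.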

Without taking the baseline properties into consideration, to estimate the parameters of this stochastic process, I can write the likelihood and optimize it:

For a single transition when $X_t > 0$, the probability density is:

$$ f(X_{t+1}|X_t; k, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(X_{t+1} - X_t + kX_t)^2}{2\sigma^2}\right) $$

When $X_t \leq 0$ (by definition):

$$ f(X_{t+1}|X_t; k, \sigma^2) = \mathbb{1}(X_{t+1} = 0) $$

For multiple trajectories $i = 1,...,N$, each of length $T$, the complete likelihood function is (to be numerically solved):

$$ \mathcal{L}(k, \sigma^2) = \prod_{i=1}^N \prod_{t=1}^{T-1} \left[ \left(\frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(X^i_{t+1} - X^i_t + kX^i_t)^2}{2\sigma^2}\right)\right)^{\mathbb{1}(X^i_t > 0)} \cdot (1)^{\mathbb{1}(X^i_t \leq 0)} \right] $$

$$ \log \mathcal{L}(k, \sigma^2) = \sum_{i=1}^N \sum_{t=1}^{T-1} \mathbb{1}(X^i_t > 0) \left[-\frac{1}{2}\log(2\pi\sigma^2) - \frac{(X^i_{t+1} - X^i_t + kX^i_t)^2}{2\sigma^2}\right] $$

$$ \frac{\partial \log \mathcal{L}}{\partial k} = ...$$

$$ \frac{\partial \log \mathcal{L}}{\partial \sigma^2} = ...$$
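As a hedged sketch of where these score equations lead: assuming the transition mean is $X_t - kX_t$ (from $\mu_t = -kX_t$), setting both derivatives to zero gives the least-squares-type estimates $\hat{k} = -\sum X_t (X_{t+1} - X_t) / \sum X_t^2$ and $\hat{\sigma}^2$ equal to the mean squared residual, with sums over the transitions where $X_t > 0$. The function name `fit_k_sigma2` is hypothetical:

```python
import numpy as np

def fit_k_sigma2(paths):
    """Closed-form maximisers of the log-likelihood over transitions with X_t > 0.

    paths: array-like of shape (N, T), one trajectory per row.
    Returns (k_hat, sigma2_hat).
    """
    paths = np.asarray(paths, dtype=float)
    x_t = paths[:, :-1].ravel()
    x_next = paths[:, 1:].ravel()
    active = x_t > 0                 # transitions from X_t <= 0 carry no information
    x_act = x_t[active]
    dx = (x_next - x_t)[active]
    k_hat = -np.sum(x_act * dx) / np.sum(x_act ** 2)
    resid = dx + k_hat * x_act       # X_{t+1} - X_t + k * X_t
    sigma2_hat = np.mean(resid ** 2)
    return k_hat, sigma2_hat

# Noise-free check: each value is half the previous one, so k should be 0.5
k_hat, sigma2_hat = fit_k_sigma2([[4.0, 2.0, 1.0, 0.5]])
```

On real data a numerical optimiser should reach the same values, which makes this a convenient check of a hand-written likelihood.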

I have the following question: Suppose I believe that the rate of decay depends on the baseline properties of each item. Can I model this using a mixed effects regression model and then link this to the stochastic process?

Naively, I would write the decay as a mixed effects model:

$$ k_i = \beta_0 + \beta_1w_i + \beta_2z_i + u_i $$ $$ u_i \sim N(0, \tau^2) $$

This would then result in a modified stochastic process:

$$ X_{t+1} = \begin{cases} X_t + \epsilon_t & \text{if } X_t > 0 \\ 0 & \text{if } X_t \leq 0 \end{cases} $$ $$ \epsilon_t \sim N(\mu_t, \sigma^2) $$ $$ \mu_t = -(k_i)X_t = -(\beta_0 + \beta_1w_i + \beta_2z_i + u_i)X_t $$

Finally, as is done in mixed effects regression, I would write the likelihood with an integral to take into consideration the probabilistic element of the random effects (numerically optimized using more involved methods, e.g. Hermite quadrature, Laplace):

$$ f(X_{t+1}|X_t, w_i, z_i; \beta_0, \beta_1, \beta_2, \tau^2, \sigma^2) = \int_{u_i} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(X_{t+1} - X_t + (\beta_0 + \beta_1w_i + \beta_2z_i + u_i)X_t)^2}{2\sigma^2}\right) \cdot \frac{1}{\tau\sqrt{2\pi}} \exp\left(-\frac{u_i^2}{2\tau^2}\right) du_i $$

$$ \log \mathcal{L}(\beta_0, \beta_1, \beta_2, \tau^2, \sigma^2) = \sum_{i=1}^N \log \int_{u_i} \prod_{t=1}^{T-1} \left[ \left(\frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(X^i_{t+1} - X^i_t + (\beta_0 + \beta_1w_i + \beta_2z_i + u_i)X^i_t)^2}{2\sigma^2}\right)\right)^{\mathbb{1}(X^i_t > 0)} \cdot (1)^{\mathbb{1}(X^i_t \leq 0)} \right] \cdot \frac{1}{\tau\sqrt{2\pi}} \exp\left(-\frac{u_i^2}{2\tau^2}\right) du_i $$
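A minimal sketch of how this marginal log-likelihood could be evaluated with Gauss-Hermite quadrature, using NumPy's `hermgauss` nodes and weights and the change of variables $u_i = \sqrt{2}\,\tau s$. The function name `marginal_loglik`, the log-scale parameterisation of $\tau$ and $\sigma$, and the default of 20 nodes are assumptions made for illustration:

```python
import numpy as np

def marginal_loglik(params, paths, w, z, n_nodes=20):
    """Marginal log-likelihood with u_i integrated out by Gauss-Hermite quadrature.

    params = (beta0, beta1, beta2, log_tau, log_sigma); the log scale keeps
    tau and sigma positive (a choice made here, not part of the question).
    paths: sequence of 1-D arrays, one trajectory per item.
    w, z: per-item baseline covariates.
    """
    beta0, beta1, beta2, log_tau, log_sigma = params
    tau, sigma = np.exp(log_tau), np.exp(log_sigma)
    # Nodes and weights for integrals of the form  int exp(-s^2) f(s) ds
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    u = np.sqrt(2.0) * tau * nodes   # change of variables u = sqrt(2) * tau * s

    total = 0.0
    for x, w_i, z_i in zip(paths, w, z):
        x = np.asarray(x, dtype=float)
        x_t, x_next = x[:-1], x[1:]
        active = x_t > 0             # transitions from X_t <= 0 contribute a factor of 1
        x_act = x_t[active]
        dx = (x_next - x_t)[active]
        k = beta0 + beta1 * w_i + beta2 * z_i + u           # one k_i per quadrature node
        resid = dx[None, :] + k[:, None] * x_act[None, :]   # shape (n_nodes, n_active)
        cond = np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                      - resid**2 / (2 * sigma**2), axis=1)
        # log of (1/sqrt(pi)) * sum_j weight_j * exp(cond_j), computed via log-sum-exp
        log_terms = np.log(weights) + cond
        m = log_terms.max()
        total += m + np.log(np.sum(np.exp(log_terms - m))) - 0.5 * np.log(np.pi)
    return total

# One item with a single transition from X_t = 1 to X_{t+1} = 0.7
value = marginal_loglik((0.5, 0.0, 0.0, np.log(0.5), 0.0),
                        [np.array([1.0, 0.7])], w=[0.0], z=[0.0])
```

For a single transition the integral is available in closed form, $X_{t+1} - X_t \sim N(-\beta_0 X_t, \sigma^2 + \tau^2 X_t^2)$ when the covariates are zero, which gives a quick check of the quadrature. The negative of this function could then be handed to a general-purpose optimiser such as `scipy.optimize.minimize`.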

Are these kinds of approaches ever taken in statistics? Can stochastic processes have parameters that are estimated using mixed effects regression?

I'm unfamiliar with the approach you are taking, but there is an econometric model that almost gets you where you want to go, except that it is usually used with only one explanatory variable. Two variables could possibly be incorporated, but I haven't seen such an approach. The model is called the Koyck distributed lag (it is well known in the econometrics literature) and can be viewed as a special case of an ARDL model. If you google "Koyck distributed lag" or search this site, many references will arise; I can't recommend a single one because there are too many.
– mlofton, Commented Feb 8 at 1:41
I'm not familiar enough with your approach to see any major problems, but I can see that you are using two variables in the model for the rate, which means that you would need two variables in your distributed lag as well. That's the only issue I see off the top of my head.
– mlofton, Commented Feb 8 at 1:43

Théo Michelot · Accepted Answer · 2025-03-28 20:52:28Z


Yes, these types of models are sometimes used. See for example:

Picchini, De Gaetano & Ditlevsen (2010). Stochastic differential mixed-effects models. Scandinavian Journal of Statistics, 37(1), 67-90.

I worked on an R package called smoothSDE, which can fit such models. It uses the Laplace approximation when random effects are included (through the package Template Model Builder). The model you propose is somewhat similar to an Ornstein-Uhlenbeck process, which is implemented in smoothSDE.

General description of the methods: Michelot, Glennie, Harris & Thomas (2021). Varying-coefficient stochastic differential equations with applications in ecology. Journal of Agricultural, Biological and Environmental Statistics, 26, 446-463.
Github repository: https://github.com/TheoMichelot/smoothSDE/


Thanks Théo! These references describe SDEs. Is there something for simpler stochastic processes like the one I described?
– stats_noob, Commented Mar 28 at 22:13
I think the model you described could actually be written directly as a mixed-effects Gaussian regression model (and fitted using lme4/glmmTMB/brms). But it seems that, to create the types of paths that you showed, you would need to constrain $k \in (0, 1)$, which makes this a little harder and might require a custom implementation.
– Théo Michelot, Commented Mar 31 at 12:51

mlofton · Accepted Answer · 2025-02-16 11:22:10Z

This is not an answer but there is more space here so I figured it's best to use it. This may help you slightly as far as simplifying your framework. I read your setup again and noticed that the first part of your model can be written as the following (aside from when $X_t$ is negative):

$X_{t} = (1-k) \times X_{t-1} + \epsilon_t$

where $\epsilon_t \sim N(0,\sigma^2)$

Note that the model above can also be viewed as an AR(1) process with AR coefficient $(1-k)$. Maybe this helps you slightly.
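To spell out the step behind this rewriting, here is a short derivation under the question's definitions, ignoring the absorption at zero. The symbol $\eta_{t+1}$ is introduced here for the recentred noise and is not in the original post:

$$ X_{t+1} = X_t + \epsilon_t, \quad \epsilon_t \sim N(-kX_t, \sigma^2) $$

$$ \epsilon_t = -kX_t + \eta_{t+1}, \quad \eta_{t+1} \sim N(0, \sigma^2) $$

$$ X_{t+1} = (1-k)X_t + \eta_{t+1}, \qquad E[X_{t+h} \mid X_t] = (1-k)^h X_t $$

So the conditional mean shrinks geometrically towards 0, without changing sign, only when $0 < k < 1$.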

At the same time, I would still check out the Koyck distributed lag. It achieves similar reversion-type behavior. Note that the Koyck distributed lag does not handle the negative case in your model, namely that if the previous response is negative, then the next value is zero. I guess that rule is there to keep the response greater than or equal to zero for all $t$?

P.S. : Also, in your third dot, for $X_{t} > 0$, it should be $\epsilon_{t+1}$ rather than $\epsilon_t$ because the error term should be aligned with the response.
