4
$\begingroup$

I am currently working on a longitudinal dataset in which I aim to cluster individuals based on the trajectory of a single continuous variable measured repeatedly across time (e.g., daily values). The goal is to identify distinct trajectory-based subgroups within the study population and subsequently compare these subgroups in terms of clinical or other downstream outcomes.

Initially, I considered using factor analysis and latent class/profile analysis (LCA/LPA), but I soon realized that these methods are typically designed for situations involving multiple observed variables, often for purposes such as dimensionality reduction or latent construct identification. While these methods might technically accommodate repeated measurements as separate variables, I am uncertain whether they are appropriate or statistically valid when applied to only one construct measured at multiple time points.

As I continued my search, I came across Latent Class Growth Analysis (LCGA) and Group-Based Trajectory Modeling (GBTM), which seem specifically designed for clustering individuals based on their longitudinal patterns of one specific variable. From what I understand, GBTM is essentially a special case of LCGA which assumes that error variance is the same for all classes and all time points. I also encountered Growth Mixture Modeling (GMM), which allows for within-class variability (i.e., random effects).
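
In notation, the distinction as I currently understand it can be sketched as follows. This is a minimal sketch assuming a linear trajectory for simplicity; the symbols are my own and are not taken from any particular reference. For individual $i$ at time $t$ with latent class membership $c_i = k$:

$$
\text{LCGA/GBTM:}\quad y_{it} \mid (c_i = k) = \beta_{0k} + \beta_{1k}\, t + \varepsilon_{it}, \qquad \varepsilon_{it} \sim N(0, \sigma^2_{kt})
$$

$$
\text{GMM:}\quad y_{it} \mid (c_i = k) = (\beta_{0k} + b_{0i}) + (\beta_{1k} + b_{1i})\, t + \varepsilon_{it}, \qquad (b_{0i}, b_{1i})^\top \sim N(0, \Sigma_k)
$$

In the first model, everyone in class $k$ shares the same mean trajectory and differs only through residual noise; the equal-error-variance restriction mentioned above corresponds to $\sigma^2_{kt} = \sigma^2$ for all $k$ and $t$. In the second, the random effects $b_{0i}, b_{1i}$ let individuals deviate from their class trajectory, and setting $\Sigma_k = 0$ recovers the first model.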

I believe that LCGA/GBTM or GMM are likely the most appropriate approaches for my aim, but as someone new to this modeling framework, I find the distinctions and assumptions somewhat overwhelming.

Could anyone with experience in trajectory-based clustering kindly provide guidance or point me toward best practices for:

◦ Choosing between LCGA, GBTM, and GMM

◦ Whether using LCA/LPA/FA with repeated measures (as separate indicators) is fundamentally flawed in this context

Any insights, references, or shared experiences would be greatly appreciated.

$\endgroup$
1
  • 1
    $\begingroup$ I have precisely zero experience in trajectory-based clustering. However, all that clustering requires is a notion of distance between instances, and there are a number of such notions of "similarity" between time series. There is in fact quite a bit of literature on "time series clustering". This may just be a case of very similar problems being known under quite different names. $\endgroup$ Commented Jun 2 at 16:34
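
To illustrate the distance-based view in the comment above, here is a minimal sketch in plain Python. It assumes equally spaced, complete measurements, uses squared Euclidean distance between whole trajectories, and clusters with a basic k-means. The simulated data, the function names, and the farthest-first initialisation are all illustrative assumptions, not part of any method named in this thread; a dedicated time-series distance such as dynamic time warping could be swapped in for `sq_dist`.

```python
import random


def simulate_trajectories(n_per_group=20, n_times=10, noise_sd=0.3, seed=0):
    """Simulate two hypothetical groups: flat vs. linearly increasing."""
    rng = random.Random(seed)
    data, labels = [], []
    for group, slope in enumerate((0.0, 1.0)):
        for _ in range(n_per_group):
            row = [slope * t + rng.gauss(0.0, noise_sd) for t in range(n_times)]
            data.append(row)
            labels.append(group)
    return data, labels


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length trajectories."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_trajectory(rows):
    """Pointwise mean of a non-empty list of trajectories."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def farthest_first_centers(data, k):
    """Deterministic initialisation: start from the first trajectory, then
    repeatedly add the trajectory farthest from all chosen centers."""
    centers = [list(data[0])]
    while len(centers) < k:
        nxt = max(data, key=lambda row: min(sq_dist(row, c) for c in centers))
        centers.append(list(nxt))
    return centers


def kmeans(data, k, n_iter=50):
    """Basic Lloyd's algorithm on whole trajectories."""
    centers = farthest_first_centers(data, k)
    assign = None
    for _ in range(n_iter):
        new_assign = [
            min(range(k), key=lambda j: sq_dist(row, centers[j])) for row in data
        ]
        if new_assign == assign:  # assignments stable: converged
            break
        assign = new_assign
        for j in range(k):
            members = [row for row, a in zip(data, assign) if a == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = mean_trajectory(members)
    return assign, centers


data, labels = simulate_trajectories()
assign, centers = kmeans(data, k=2)
print(assign)
```

Unlike LCGA or GMM, this gives hard assignments with no model for the trajectory shape or for within-group variability, so it is better read as a quick exploratory baseline than as a replacement for those models.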

2 Answers

2
$\begingroup$

It sounds like you may be looking for a dynamic factor model, which allows the latent factors to evolve as real-valued time series.

The loading coefficients that must be learned will allow your observed individuals to be clustered into latent sub-groups. This brief set of slides introduces dynamic factors and walks through a basic example of how to fit these models using the R package {mvgam}, but of course there are other software packages available that will estimate these models for you.
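
As a rough sketch of the kind of model meant here (the notation is an assumption for illustration, and the random-walk factor dynamics are only one possible choice), with $y_{it}$ the value for individual $i$ at time $t$ and $J$ latent factors:

$$
y_{it} = \sum_{j=1}^{J} \lambda_{ij} f_{jt} + \varepsilon_{it}, \qquad f_{jt} = f_{j,t-1} + \eta_{jt}
$$

Each individual has a loading vector $(\lambda_{i1}, \dots, \lambda_{iJ})$, and individuals with similar loadings follow similar combinations of the shared latent trends, which is what makes the loadings usable for grouping.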

$\endgroup$
0
$\begingroup$

Here are some review papers on clustering longitudinal data:

https://onlinelibrary.wiley.com/doi/full/10.1002/sim.9917

https://www.tandfonline.com/doi/full/10.1080/03610918.2020.1861464

https://meth.psychopen.eu/index.php/meth/article/view/7143

$\endgroup$
1
  • $\begingroup$ This article might also be helpful to researchers with similar questions (especially table 1). dovepress.com/… $\endgroup$ Commented Jun 3 at 13:12
