Resources for Interrupted time series analysis in R

Question

I am fairly new to R. I have attempted to read up on time series analysis and have already finished

Shumway and Stoffer's Time series analysis and its applications 3rd Edition,
Hyndman's excellent Forecasting: principles and practice
Avril Coghlan's Using R for Time Series Analysis
A. Ian McLeod et al Time Series Analysis with R
Dr. Marcel Dettling's Applied Time Series Analysis

Edit: I'm not sure how best to handle this, but I found a useful resource outside of Cross Validated and wanted to include it here in case anyone stumbles upon this question:

Segmented regression analysis of interrupted time series studies in medication use research

I have a univariate time series of the number of items consumed (count data) measured daily for 7 years. An intervention was applied to the study population at roughly the middle of the time series. This intervention is not expected to produce an immediate effect and the timing of the onset of effect is essentially unknowable.

Using Hyndman's forecast package I have fitted an ARIMA model to the pre-intervention data using auto.arima(). But I am unsure how to use this fit to test whether there has been a statistically significant change in trend and to quantify the size of that change.

# for simplification I will aggregate to monthly counts
# I can later generalize any teachings the community supplies
count <- c(2464, 2683, 2426, 2258, 1950, 1548, 1108,  991, 1616, 1809, 1688, 2168, 2226, 2379, 2211, 1925, 1998, 1740, 1305,  924, 1487, 1792, 1485, 1701, 1962, 2896, 2862, 2051, 1776, 1358, 1110,  939, 1446, 1550, 1809, 2370, 2401, 2641, 2301, 1902, 2056, 1798, 1198,  994, 1507, 1604, 1761, 2080, 2069, 2279, 2290, 1758, 1850, 1598, 1032,  916, 1428, 1708, 2067, 2626, 2194, 2046, 1905, 1712, 1672, 1473, 1052,  874, 1358, 1694, 1875, 2220, 2141, 2129, 1920, 1595, 1445, 1308, 1039,  828, 1724, 2045, 1715, 1840)
# for explanatory purposes
# month <- rep(month.name, 7)
# year <- 1999:2005
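# build a monthly series starting January 1999
count_ts <- ts(count, start = c(1999, 1), frequency = 12)
# pre-intervention (training) window
train_month <- window(count_ts, start = c(1999, 1), end = c(2001, 1))
library(forecast)
arima_train <- auto.arima(train_month)
# lambda = 0 applies a log (Box-Cox) transformation
fit_month <- Arima(train_month, order = c(2,0,0), seasonal = c(1,1,0), lambda = 0)
# forecast 36 months ahead and overlay the observed series in red
plot(forecast(fit_month, 36)); lines(count_ts, col = "red")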

Are there any resources specifically dealing with interrupted time series analysis in R? I have found this dealing with ITS in SPSS but I have not been able to translate this to R.

Do you want to do inference on whether the intervention had a statistically significant effect, or do you want to model the intervention to obtain better forecasts? And could you possibly make the data available?
– Stephan Kolassa, Commented Dec 2, 2015 at 20:08
@StephanKolassa Certainly! My aim is to do inference. I will provide dummy data in an Edit to better illustrate my point.
– dais.johns, Commented Dec 2, 2015 at 20:14
Previous research suggests the intervention effect to be on the scale of a +/- 5% change.
– dais.johns, Commented Dec 2, 2015 at 20:37

Brent Kerby · Accepted Answer · 2015-12-02 23:30:17Z

4

This is known as change-point analysis. The R package changepoint can do this for you: see the documentation here (including references to the literature): http://www.lancs.ac.uk/~killick/Pub/KillickEckley2011.pdf
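
As a minimal sketch of what this looks like in practice, the following uses `cpt.mean()` from the changepoint package to search for at most one change in mean. The series is synthetic (an assumption made purely for illustration, with a known shift after observation 60); it is not the data from the question.

```r
library(changepoint)

set.seed(1)
# Synthetic series (illustration only): the mean shifts from 0 to 3 after observation 60
y <- c(rnorm(60, mean = 0), rnorm(60, mean = 3))

# AMOC = "at most one change"; the default test statistic assumes normal data
fit <- cpt.mean(y, method = "AMOC")

cp_location   <- cpts(fit)            # index of the last observation before the change
segment_means <- param.est(fit)$mean  # estimated mean of each segment
print(cp_location)
print(segment_means)
```

For seasonal count data such as the series in the question, the raw counts would probably need deseasonalising first, and `cpt.meanvar()` may be worth a look if the variance can change as well.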

Thank you. I am looking into this. As far as I can tell, this would identify possible change points in the series but would not analyze the difference in trend. I apologize if this assumption is incorrect; I have only been able to review the package superficially.
– dais.johns, Commented Dec 3, 2015 at 13:22
After identifying the change point, you can split the data into two time series (before and after the change point) and estimate the parameters of the two time series separately. A couple more suggestions: as your data has a strong seasonal trend, this should be removed prior to the change-point analysis; and if you are going to use an ARIMA model, then differencing should also be performed prior to the change-point analysis (or, alternatively, you'll need to use some more specialized procedure).
– Brent Kerby, Commented Dec 3, 2015 at 15:59
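
A rough base-R sketch of the workflow described in this comment, under loudly stated assumptions: the series is synthetic, the change point is taken as known (January 2003, hypothetical), and seasonality is removed by subtracting each calendar month's average. That adjustment only behaves this cleanly because both segments cover whole years.

```r
set.seed(42)
n_years <- 8
month   <- rep(1:12, times = n_years)

# Synthetic monthly series: seasonal cycle plus a level drop of 200 after four years
season <- 300 * cos(2 * pi * month / 12)
level  <- rep(c(2000, 1800), each = 12 * n_years / 2)
y <- ts(level + season + rnorm(12 * n_years, sd = 50),
        start = c(1999, 1), frequency = 12)

# Step 1: remove the seasonal pattern by subtracting calendar-month means
month_id <- as.integer(cycle(y))
adj <- y - ave(as.numeric(y), month_id)

# Step 2: split at the (assumed known) change point
pre  <- window(adj, end   = c(2002, 12))
post <- window(adj, start = c(2003, 1))

# Step 3: estimate each segment separately and compare
level_change <- mean(post) - mean(pre)
print(c(pre = mean(pre), post = mean(post), change = level_change))
```

If an ARIMA model is the goal, `diff(y, lag = 12)` would be the usual seasonal differencing step. Note that seasonal differencing turns a level shift into a temporary pulse, which changes what the change-point search is looking for.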
Thank you for your suggestions. I will attempt to implement them and will mark this as "answered" if it solves the problem.
– dais.johns, Commented Dec 3, 2015 at 20:11


KT12 · Accepted Answer · 2018-09-10 17:18:12Z

2

I would suggest a repeated measures hierarchical model. This method should provide robust results since each individual will act as his/her own control. Try checking out this link from UCLA.

answered Dec 3, 2015 at 9:35 by kblansit; edited Sep 10, 2018 at 17:18 by KT12


Mel G · Accepted Answer · 2022-08-04 19:47:11Z

2

A couple of professors from ASU have a nice resource for interrupted time series in R that I found helpful. They provide reproducible examples for modeling the effect of a policy intervention.

The book is called Foundations of Program Evaluation: Regression Tools for Impact Analysis, and it is available for free on GitHub.
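
For readers who want the gist without following the link: interrupted time series is often modelled as a segmented regression with a baseline trend, a level change at the intervention, and a slope change afterwards. The sketch below is a minimal illustration on synthetic data, not code taken from the book; the variable names, the effect sizes, and the intervention month are all hypothetical.

```r
set.seed(123)
n  <- 84   # 7 years of monthly observations
t0 <- 42   # hypothetical intervention after month 42

time       <- 1:n
post       <- as.integer(time > t0)   # 0 before, 1 after the intervention
time_since <- pmax(0, time - t0)      # months elapsed since the intervention

# Synthetic outcome: baseline trend +2/month, level drop of 100, slope change of -3
y <- 2000 + 2 * time - 100 * post - 3 * time_since + rnorm(n, sd = 30)

# Segmented regression:
#   time       -> pre-intervention slope
#   post       -> immediate change in level
#   time_since -> change in slope after the intervention
fit <- lm(y ~ time + post + time_since)
est <- coef(fit)
print(summary(fit)$coefficients)
```

Residual autocorrelation is common in series like this, so it would be worth checking `acf(resid(fit))` before trusting the standard errors.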

answered Aug 4, 2022 at 19:38 by Mel G; edited Aug 4, 2022 at 19:47

While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From Review
– Karolis Koncevičius, Commented Aug 4, 2022 at 20:03
@KarolisKoncevičius Fair enough. I will try to think of what I can add besides the link. Otherwise, I will delete the answer.
– Mel G, Commented Aug 4, 2022 at 20:20
This is a good resource.
– SanMelkote, Commented Feb 23, 2024 at 14:39


Jonas Lindeløv · Accepted Answer · 2020-01-15 08:20:26Z

For a Bayesian approach, you can use mcp to fit a Poisson or Binomial model (because you have counts from fixed-interval periods) with autoregression applied to the residuals (in the log space). Then compare a two-segment model to a one-segment model using cross-validation.

Before we start, note that for this dataset, this model does not fit well and cross-validation looks unstable. So I would refrain from using the following in high-stakes scenarios, but it illustrates a general approach:

# Fit the change point model
# (df is assumed to be a data frame with columns `count` and `year`)
library(mcp)
model_full = list(
  count ~ 1 + ar(1),  # intercept and AR(1)
  ~ 1  # New intercept
)
fit_full = mcp(model_full, data = df, family = poisson(), par_x = "year")


# Fit the null model
model_null = list(
  count ~ 1 + ar(1)  # just a stable AR(1)
)
fit_null = mcp(model_null, data = df, family = poisson(), par_x = "year")

# Compare predictive performance using LOO cross-validation
fit_full$loo = loo(fit_full)
fit_null$loo = loo(fit_null)
loo::loo_compare(fit_full$loo, fit_null$loo)

For the present dataset, this results in

       elpd_diff se_diff
model2    0.0       0.0 
model1 -459.1      64.3

I.e., an elpd_diff/se_diff ratio of around 7 in favor of the null model (no change). Possible improvements include:

modeling the periodical trend using sin() or cos().
adding prior information about the likely location of the change, e.g., prior = list(cp_1 = "dnorm(1999.8, 0.5)").
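
The first suggestion can be illustrated outside mcp. The following is a plain frequentist Poisson GLM in base R (a swapped-in technique, not the mcp model above) with one sine/cosine pair for the yearly cycle and a step term at a hypothetical, assumed-known change point. The data are simulated with a drop of roughly 5% built in.

```r
set.seed(7)
year <- 1999 + (0:83) / 12            # decimal years in monthly steps
post <- as.integer(year >= 2002.5)    # hypothetical change point at mid-2002

# Simulated counts: yearly cycle plus a log-rate shift of -0.05 after the change
mu    <- exp(7.4 + 0.3 * cos(2 * pi * year) - 0.05 * post)
count <- rpois(length(year), mu)

df <- data.frame(
  year  = year,
  count = count,
  post  = post,
  s1    = sin(2 * pi * year),   # harmonic terms with a period of one year
  c1    = cos(2 * pi * year)
)

fit <- glm(count ~ s1 + c1 + post, family = poisson(), data = df)

# Multiplicative change in the expected count after the change point
rate_ratio <- exp(coef(fit)[["post"]])
print(rate_ratio)
```

Unlike mcp, this treats the change point as fixed rather than estimating it, and it ignores autocorrelation and overdispersion, both of which real daily or monthly counts are likely to show.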

Read more about modeling autoregression, doing model comparison, and setting priors on the mcp website. Disclosure: I am the developer of mcp.
