16
$\begingroup$

I am fairly new to R. I have attempted to read up on time series analysis and have already finished

  1. Shumway and Stoffer's Time series analysis and its applications 3rd Edition,
  2. Hyndman's excellent Forecasting: principles and practice
  3. Avril Coghlan's Using R for Time Series Analysis
  4. A. Ian McLeod et al Time Series Analysis with R
  5. Dr. Marcel Dettling's Applied Time Series Analysis

Edit: I'm not sure how to handle this, but I found a useful resource outside of Cross Validated. I wanted to include it here in case anyone stumbles upon this question.

Segmented regression analysis of interrupted time series studies in medication use research

I have a univariate time series of the number of items consumed (count data) measured daily for 7 years. An intervention was applied to the study population at roughly the middle of the time series. This intervention is not expected to produce an immediate effect and the timing of the onset of effect is essentially unknowable.

Using Hyndman's forecast package I have fitted an ARIMA model to the pre-intervention data using auto.arima(). But I am unsure of how to use this fit to answer whether there has been a statistically significant change in trend and quantify the amount.

# for simplification I will aggregate to monthly counts
# I can later generalize any teachings the community supplies
count <- c(2464, 2683, 2426, 2258, 1950, 1548, 1108,  991, 1616, 1809, 1688, 2168, 2226, 2379, 2211, 1925, 1998, 1740, 1305,  924, 1487, 1792, 1485, 1701, 1962, 2896, 2862, 2051, 1776, 1358, 1110,  939, 1446, 1550, 1809, 2370, 2401, 2641, 2301, 1902, 2056, 1798, 1198,  994, 1507, 1604, 1761, 2080, 2069, 2279, 2290, 1758, 1850, 1598, 1032,  916, 1428, 1708, 2067, 2626, 2194, 2046, 1905, 1712, 1672, 1473, 1052,  874, 1358, 1694, 1875, 2220, 2141, 2129, 1920, 1595, 1445, 1308, 1039,  828, 1724, 2045, 1715, 1840)
# for explanatory purposes
# month <- rep(month.name, 7)
# year <- 1999:2005
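# start must be passed as start = c(year, period); frequency = 12 marks the series as monthly
count_ts <- ts(count, start = c(1999, 1), frequency = 12)
train_month <- window(count_ts, start = c(1999, 1), end = c(2001, 1))
library(forecast)
# automatic order selection on the pre-intervention window
arima_train <- auto.arima(train_month)
# manually specified seasonal ARIMA on the log scale (lambda = 0)
fit_month <- Arima(train_month, order = c(2,0,0), seasonal = c(1,1,0), lambda = 0)
# forecast 36 months ahead and overlay the observed series in red
plot(forecast(fit_month, 36)); lines(count_ts, col = "red")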

Are there any resources specifically dealing with interrupted time series analysis in R? I have found this dealing with ITS in SPSS but I have not been able to translate this to R.

$\endgroup$
5
  • $\begingroup$ Do you want to do inference on whether the intervention had a statistically significant effect, or do you want to model the intervention to obtain better forecasts? And could you possibly make the data available? $\endgroup$ Commented Dec 2, 2015 at 20:08
  • $\begingroup$ @StephanKolassa Certainly! My aim is to do inference. I will provide dummy data in an Edit to better illustrate my point. $\endgroup$ Commented Dec 2, 2015 at 20:14
  • $\begingroup$ @StephanKolassa Data provided to the best of my abilities. $\endgroup$ Commented Dec 2, 2015 at 20:35
  • $\begingroup$ Previous research suggests the intervention effect to be on the scale of +/- 5% change. $\endgroup$ Commented Dec 2, 2015 at 20:37
  • $\begingroup$ @StephanKolassa Provided actual usable data $\endgroup$ Commented Dec 2, 2015 at 22:51

4 Answers

4
$\begingroup$

This is known as change-point analysis. The R package changepoint can do this for you: see the documentation here (including references to the literature): http://www.lancs.ac.uk/~killick/Pub/KillickEckley2011.pdf

$\endgroup$
3
  • $\begingroup$ Thank you. I am looking into this. As far as I can tell, this would calculate possible change points in the series but would not analyze the difference in trend. I apologize if this assumption is incorrect; I have not been able to review the package more than superficially. $\endgroup$ Commented Dec 3, 2015 at 13:22
  • $\begingroup$ After identifying the change point, you can split the data into two time series (before and after the change point) and estimate the parameters of the two time series separately. A couple more suggestions: as your data has strong seasonal trend, this should be removed prior to the change-point analysis; and if you are going to use an ARIMA model, then differencing should also be performed prior to the change-point analysis (or, alternatively, you'll need to use some more specialized procedure). $\endgroup$ Commented Dec 3, 2015 at 15:59
  • $\begingroup$ Thank you for your suggestions. I will attempt to implement them and will mark this as "answered" if it solves the problem. $\endgroup$ Commented Dec 3, 2015 at 20:11
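To make the recipe in the comments above concrete (remove the seasonal pattern, locate a single change point, then summarize each segment separately), here is a minimal base R sketch. It is an illustration under stated assumptions, not the changepoint package's algorithm: it subtracts calendar-month means and then does a brute-force least-squares search for one shift in the mean. The 12-month minimum segment length (`min_seg`) is an arbitrary choice made for this example.

```r
# Monthly counts from the question (84 months, Jan 1999 to Dec 2005)
count <- c(2464, 2683, 2426, 2258, 1950, 1548, 1108,  991, 1616, 1809, 1688, 2168,
           2226, 2379, 2211, 1925, 1998, 1740, 1305,  924, 1487, 1792, 1485, 1701,
           1962, 2896, 2862, 2051, 1776, 1358, 1110,  939, 1446, 1550, 1809, 2370,
           2401, 2641, 2301, 1902, 2056, 1798, 1198,  994, 1507, 1604, 1761, 2080,
           2069, 2279, 2290, 1758, 1850, 1598, 1032,  916, 1428, 1708, 2067, 2626,
           2194, 2046, 1905, 1712, 1672, 1473, 1052,  874, 1358, 1694, 1875, 2220,
           2141, 2129, 1920, 1595, 1445, 1308, 1039,  828, 1724, 2045, 1715, 1840)
count_ts <- ts(count, start = c(1999, 1), frequency = 12)

# Step 1: remove seasonality by subtracting each calendar month's mean
month_id <- as.integer(cycle(count_ts))
deseason <- as.numeric(count_ts) - ave(as.numeric(count_ts), month_id)

# Step 2: try every admissible split and keep the one with the smallest
# total within-segment sum of squares (a single shift in the mean)
n <- length(deseason)
min_seg <- 12  # ASSUMPTION: each segment must contain at least one year
candidates <- min_seg:(n - min_seg)
rss <- sapply(candidates, function(k) {
  left  <- deseason[1:k]
  right <- deseason[(k + 1):n]
  sum((left - mean(left))^2) + sum((right - mean(right))^2)
})
cp <- candidates[which.min(rss)]  # last month of the first segment

# Step 3: summarize each segment separately
seg_means <- c(before = mean(deseason[1:cp]),
               after  = mean(deseason[(cp + 1):n]))
cp_date <- time(count_ts)[cp]     # change point as a decimal year
print(cp_date)
print(seg_means)
```

Keep in mind that a search like this always returns some best split, even when nothing has changed, so `cp` on its own is not evidence of an intervention effect.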
2
$\begingroup$

I would suggest a repeated measures hierarchical model. This method should provide robust results since each individual will act as his/her own control. Try checking out this link from UCLA.

$\endgroup$
0
2
$\begingroup$

A couple professors from ASU have a nice resource for interrupted time series in R that I found helpful. They provide reproducible examples for modeling the effect of a policy intervention.

The book is called Foundations of Program Evaluation: Regression Tools for Impact Analysis, and it is available for free on github.
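For readers who want the gist without following the link: interrupted time series designs are commonly analysed with a segmented regression that has a pre-intervention trend, a level change at the intervention, and a change in slope afterwards. The sketch below applies that idea to the monthly counts from the question. The intervention month (`intervention_month = 43`, the series midpoint), the month dummies for seasonality, and the quasi-Poisson family are assumptions made for illustration; they are not taken from the book or from the question.

```r
# Monthly counts from the question (84 months, Jan 1999 to Dec 2005)
count <- c(2464, 2683, 2426, 2258, 1950, 1548, 1108,  991, 1616, 1809, 1688, 2168,
           2226, 2379, 2211, 1925, 1998, 1740, 1305,  924, 1487, 1792, 1485, 1701,
           1962, 2896, 2862, 2051, 1776, 1358, 1110,  939, 1446, 1550, 1809, 2370,
           2401, 2641, 2301, 1902, 2056, 1798, 1198,  994, 1507, 1604, 1761, 2080,
           2069, 2279, 2290, 1758, 1850, 1598, 1032,  916, 1428, 1708, 2067, 2626,
           2194, 2046, 1905, 1712, 1672, 1473, 1052,  874, 1358, 1694, 1875, 2220,
           2141, 2129, 1920, 1595, 1445, 1308, 1039,  828, 1724, 2045, 1715, 1840)
n <- length(count)
intervention_month <- 43  # ASSUMPTION: hypothetical start, roughly mid-series

its <- data.frame(
  count      = count,
  time       = 1:n,                                             # overall trend
  post       = as.integer(1:n >= intervention_month),           # level change
  time_since = pmax(0, 1:n - intervention_month + 1),           # slope change
  month      = factor(rep(month.abb, length.out = n), levels = month.abb)
)

# Quasi-Poisson regression: log link for counts, dispersion estimated from data
fit_its <- glm(count ~ time + post + time_since + month,
               family = quasipoisson(), data = its)

# Estimates, standard errors, and p-values for the two intervention terms
effects <- summary(fit_its)$coefficients[c("post", "time_since"), ]
# Same terms expressed as approximate percent changes
pct_change <- 100 * (exp(coef(fit_its)[c("post", "time_since")]) - 1)
print(effects)
print(pct_change)
```

The coefficients are on the log scale, so `pct_change` reads as a percent change in level (`post`) and in the monthly trend (`time_since`). These standard errors ignore any autocorrelation left in the residuals, so it is worth inspecting `acf(residuals(fit_its))` before trusting the p-values.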

$\endgroup$
3
  • $\begingroup$ While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From Review $\endgroup$ Commented Aug 4, 2022 at 20:03
  • $\begingroup$ @KarolisKoncevičius fair enough. I will try to think of what I can add besides the link. Otherwise, I will delete the answer. $\endgroup$ Commented Aug 4, 2022 at 20:20
  • $\begingroup$ This is a good resource. $\endgroup$ Commented Feb 23, 2024 at 14:39
1
$\begingroup$

For a Bayesian approach, you can use mcp to fit a Poisson or Binomial model (because you have counts from fixed-interval periods) with autoregression applied to the residuals (in the log space). Then compare a two-segment model to a one-segment model using cross-validation.

Before we start, note that for this dataset, this model does not fit well and cross-validation looks unstable. So I would refrain from using the following in high-stakes scenarios, but it illustrates a general approach:

# Fit the change point model
# (df is assumed to be a data frame with columns `count` and `year`)
library(mcp)
model_full = list(
  count ~ 1 + ar(1),  # intercept and AR(1)
  ~ 1  # New intercept
)
fit_full = mcp(model_full, data = df, family = poisson(), par_x = "year")


# Fit the null model
model_null = list(
  count ~ 1 + ar(1)  # just a stable AR(1)
)
fit_null = mcp(model_null, data = df, family = poisson(), par_x = "year")

# Compare predictive performance using LOO cross-validation
fit_full$loo = loo(fit_full)
fit_null$loo = loo(fit_null)
loo::loo_compare(fit_full$loo, fit_null$loo)

For the present dataset, this results in

       elpd_diff se_diff
model2    0.0       0.0 
model1 -459.1      64.3 

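loo_compare() labels the models in argument order, so model1 is the change point model and model2 is the null model. I.e., an elpd_diff/se_diff ratio of around 7 in favor of the null model (no change). Possible improvements include: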

  • modeling the periodic (seasonal) trend using sin() or cos().
  • adding prior information about the likely location of the change, e.g., prior = list(cp_1 = "dnorm(1999.8, 0.5)").
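The first suggestion, a sin()/cos() term for the yearly cycle, can be tried outside mcp first to see how much of the variation it absorbs. The following is a rough base R sketch using a plain Poisson glm() on the monthly counts from the question, with a 12-month period assumed. It is not mcp syntax and it does not include a change point.

```r
# Monthly counts from the question (84 months, Jan 1999 to Dec 2005)
count <- c(2464, 2683, 2426, 2258, 1950, 1548, 1108,  991, 1616, 1809, 1688, 2168,
           2226, 2379, 2211, 1925, 1998, 1740, 1305,  924, 1487, 1792, 1485, 1701,
           1962, 2896, 2862, 2051, 1776, 1358, 1110,  939, 1446, 1550, 1809, 2370,
           2401, 2641, 2301, 1902, 2056, 1798, 1198,  994, 1507, 1604, 1761, 2080,
           2069, 2279, 2290, 1758, 1850, 1598, 1032,  916, 1428, 1708, 2067, 2626,
           2194, 2046, 1905, 1712, 1672, 1473, 1052,  874, 1358, 1694, 1875, 2220,
           2141, 2129, 1920, 1595, 1445, 1308, 1039,  828, 1724, 2045, 1715, 1840)

# One sine/cosine pair with a 12-month period (ASSUMPTION: yearly cycle)
t <- seq_along(count)
harm <- data.frame(count = count,
                   s1 = sin(2 * pi * t / 12),
                   c1 = cos(2 * pi * t / 12))

fit_flat <- glm(count ~ 1,       family = poisson(), data = harm)  # no seasonality
fit_harm <- glm(count ~ s1 + c1, family = poisson(), data = harm)  # harmonic terms

# Share of the intercept-only deviance accounted for by the harmonic pair
dev_explained <- 1 - deviance(fit_harm) / deviance(fit_flat)
print(dev_explained)
```

If a single harmonic leaves a clear pattern in the residuals, a second pair with a 6-month period is the usual next thing to try.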

Read more about modeling autoregression, doing model comparison, and setting priors on the mcp website. Disclosure: I am the developer of mcp.

$\endgroup$
