$\begingroup$

I am trying to run a difference-in-differences analysis in R. My data are not a panel, so I rely on a TWFE model in which groups of individuals are either treated or untreated in a given period. Treatment delivery is staggered, such that

My basic event study model would look like the following, in R's syntax, using fake data from the mtcars dataset for reproducibility:

library(fixest)
library(tidyverse)
set.seed(123)

# Fake event time: years relative to first treatment (0 = first treated year)
mtcars$time_til <- sample(-10:10, nrow(mtcars), replace = TRUE)

# Treatment indicator and respondent-group factor
mtcars <- mtcars |>
  mutate(treat = time_til >= 0,
         am_factor = factor(am))

# Group and control terms interacted with event time; vs as fixed effect,
# standard errors clustered on cyl
feols(fml = mpg ~ am_factor + am_factor:time_til + disp +
        disp:time_til | vs, vcov = ~ cyl, data = mtcars)

where mpg is the continuous outcome variable and time_til is the number of years until treatment, such that 0 is the first year of treatment, positive numbers count consecutive treatment years after that first year, and negative numbers count years before treatment. disp is a continuous control variable interacted with time_til, and am_factor is my respondent-group variable. vs stands in for a country-year fixed effect, which allows estimation of the DiD even though the data are cross-sectional and contain no repeat observations of the same unit. My treatment is assumed to be non-absorbing. cyl stands in for the continent-year variable on which I cluster.

The estimate of interest is the difference in the DiD effect (post-treatment minus pre-treatment) between groups of respondents at the different levels of am_factor. I don't care about the overall effect of time_til (treatment) on mpg; I care about how much larger the effect is for some levels of the factor interacted with treatment than for others. Similarly, I want to check that members of the different am_factor levels follow parallel trends before treatment. To be clear, I want to see how the average treatment effect differs between units in the different factor levels.
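To make the estimand concrete, here is a minimal sketch of the two quantities described above, using the same fake mtcars data. It is only a pooled pre/post illustration with fixest, not a heterogeneity-robust staggered-adoption estimator. The names `df`, `post`, `pooled`, and `pre` are introduced here for illustration and are not part of the original model.

```r
library(fixest)

set.seed(123)
df <- mtcars
# Fake event time, as in the question: 0 is the first treated year
df$time_til  <- sample(-10:10, nrow(df), replace = TRUE)
df$post      <- as.integer(df$time_til >= 0)  # 1 from the first treated year on (assumed coding)
df$am_factor <- factor(df$am)

# Pooled contrast: the post:am_factor1 coefficient is the difference between
# groups in the post-minus-pre change in mpg
pooled <- feols(mpg ~ post * am_factor + disp | vs, vcov = ~cyl, data = df)

# Pre-trend check on pre-treatment rows only: the time_til:am_factor1
# coefficient is the difference in linear pre-period slopes between groups
pre <- feols(mpg ~ time_til * am_factor + disp | vs, vcov = ~cyl,
             data = subset(df, time_til < 0))

# Side-by-side table of both fits
etable(pooled, pre)
```

In this sketch, the interaction coefficient in `pooled` is the between-group difference in the average effect, and an interaction coefficient near zero in `pre` would be consistent with parallel pre-treatment trends across am_factor levels. With random fake data, neither number is meaningful.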

I am wondering which R package and difference-in-differences estimator can handle this kind of estimation approach. I have tried several, and I know many different DiD approaches are in use (LP-DiD, imputation [Borusyak et al.], Gardner, etc.), but I have yet to find one that appears to satisfy all of the conditions of my empirical question. To my understanding (and do correct me if I am mistaken), some packages do not accept control variables, others cannot handle an interaction term with the treatment to estimate treatment effects by group, others cannot handle cross-sectional data at all and require panel data, and others still cannot handle staggered treatment adoption.

Can anyone recommend a DiD estimator and R package equipped for this task? Have I made a mistake in specifying the model?

$\endgroup$
  • $\begingroup$ I think it's difficult to estimate treatment effects by time groups without strong assumptions, so you might have to give that up. Otherwise, I think most of the modern packages handle control variables, cross-sectional data, and staggered treatment. Maybe start here: kylebutts.com/papers/did2s $\endgroup$ Commented May 14, 2024 at 9:58
  • $\begingroup$ The use of R is incidental here. Please focus on the question of what estimators you need. $\endgroup$ Commented May 23, 2024 at 14:56
