How to find confidence intervals for binary outcome probability?

Question

I'm doing stats for medical chart review research. The binary outcomes vary in probability from less than 0.05 to greater than 0.5 depending on risk factors. For relatively more common outcomes like bronchopulmonary dysplasia (BPD), more than roughly 20% of very preterm neonates develop the outcome. The continuous predictor is hours after birth at which first feed was given. The overall goal is to assess for explanatory associations between the continuous explanatory variable "time (hours after birth) at which first enteral feed is given" and subsequent outcome variables, namely necrotizing enterocolitis, sepsis, retinopathy of prematurity, and bronchopulmonary dysplasia (outcomes are composited with mortality to prevent survivorship bias). This assessment is made using multivariate logistic regression with potential confounders as covariates. However, prior to the multivariate model, our goal is to visually describe the univariate relationship between time until first feed and outcomes.

Initially, I used a GAM to assess whether the relationship between the continuous explanatory variable and the adjusted log odds of the outcome is linear, because I read that this linearity is an assumption of logistic regression. The GAM confidence intervals were asymmetric about the curve, with more variance toward p of 0.5 and less toward the extremes, which seemed consistent with my understanding of an equation I read: variance is maximized when p = 0.5 because it is proportional to p(1-p). However, the GAM curve was not very descriptive of the empirical probability (see image below, where the GAM plot is the bottom panel; note how the non-linearity of the univariate relationship was suppressed). My team then suggested I use LOESS to describe the relationship and variance.
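For reference, the variance relationship mentioned above can be written out as follows. This is the standard Bernoulli/binomial result, with $Y$ a single binary outcome, $p$ its probability, and $\hat p$ a simple proportion from $n$ independent observations (the symbols are notation introduced here, not taken from the plots). Note that $p/(1-p)$ is the odds, which keeps increasing with $p$ and is not maximized at 0.5.

```latex
\operatorname{Var}(Y) = p(1-p),
\qquad
\frac{d}{dp}\,p(1-p) = 1 - 2p = 0 \iff p = \tfrac{1}{2},
\qquad
\operatorname{Var}(\hat p) = \frac{p(1-p)}{n}.
```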

So I made LOESS plots of empirical probability on y-axis by time until first feed on x-axis. In R, I'm using the geom_smooth(method = "loess") function to make a LOESS plot and display local/pointwise 95% CIs around the smoothed probability.

I then read that the built-in 95% confidence interval in geom_smooth (se = TRUE) uses a global estimate of variance (with local leverage), which I thought might be invalid given that the outcome is binary (so the variance depends on the probability, which varies with x). This initial LOESS plot is Panel B of the attached image. Because I was concerned it was invalid, I then plotted bootstrapped LOESS CIs instead (Panel D of the attached image). However, I noticed that the bootstrapped CIs and the analytical CIs (from se = TRUE) are very similar to each other. Is either of these approaches valid?

Edit: Per the suggestion of EdM in the comments, I made a plot with splines as well. I used univariate logistic regression with restricted cubic splines (RCS), with boundary knots at the 5th and 95th percentiles and internal knots at the 35th and 65th percentiles. This RCS plot is also attached in the right panel of the image below. This plot appears to preserve the non-linearity of the univariate relationship (so it's more descriptive) while also having confidence intervals that are asymmetric (greater toward 0.5 and lesser toward 0 and 1).
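One likely reason for that asymmetry is that intervals from a logistic model are usually built symmetrically on the log-odds scale and then back-transformed to probabilities. The following is a minimal sketch of that idea in Python (not the R code used for the plots); the function names and the standard error of 0.5 on the logit scale are hypothetical values chosen only for illustration.

```python
import math

def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1 - p))

def expit(x):
    """Inverse of the logit: maps log odds back to a probability."""
    return 1 / (1 + math.exp(-x))

def back_transformed_ci(p_hat, se_logit, z=1.96):
    """Symmetric interval on the logit scale, mapped back to the probability scale."""
    eta = logit(p_hat)
    return expit(eta - z * se_logit), expit(eta + z * se_logit)

# Hypothetical fitted probabilities, all with the same (assumed) logit-scale standard error.
for p_hat in (0.05, 0.20, 0.50, 0.80):
    lo, hi = back_transformed_ci(p_hat, se_logit=0.5)
    print(f"p_hat={p_hat:.2f}  below={p_hat - lo:.3f}  above={hi - p_hat:.3f}")
```

In this sketch the interval is symmetric only at 0.5; elsewhere the side facing 0.5 is wider, and the bounds can never leave (0, 1).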

When I looked at a statistics paper (doi: 10.1214/aos/1015362189), the methods discussed were totally different (Wilson, Jeffreys, Agresti-Coull, Clopper-Pearson), and I don't currently understand them. I'm not sure whether these are applicable to finding pointwise CIs. My question is whether the LOESS built-in analytical CIs or the LOESS bootstrapped CIs are valid at all, or whether I need to learn other methods in order to get valid CIs for the probability of a binary outcome.
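For orientation, the four named methods are intervals for a single binomial proportion (k events out of n observations) with no predictor involved. As a minimal sketch, here is the Wilson score interval in Python; the function name and the example counts are hypothetical and are not taken from the study data.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for one binomial proportion (no covariates)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Hypothetical example: 10 events among 50 infants.
lower, upper = wilson_interval(10, 50)
print(f"Wilson 95% interval: ({lower:.3f}, {upper:.3f})")
```

Because the interval describes one pooled proportion, it does not by itself give a pointwise band along a continuous predictor.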

Welcome to Cross Validated! The 4 types of confidence intervals you note in the last paragraph might not be very relevant to your situation. Please edit the question to provide some more details about the binary outcome (in particular, the ranges of typical probabilities), the continuous predictor variable, and the goal of your project. It seems that what you are interested in is how well the continuous variable can predict the binary outcome. If that's the case, you might be better served by a parametric logistic regression model that models the continuous variable with a spline.
– EdM, Commented Oct 12 at 13:58
@EdM Thank you for the response! I edited the question with more details about the research project. The binary outcomes vary in probability from less than 0.05 to greater than 0.5 depending on risk factors. For relatively more common outcomes like bronchopulmonary dysplasia (BPD), more than roughly 20% of very preterm neonates develop the outcome. The continuous predictor is hours after birth at which first feed was given. We are interested in how well this variable predicts binary outcomes. Thank you for the suggestion to use parametric logistic regression with splines!
– syndromeofme, Commented Oct 12 at 17:07
@EdM Initially I used a logistic GAM to assess for non-linearity of the logit of the outcome vs. time to feed. If there was visible curvature, I used logistic regression with RCS to test for nonlinearity with a nested LRT (nonlinearity was insignificant after covariates were added). If I'm understanding you, I can use logistic regression with splines to display the univariate relationship of outcome vs. time to feed, along with CIs respecting the binomial likelihood? I've attached to the question an image with each type of plot (LOESS with analytic CIs, LOESS with bootstrap CIs, univariate logistic GAM, univariate logistic regression with splines).
– syndromeofme, Commented Oct 12 at 17:16

EdM · Accepted Answer · 2025-10-12 19:20:30Z

"[T]o visually describe the univariate* relationship between time until first feed and outcomes," any of the plots you show could be OK. Chapter 7 of An Introduction to Statistical Learning includes LOESS, a spline and a generalized additive model (GAM) as ways to move beyond linearity. Note that a regression spline is just one type of GAM, so you might want to see how modeling via the GAM function you used differed from a spline.

The confidence intervals (CI) in these types of plots represent the variance around the point estimates, variance arising from uncertainty in the parameter values. In your case they don't include the inherent binomial variance around those point estimates, just like CI in linear regression don't include the residual variance that increases the uncertainty in any single future observation (represented by prediction intervals). See this page for the distinction between confidence intervals and prediction intervals.
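The distinction can be illustrated with a minimal Python sketch for a simple proportion. The function names, the estimate of 0.20, and the sample sizes are hypothetical; the point is only that the uncertainty in the estimate shrinks with more data while the spread of an individual binary outcome does not.

```python
import math

def estimate_se(p_hat, n):
    """Standard error of an estimated probability: what a confidence interval reflects."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

def outcome_sd(p_hat):
    """Standard deviation of a single binary (Bernoulli) outcome: the inherent binomial variation."""
    return math.sqrt(p_hat * (1 - p_hat))

p_hat = 0.20  # hypothetical estimated probability
for n in (50, 500, 5000):
    print(f"n={n:5d}  SE of estimate={estimate_se(p_hat, n):.4f}  SD of one outcome={outcome_sd(p_hat):.4f}")
```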

The details of the CI in this first step of your analysis don't matter much, anyway, as you presumably won't be trying to perform inference until you have built the complete multiple regression model(s).

Be warned that these types of single-predictor plots don't necessarily represent what's found when other predictors are taken into account. In a comment, for example, you note that the apparent non-linearity seen in the single-predictor plot became insignificant when other predictors were taken into account. In that case, you should trust the multiple-regression result, not the single-predictor result. This is a particular problem in binary regression, in which omission of any outcome-associated predictor can lead to bias in estimates for included predictors. See this page.
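The bias from an omitted outcome-associated predictor (often described as non-collapsibility of the odds ratio) can be seen with plain arithmetic. In this minimal Python sketch every number is made up: an exposure has the same odds ratio of 3 within each of two equally sized strata of an omitted covariate, and the covariate is balanced across exposure groups, so it is not a confounder.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

def prob_from_odds(o):
    """Convert odds back to a probability."""
    return o / (1 + o)

CONDITIONAL_OR = 3.0          # hypothetical odds ratio for the exposure within each stratum
BASELINE_RISK = (0.10, 0.60)  # hypothetical unexposed risk in the two strata of the omitted covariate

# Risk among the exposed in each stratum, applying the same conditional odds ratio.
exposed_risk = [prob_from_odds(CONDITIONAL_OR * odds(p0)) for p0 in BASELINE_RISK]

# Average over the two equally sized strata, i.e. what a model omitting the covariate sees.
marginal_unexposed = sum(BASELINE_RISK) / len(BASELINE_RISK)
marginal_exposed = sum(exposed_risk) / len(exposed_risk)

marginal_or = odds(marginal_exposed) / odds(marginal_unexposed)
print(f"conditional OR = {CONDITIONAL_OR:.2f}, marginal OR = {marginal_or:.2f}")
```

Even without confounding, the odds ratio from the model that leaves out the covariate is pulled toward 1 relative to the within-stratum value.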

I'm a little worried about your use of "time (hours after birth) at which first enteral feed is given" as a predictor in this context. I suspect (without knowing much about this) that a premature neonate recognized as very high risk at birth would undergo extra early procedures that might delay enteral feeding. If that's the case, then you end up with a type of survivorship bias. Make sure, based on your understanding of the subject matter, that you don't have that problem.

*Generally accepted (but not universal) practice is to reserve "multivariate" for having multiple outcomes and to use "multiple regression" for a single outcome modeled as a function of multiple predictors. See this page. Thus you do have a "multivariate" study with your multiple types of outcomes, but a model only of (BPD + death) with multiple predictors ideally is called a "multiple logistic regression" instead of "multivariate."

You're amazing! You're right, it's just a descriptive goal here. However, my research group has used binary-outcome LOESS CIs to determine groups used for inferential statistics, i.e. the point at which the lower bound of the pointwise LOESS CI crosses 0/x-intercept is used to delineate groups for comparison. For example, LOESS CIs are used to create a categorical factor of group 1 vs. group 2, which is then included in a multiple regression model.
– syndromeofme, Commented Oct 12 at 23:01
In our models we include first-week medication use (pressors) as a proxy for illness severity. The problem: outcomes such as sepsis occur in week 1, so the outcome may precede the medication. Median first feed is < 24 hours, so the severity observation period ends days after the exposure of interest. We think this is problematic because medication use may be a mediator (feed time -> meds -> outcome) or a consequence of the outcome (feed time -> outcome -> meds). Our group is planning to collect data in order to create severity-of-illness markers that are defined earlier, e.g. nSOFA (by 72 hours) or CRIB II (1 hour after birth).
– syndromeofme, Commented Oct 12 at 23:10
One hang-up I have: the LOESS and bootstrap LOESS confidence intervals have a distinct, predominantly symmetric shape. By contrast, the confidence intervals based on the logistic GAM are asymmetric: the range between the estimate and the bound closer to 0.5 is larger than the range between the estimate and the bound closer to 0 or 1. Since these methods generate distinct intervals, is there one that we should prefer?
– syndromeofme, Commented Oct 12 at 23:11
@syndromeofme Don't use LOESS CIs "to create a categorical factor of group 1 vs. group 2, which is then included in a multiple regression model." First, it's a bad idea to categorize a continuous predictor; see this page. Second, CIs from the single-predictor plot don't always represent the situation when other predictors are taken into account. Third, the CI width doesn't represent anything fundamental about the population of interest; it gets narrower as the sample size for the plot increases, changing the cutoff point.
– EdM, Commented Oct 13 at 13:33
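The third point in the comment above can be sketched numerically. In this minimal Python example everything is hypothetical: a made-up linear risk curve over hours to first feed, a reference level of 0.15, and a simple Wald lower bound with n standing in for the local sample size (a simplification; it is not how LOESS computes its band). The population curve is identical in every run, yet the hour at which the lower bound first clears the reference level depends on n.

```python
import math

def true_prob(hours):
    """Hypothetical population risk curve: rises linearly with hours to first feed."""
    return 0.10 + 0.004 * hours

def lower_bound(p, n, z=1.96):
    """Wald lower confidence bound for a proportion p estimated from n observations."""
    return p - z * math.sqrt(p * (1 - p) / n)

def first_crossing(n, reference=0.15, max_hours=72):
    """First hour at which the lower bound exceeds the reference level, or None if it never does."""
    for hours in range(max_hours + 1):
        if lower_bound(true_prob(hours), n) > reference:
            return hours
    return None

# Same population curve, different sample sizes.
cutoffs = {n: first_crossing(n) for n in (100, 400, 1600)}
for n, hours in cutoffs.items():
    print(f"n={n:5d}  cutoff hour={hours}")
```

A grouping cutoff defined this way therefore reflects how much data was collected as much as anything about the infants themselves.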
@syndromeofme I'd recommend that you find a local statistician to help with this project, given the complexity of the clinical situation and the difficulty in specifying causal relationships. This web site can provide basic pointers, but there is so much more going on in your study that you should enlist someone who can go over all the details with you and help devise a comprehensive design and analysis strategy.
– EdM, Commented Oct 13 at 13:38
