
Questions tagged [calibration]

Calibration can refer to adjusting measurements to agree with the value of some standard, to transforming classifier scores into class membership probabilities, etc. Do not use this tag for predicting an explanatory variable from an observation of the dependent variable; for that, use the tag inverse-prediction.
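As a rough illustration of the second sense (turning classifier scores into class membership probabilities), here is a minimal sketch of Platt-style scaling: a one-variable logistic regression fitted on held-out scores. The function names `fit_platt` and `calibrate`, the toy scores and labels, and the learning-rate settings are all assumptions made for this example; they do not come from any question listed under this tag.

```python
import math


def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the mean log loss.

    `scores` are raw classifier outputs on held-out data and `labels` are 0/1.
    This is a hypothetical helper for illustration, not a library API.
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s  # d(log loss)/da for this observation
            grad_b += (p - y)      # d(log loss)/db for this observation
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(score, a, b):
    """Map a raw score to a calibrated probability with the fitted a and b."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


# Toy held-out data (made up): higher scores tend to go with label 1.
scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

a, b = fit_platt(scores, labels)
probs = [calibrate(s, a, b) for s in scores]
```

Because the fitted map is monotone in the score, it leaves the ranking of observations (and hence rank-based metrics) unchanged while moving the average predicted probability toward the observed event rate on the calibration data.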

2 votes
0 answers
68 views

I would like to evaluate how well two experimental designs perform with the goal of parameter estimation. I'm generating 1000 simulated datasets for each design and fitting the same model to all of ...
mkt's user avatar
  • 21.2k
2 votes
1 answer
34 views

I report model performance using log loss on calibrated probabilities, where calibration is temperature scaling fitted on train-only out-of-fold (OOF) predictions. For hyperparameter tuning, should ...
randomstate42's user avatar
5 votes
2 answers
95 views

I am examining the effect of (right-)censoring on the D-calibration of survival data sets. The data sets are completely synthetic, generated by R package coxed. I use my own code (not the package's ...
statistischegent's user avatar
3 votes
0 answers
93 views

I am trying to run the Breusch-Pagan test manually in RStudio from a weighted linear model (wi = 1/x^2). I need help verifying whether the following rationale is correct: What I did: WLS and residuals ...
finattisaka's user avatar
11 votes
3 answers
731 views

Background I trained an XGBoost model to predict a dichotomous outcome, which has a base rate of about 55% in the overall sample. This model will not be used to classify, however: It will be used ...
Mark White's user avatar
  • 11.7k
0 votes
0 answers
27 views

I am running a Cox regression model to examine the predictive performance of a model using one variable (a risk assessment score) to predict days to crime. As part of the modeling, we are examining ...
Will L Xu's user avatar
1 vote
0 answers
94 views

The Question: what are the best practices for raking the survey weights of a specific subsample while leaving the remaining sample (relatively) unchanged for later comparative analyses? The Situation: ...
Peter T's user avatar
  • 11
2 votes
0 answers
114 views

I am currently creating two logistic regression models (one with forward selection and one with LASSO) using R to predict whether a patient has a malignant or benign breast cancer from this dataset: ...
Leo_Miche's user avatar
1 vote
1 answer
94 views

I am looking to apply a calibration/correction approach on a set of sensors and I just wanted to know that the approach I am going to use is statistically correct and acceptable. I am using a set of ...
Milad's user avatar
  • 157
1 vote
0 answers
76 views

I’m working on species distribution modeling with binary data (presence / absence, 1 / 0). My target species is extremely rare (prevalence ~0.014), so my dataset is almost all zeros and just a handful ...
LolaRT96's user avatar
2 votes
1 answer
123 views

I am looking to externally validate the calibration of a model someone else developed using a different dataset. The original model was developed by using linear regression, and then the weights were ...
Pink Flamingos's user avatar
0 votes
0 answers
81 views

I am familiar with SHAP and often use it when developing or assessing ML models. I want to use SHAP in a new context. I'm working on a project that relies on an XGBoost Classifier, which outputs ...
odd's user avatar
  • 1
1 vote
1 answer
59 views

I am trying to calibrate the predicted probabilities using isotonic regression for a binary outcome model in R. I know that calibrating probabilities should not change the AUC. But the following R ...
Phoebe's user avatar
  • 163
1 vote
0 answers
42 views

If I am using a GridSearchCV to find hyperparameters on a training set, and I then run a CalibratedClassifierCV to tune my probabilities, would it suffice to fit the CalibratedClassifierCV with ...
user54565's user avatar
6 votes
1 answer
1k views

It appears that isotonic regression is a popular method to calibrate models. I understand that isotonic guarantees a monotonically increasing or decreasing fit. However, if you can get a smoother fit, ...
SAS2Python's user avatar
4 votes
3 answers
517 views

If we train a deep learning model with cross-entropy loss, we expect the model to have a low cross-entropy loss. Is there any way to train the model so that it gets a small expected calibration error,...
Bayesian Hat's user avatar
0 votes
0 answers
90 views

I'm studying which variant of variational autoencoders (VAE) gives better expected calibration error (ECE) (see also this doc) on a small dataset. According to Google's tuning playbook, to compare ...
Kaiwen's user avatar
  • 307
2 votes
1 answer
505 views

I'm trying to evaluate my parametric proportional hazards and accelerated failure models in terms of calibration. There seem to be many ways to summarise calibration, but they give a single summary ...
Wojty's user avatar
  • 165
7 votes
2 answers
174 views

In a situation where a binary variable is of interest and we want to predict the probability of either event (dog vs cat, say), it is common to talk about the calibration of the predictions, if the ...
Dave's user avatar
  • 72.9k
5 votes
3 answers
617 views

I am performing a large number of Welch's t-tests (t-test with unequal variance) on very small sample sizes, often with only two samples per condition. I am finding the p-values are poorly calibrated: ...
emarti's user avatar
  • 151
0 votes
0 answers
118 views

I have fit a Cox regression model and used the val.surv function to draw a calibration plot comparing observed survival probability with predicted survival probability. ...
Xixuan Zhu's user avatar
6 votes
1 answer
251 views

There is some interesting behavior in the pROC::roc function in R. ...
Dave's user avatar
  • 72.9k
1 vote
1 answer
91 views

Let's say that, with a measurement device, we have a linear relationship between an output measurement I (in mV) and the concentration ...
Basj's user avatar
  • 632
7 votes
1 answer
630 views

From what I understood by reading sklearn Probability Calibration, when we run CalibratedClassifierCV we will fit "a regressor (called a calibrator) that maps the output of the classifier (as ...
andy mot's user avatar
1 vote
0 answers
446 views

I'm performing multiclass probability prediction using CatBoostClassifier on a dataset with ~4000 rows, 13 features, 4 target classes. Dataset has outliers, but it is balanced. For this task I'm using ...
primadonna's user avatar
7 votes
1 answer
170 views

A 2020 NeurIPS paper by Gupta, Podkopaev & Ramdas addresses the calibration of outputs to binary “classification” models, admitting that the raw scores, despite perhaps being on $\left[0, 1\right]...
Dave's user avatar
  • 72.9k
1 vote
1 answer
81 views

I am considering showing how miscalibrated a Cox proportional hazards model is by plotting the 10th percentiles of risk on the x axis vs the incidence per 100,000. For each bin in x I could plot data ...
brucezepplin's user avatar
2 votes
1 answer
445 views

I have a balanced dataset where each object (song) has one of the four target class labels (mood of a song). Example: ID feature1 feature2 feature3 target_class 0 0.5 0.11 125 upbeat 1 0.23 0.75 136 ...
primadonna's user avatar
3 votes
2 answers
258 views

Consider a model that predicts the probability of some binary event $Y$ (potentially given some features $X$). Denote the estimated probability of $Y$ occurring as $\hat{p}$. One possible choice for a ...
ischmidt20's user avatar
0 votes
1 answer
207 views

I'm a bit confused around calibration in the large. I usually see it discussed in the context of binary outcomes, but am I correct in thinking it can also be valuable as a part of external validation ...
JeffR's user avatar
  • 1
3 votes
1 answer
268 views

Guo et al (ICML 2017) state the following. During training, after the model is able to correctly classify (almost) all training samples, NLL can be further minimized by increasing the confidence of ...
Dave's user avatar
  • 72.9k
2 votes
0 answers
85 views

I have inherited a long-running survey with two measures of individual behavior. Edits: clarifying that this is not about drinking behavior; it’s not, and I only used that to try and illustrate. It ...
dholstius's user avatar
  • 101
1 vote
1 answer
101 views

I have extracted predicted probabilities (logistic model) from a graph according to the nine classes of a certain variable (I don't own the model). I need to compare the predicted probabilities, that ...
vixxovs's user avatar
  • 45
1 vote
0 answers
332 views

I'm training a pretty standard LightGBM regressor and noticing a strange pattern with the residuals (see images below--I'm bunching the predicted values and taking the observed average for the group). ...
dfried's user avatar
  • 201
4 votes
0 answers
169 views

The mean squared error has a famous decomposition into bias and variance. $$ \text{MSE} = \text{bias}^2 + \text{var} $$ Brier score is also a mean squared error calculation, and Brier score has a ...
Dave's user avatar
  • 72.9k
1 vote
0 answers
83 views

In his Is Medicine Mesmerized by Machine Learning? blog article, Frank Harrell shows a calibration curve (below) and states that it is quite poor. I follow the logic: the claimed probability of $0.20$...
Dave's user avatar
  • 72.9k
1 vote
0 answers
104 views

I am trying to calibrate a Bayesian neural network. I have already approximated the posterior density for its weights. In order to make predictions the Bayesian way, I am taking samples from the ...
Randomdude's user avatar
1 vote
1 answer
56 views

I am trying to calibrate a Heston Model with 100 call options using this paper https://arxiv.org/pdf/1511.08718.pdf. In algorithm 4.1 on page 18, they define the dampening factor as: $$\mu_0 = \omega \...
THATS MY QUANT MY QUANTITATIVE's user avatar
0 votes
0 answers
61 views

I have two proxy metrics, and I'd like to see which of them correlates more strongly with human ratings. I have ~30 questions, and for each question 3 humans independently give a score on a 1-10 scale....
augray's user avatar
  • 101
2 votes
1 answer
317 views

I am currently using XGBoost (in R) to perform multiclass classification. I am using merror=eval_metric and my objective is <...
HeyCool08's user avatar
  • 125
2 votes
1 answer
203 views

Let's say we have some binary variable of interest and fit a model to predict the probability of the two classes, say a logistic regression or a "classification" neural network. This model ...
Dave's user avatar
  • 72.9k
1 vote
0 answers
240 views

I work with a labelled tabular dataset of about 1 million observations, with the target being binary. The dataset is heavily imbalanced - about 0.5% positive class. I have trained a gradient boosting ...
StrLdn's user avatar
  • 11
5 votes
0 answers
265 views

I wanted to assess the performance of my lightGBM classifier using a calibration plot. If I understood correctly, a calibration plot visualizes the alignment between the predicted probabilities by the ...
Programming Noob's user avatar
1 vote
0 answers
111 views

I have a labelled data set with $n$ data points $(x_i, y_i)$ with $x_i \in \mathbb{R}^k$ and $y_i \in \mathbb{R}$ and I trained a model $f: \mathbb{R}^k \to \mathbb{R} \times \mathbb{R}^+$ on some of ...
PascalIv's user avatar
  • 921
1 vote
0 answers
62 views

Can I get standard error from a constrained optimization problem in R? I have calculated transition probabilities. Now I am trying to calibrate it. Using these transition probabilities I have ...
Md. Zubab Ibne Moid's user avatar
0 votes
1 answer
262 views

I would like to know how to produce a calibration plot with the Hosmer-Lemeshow test and the AUC of the ROC curve after multiple imputation in R. I built one prediction model and tried to assess model performance but ...
Haruka Hayashi's user avatar
1 vote
0 answers
204 views

Why in Shrinkage, due to an overfitted prediction model, do we tend to overestimate risk for "high risk" subjects and to underestimate risk for "low risk" subjects ? Intuitively I ...
vixxovs's user avatar
  • 45
1 vote
1 answer
305 views

Is it a reasonable approach to train a probabilities classifier by optimizing a threshold-independent metric such as AUC, and then using the trained classifier to calibrate the decision threshold ...
Amit S's user avatar
  • 77
0 votes
0 answers
115 views

Similar to this question, I would like to know how to create a calibration curve without binning my predictions. What makes my situation different is that I'm using icenReg for my interval-censored ...
Wojty's user avatar
  • 165
1 vote
1 answer
148 views

I have an instrument that measures a value. It is only possible to measure the value once i.e. the experiment can't be repeated (think recording a car's speed as it drives past). The instrument is not ...
Chuck's user avatar
  • 91
