
Questions tagged [accuracy]

The accuracy of an estimator is the degree of closeness of its estimates to the true value. For a classifier, accuracy is the proportion of correct classifications. (This second usage is not good practice. See the tag wiki for a link to further information.)
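For the classifier sense of the term, a minimal sketch (with hypothetical labels and predictions) of accuracy as the proportion of correct classifications:

```python
# Accuracy as the proportion of correct classifications (hypothetical data).
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)  # 0.8
```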

3 votes
1 answer
118 views

I've encountered the term "accuracy" used differently across several evaluation contexts, and I want to clearly understand the mathematical and conceptual distinctions using consistent ...
asked by Charlie Parker
0 votes
2 answers
52 views

I currently have a RandomForestClassifier that is classifying workload based on fNIRS data. Our classification accuracy is about 49%. I want to investigate why our classification accuracy is so bad and ...
asked by Maddie Brower
1 vote
1 answer
43 views

I’ve recently encountered two approaches used to express performance on perceptual tasks as d' when trying to convert (non-linear) accuracy on a 2AFC (2-alternative forced choice) task to a linear ...
asked by My Work • 153
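One common conversion for the 2AFC setting, assuming an unbiased observer, is d' = sqrt(2) * inverse-normal-CDF(proportion correct); a minimal sketch with a hypothetical accuracy value:

```python
# d' from 2AFC proportion correct, assuming an unbiased observer.
from math import sqrt
from scipy.stats import norm

p_correct = 0.85                     # hypothetical 2AFC accuracy
d_prime = sqrt(2) * norm.ppf(p_correct)
print(d_prime)                       # ~1.47
```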
7 votes
1 answer
197 views

Frederick Mosteller's 50 Challenging Problems in Probability has a nice question I have not seen before, and I was wondering whether it could be extended. Problem 49, "Doubling your accuracy": An unbiased ...
asked by Henry • 45.4k
1 vote
0 answers
69 views

This is from another question here. The theorem below is from Lambert's paper on forecasting (Elicitation and Evaluation of Statistical Forecasts): $\textbf{Proposition 1}$: Let $(\Theta = \{\...
asked by Oliver Queen
3 votes
1 answer
126 views

Consider binary classification, where the geometric mean is defined as $\sqrt{\text{Precision} \times \text{Recall}} = \sqrt{ \frac{TP}{TP+FP} \times \frac{TP}{TP+FN} }$. But there can be different TP/FP/FN ...
asked by user3236636
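A minimal sketch of the quoted formula, with hypothetical TP/FP/FN counts:

```python
# Geometric mean of precision and recall from hypothetical counts.
from math import sqrt

TP, FP, FN = 80, 20, 10
precision = TP / (TP + FP)           # 0.80
recall = TP / (TP + FN)              # ~0.89
print(sqrt(precision * recall))      # ~0.84
```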
1 vote
0 answers
75 views

I am interested in assessing the accuracy of raters against a reference standard for subjective ratings on a Likert scale from 1-10, as in: ...
asked by Todd D • 2,251
3 votes
1 answer
228 views

I am trying to calculate the concordance (c) statistic for a Royston-Parmar model. My model stratifies the baseline hazard and uses splines to model log(t). I am not sure if I am calculating the c-...
asked by user29204473
0 votes
0 answers
80 views

I have a dataset that has been split into two parts, a train set and a test set. After training a model on the training set to classify between class 0 and 1, I used the sklearn roc_curve to calculate the ...
asked by Eric Wang
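A minimal sketch of the sklearn call the question mentions, on hypothetical test labels and predicted probabilities:

```python
# ROC curve and AUC on a held-out test split (hypothetical values).
from sklearn.metrics import roc_curve, roc_auc_score

y_test = [0, 0, 1, 1]                  # true test labels
y_score = [0.1, 0.4, 0.35, 0.8]        # model's predicted P(class 1)
fpr, tpr, thresholds = roc_curve(y_test, y_score)
print(roc_auc_score(y_test, y_score))  # 0.75
```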
2 votes
1 answer
86 views

I am currently building an ML model for a binary classification problem, using a curated dataset provided in a research paper that has been perfectly balanced. However, it is ...
asked by I Noob • 21
0 votes
0 answers
33 views

I am running a mixture model with no free parameters; I just have it evaluate, for a given datapoint, its likelihood of belonging to one cluster. Separately, I have a ground truth about these ...
asked by lorenzo cappiello
2 votes
3 answers
153 views

I have a simple model that produces forecast values. The model works on hourly data. Now, I am only interested in observations with flags. I would like to identify where the forecasts are ...
asked by Lohengrin
1 vote
2 answers
149 views

I am interested in the effect of prevalence on prediction performance. Chouldechova (2016) states that: [w]hen using a test-fair [recidivism prediction instrument] in populations where recidivism ...
asked by Max J. • 123
1 vote
1 answer
134 views

Consider a classification problem where there are N classes. While this may seem strange, I have a model that processes features and essentially evaluates which classes are impossible (or near ...
asked by Ralff • 252
0 votes
0 answers
34 views

What can I do to assess a classifier's accuracy when class presence is scarce? Setup 1: I have 1000 boxes, 500 of which contain gold. I build an automated tool to find the gold. The recommended approach would ...
asked by Klops • 188
0 votes
1 answer
105 views

I have a conceptual question: after dividing a dataset into a training and test set (70:30), both are balanced and shuffled, should I use the Confusion Matrix and the ROC curve of a model generated by ...
asked by darwinrgv
1 vote
1 answer
63 views

Say one continuous variable differentiates between disease and nondisease quite accurately, but as people age, this variable becomes less accurate. Is there a way to determine the accuracy ...
asked by Abdulrazzaq Alheraky
7 votes
3 answers
1k views

I have applied various ML models (fundamental and ensemble) to the same dataset to solve a classification problem. AdaBoost, Bagging, and XGBoost classifiers gave the best accuracies. However, they ...
asked by user366312 • 2,077
0 votes
0 answers
69 views

Suppose I have two predictions models, Model 1 and Model 2. I have a dataset containing observations, features and actual outcomes. For each observation, the “outcomes” (i.e. predictions) that the ...
asked by Alex • 101
0 votes
0 answers
108 views

I am trying to understand how to calculate one or more measures of statistical significance to display alongside metrics I've calculated from my data. Abbreviations I am using in the rest of this post:...
asked by Natalia
1 vote
1 answer
155 views

I have multiple moving-average forecasts that use different look-back periods. I'm measuring accuracy using MAPE. Out of all the options, I want to select the best performing moving average. However, ...
asked by Prasanth Regupathy
11 votes
1 answer
3k views

I have a large dataset of the ASL (American Sign Language). I split this data into 70:15:15 for train, validation, test. I then trained a CNN model on it, where I trained using the 70%, and evaluated ...
asked by codinator • 123
3 votes
1 answer
109 views

Brier score can be computed for joint predictions of multiple variables, each with multiple categories. Let's say we have 4 variables with 3 possible classes each. In that case, the denominator of the ...
asked by Antonello • 413
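A hedged sketch of one possible normalisation: compute the multi-category Brier score per variable, then average across the four variables (whether and how to divide is exactly the question's point; all data below are hypothetical):

```python
# Multi-category Brier score per variable, averaged across variables
# (hypothetical data: 2 observations, 4 variables, 3 classes each).
import numpy as np

rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(3), size=(2, 4))   # (obs, variable, class)
outcomes = np.zeros_like(probs)
outcomes[..., 0] = 1.0                           # say class 0 was observed everywhere

per_var = ((probs - outcomes) ** 2).sum(axis=-1).mean(axis=0)  # one score per variable
print(per_var.mean())                            # averaged over the 4 variables
```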
1 vote
1 answer
84 views

I'm perplexed as to why my loss doesn't go up when the accuracy goes down (after about 40 epochs). Isn't it possible to tell overfitting from the loss curve alone? (I'm of course referring to the ...
asked by Tfovid • 815
1 vote
0 answers
106 views

I'm looking at the SAMHSA Mental Health Client-Level dataset. I did some t-SNE plots (dropping irrelevant cols, normalizing some, one-hot encoding some) of 500k rows out of 6.5mil. I'm trying to do ...
asked by Jackson Walters
3 votes
2 answers
630 views

The title says it all: Is F-score the same as accuracy when there are only two classes of equal sizes? For my specific case, I have measurements of a group of people under two different situations and ...
asked by user1596274
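A quick numeric check with hypothetical balanced labels shows the two need not coincide even when the classes have equal sizes:

```python
# F1 and accuracy can differ even with two equal-sized classes.
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 1, 0]
print(accuracy_score(y_true, y_pred))  # 0.625
print(f1_score(y_true, y_pred))        # ~0.667
```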
1 vote
1 answer
86 views

I have implemented AdaBoost in Matlab. I get 88% accuracy when I use Fisher's Iris flower dataset. Here is the working example: ...
asked by euraad • 447
1 vote
1 answer
114 views

Assuming two binary (Y in {0, 1}) annotators or classifiers (A and B), that are: Conditionally independent, i.e. P(A=0, B=0|Y=1) = P(A=0|Y=1)*P(B=0|Y=1) and the same for Y=0. Better than random, i.e. ...
asked by docaug • 11
1 vote
1 answer
1k views

I'm responsible for forecasting a portfolio of consumer products on a monthly basis, and in calculating forecast accuracy, I'm led to the MAPE (Mean Absolute Percentage Error), which is useful, but has, ...
asked by Mark J • 11
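A minimal MAPE sketch on hypothetical monthly actuals and forecasts; note the well-known failure mode when an actual value is zero or near zero:

```python
# Mean Absolute Percentage Error on hypothetical data.
import numpy as np

actual = np.array([100.0, 80.0, 120.0])
forecast = np.array([110.0, 70.0, 115.0])
mape = np.mean(np.abs((actual - forecast) / actual)) * 100
print(mape)  # ~8.9 (%); undefined when any actual value is 0
```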
1 vote
1 answer
127 views

I have to simulate a simple sensor, which has three standard deviations defined in a spherical frame: sigma-azimuth, sigma-elevation, sigma-distance. When I simulate a detection, I compute a noisy position ...
asked by ConanLord
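A hedged sketch of the simulation described: add Gaussian noise in the spherical frame, then convert the noisy detection to Cartesian (the true position and all sigma values below are assumptions):

```python
# Noisy spherical detection -> Cartesian position (hypothetical values).
import numpy as np

rng = np.random.default_rng(42)
az, el, r = 0.3, 0.1, 1000.0                    # true azimuth, elevation (rad), distance (m)
sigma_az, sigma_el, sigma_r = 1e-3, 1e-3, 5.0   # hypothetical sensor sigmas

az_n = az + rng.normal(0.0, sigma_az)
el_n = el + rng.normal(0.0, sigma_el)
r_n = r + rng.normal(0.0, sigma_r)

x = r_n * np.cos(el_n) * np.cos(az_n)
y = r_n * np.cos(el_n) * np.sin(az_n)
z = r_n * np.sin(el_n)
print(x, y, z)
```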
0 votes
0 answers
89 views

That's my first question on here :) I am working with the kNN classifier on datasets from the multivariate normal distribution. I have two groups coming from ...
asked by Superintendant
0 votes
0 answers
112 views

I am interested in the audio classification problem. After labeling the audio recordings I have in the Praat software environment, I extract the MFCC features from each labeled frame and create an SVM ...
asked by Yalçın Cenik
1 vote
1 answer
159 views

In papers about unsupervised clustering I see a lot of references to a metric "clustering accuracy" or "unsupervised clustering accuracy" (ACC) which is usually defined as ...
asked by Cyo • 11
2 votes
1 answer
490 views

I'm running an ML algorithm on some data, and I noticed that if I change the random state inside the train_test_split function, the accuracy score changes over quite a wide range. For example, with random ...
asked by Federicofkt
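One way to quantify this is to repeat the split over many random states and look at the spread of the scores; a sketch with a stand-in dataset and model:

```python
# Spread of test accuracy across train_test_split random states
# (the dataset and model here are stand-ins, not the asker's).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
scores = []
for seed in range(30):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=seed)
    scores.append(LogisticRegression(max_iter=5000).fit(X_tr, y_tr).score(X_te, y_te))
print(np.mean(scores), np.std(scores))  # spread across random states
```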
0 votes
0 answers
55 views

I am trying to do binary classification on ticket-cancellation data from Kaggle. I know this question has been asked before, for example here and here. Summary of what I learned from those references: ...
asked by wander95 • 101
3 votes
2 answers
132 views

I'm working on plotting census data, which has a fairly high non-response rate for some questions (5% or higher). This could actually shift the way we interpret the results in quite significant ways (...
asked by Jeremy Kidwell
2 votes
1 answer
586 views

I have a dataset with a class imbalance in favour of the positive class (85% occurrence). I'm getting a fantastically calibrated probability profile, but balanced accuracy is 0.65 and minority recall ...
asked by Kat • 21
1 vote
1 answer
797 views

Let's consider a true classification problem, that is, one where the predictor makes categorical predictions (not probabilities). It makes sense to assess the accuracy of such a predictor. However, ...
asked by Dave • 72.9k
6 votes
3 answers
3k views

So, the higher the confidence threshold, the lower the false positive rate; but the false negative rate will increase, lowering the recall. Is it possible to determine which confidence threshold is better/...
asked by Ankita • 129
2 votes
0 answers
169 views

Plenty has been discussed on Cross Validated about the drawbacks of classification accuracy when it comes to evaluating classification models. One good answer is here, for instance. Under what ...
asked by Dave • 72.9k
1 vote
0 answers
13 views

How do I improve the accuracy on the following data? It is from the following Kaggle competition, which I am doing for a school project (despite it being closed). ...
asked by Imme • 11
1 vote
0 answers
52 views

Here is the sample data. I have spectroscopy data as X variables (X1 to X80) and a corresponding Y variable. I need to run a PLSR model in R using the "pls" package. There are two sheets. In ...
asked by MGD • 11
0 votes
0 answers
87 views

Let's suppose that I'm trying to predict a stochastic forecast with machine learning models, and I have no missing or null/NaN values and no outliers. Also suppose that there is an error for the ...
asked by Daniel_DS
1 vote
2 answers
148 views

I did ANN classification on training data with oversampling and without oversampling. For each dataset, the smallest validation loss is sought by trial and error over 18 models. In the data without ...
asked by andryan86 • 147
0 votes
1 answer
119 views

I am looking for a formula to help me estimate accuracy over a population. Here is my business problem. I have about 1 million scanned documents of many types that are currently unclassified ...
asked by Alex
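One standard approach is to audit a random sample and size it with the proportion formula n = z^2 * p(1-p) / e^2; a hedged sketch with assumed confidence level and margin of error:

```python
# Sample size to estimate the proportion of correctly classified documents
# to within a chosen margin of error (confidence/margin/p are assumptions).
from math import ceil
from scipy.stats import norm

confidence, margin, p = 0.95, 0.03, 0.5      # conservative p = 0.5
z = norm.ppf(1 - (1 - confidence) / 2)       # ~1.96
print(ceil(z**2 * p * (1 - p) / margin**2))  # 1068 documents to audit
```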
0 votes
0 answers
64 views

I developed an ML algorithm (XGBoost) to predict a target in my dataset. Here are the results of my predictions on my test set: ...
asked by Nicolas • 13
0 votes
0 answers
72 views

I did random oversampling to handle unbalanced positive and negative data. When I didn't do random oversampling, the accuracy I got was 88%; when I oversampled the train data, I got 87% accuracy, and ...
asked by andryan86 • 147
1 vote
1 answer
148 views

I have an instrument that measures a value. It is only possible to measure the value once i.e. the experiment can't be repeated (think recording a car's speed as it drives past). The instrument is not ...
asked by Chuck • 91
1 vote
0 answers
226 views

I'm doing binary classification in Python with an SVM classifier, and I implemented stratified repeated cross validation to have more robust results. I would like to calculate confidence intervals for ...
asked by Ed9012 • 471
2 votes
1 answer
142 views

Given a set of 128x128 images from three classes, I obtained an accuracy of 50% with an SVM on the flattened images (16384 'features'). Is this an upper bound on the performance of an SVM using any ...
asked by Christian • 193
