I am trying to plot an ROC curve for binary classification using RandomForestClassifier.

I have two numpy arrays: one contains the predicted values and the other contains the true values, as follows:

In [84]: test
Out[84]: array([0, 1, 0, ..., 0, 1, 0])

In [85]: pred
Out[85]: array([0, 1, 0, ..., 1, 0, 0])

How do I plot the ROC curve and obtain the AUC (Area Under the Curve) for this binary classification result in IPython?

1 Answer

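You need predicted probabilities (or scores), not hard 0/1 class labels, to create an ROC curve. Keep test as the true labels and make pred hold the probability of the positive class: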

In [84]: test
Out[84]: array([0, 1, 0, ..., 0, 1, 0])

In [85]: pred
Out[85]: array([0.1, 1, 0.3, ..., 0.6, 0.85, 0.2])
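To see why probabilities matter, here is a small sketch with made-up arrays (y_true, y_score and y_label are illustrative names, not the asker's data). Hard labels have only two distinct values, so roc_curve can return just three points, while scores give one candidate threshold per distinct value:

```python
import numpy as np
from sklearn.metrics import roc_curve

# Made-up data, for illustration only.
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7])  # hypothetical P(class 1)
y_label = (y_score >= 0.5).astype(int)  # hard 0/1 predictions at a 0.5 cutoff

# Hard labels: the "curve" is the origin, one operating point, and (1, 1).
fpr_l, tpr_l, thr_l = roc_curve(y_true, y_label)

# Scores: every distinct score is a candidate threshold, so the curve has more points.
fpr_s, tpr_s, thr_s = roc_curve(y_true, y_score)

print("points from hard labels:", len(thr_l))
print("points from scores:", len(thr_s))
```

With hard labels the plot is two straight segments through a single operating point, which is rarely what you want from an ROC curve.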

Example code from scikit-learn examples:

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc, roc_auc_score

# test: true 0/1 labels; pred: predicted probabilities of the positive class
fpr, tpr, _ = roc_curve(test, pred)
roc_auc = auc(fpr, tpr)

# roc_auc_score computes the same area directly from labels and scores
print(roc_auc_score(test, pred))

plt.figure()
plt.plot(fpr, tpr)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic')
plt.show()

3 Comments

Check that test and pred are one-dimensional; if either is not, flatten it with anyarray.reshape(-1). You can obtain probabilities using model.predict_proba(testdata)[:, 1].
I got a KeyError at plt.plot(fpr[2], tpr[2]), so I changed the index to 1. Everything else worked!
fpr[2] in the example is because there were 3 classes. For binary classification, just compute fpr, tpr, _ = roc_curve(y_test, y_score) and plot x=fpr, y=tpr.
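
Putting the comments together, a minimal end-to-end sketch might look like the following. The data comes from make_classification purely as a stand-in for the asker's features and labels, and the hyperparameters are arbitrary assumptions; only predict_proba(...)[:, 1] and the single roc_curve call come from the comments above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_curve, auc, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary data stands in for the real features and labels (assumption).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Arbitrary settings, chosen only to keep the sketch reproducible.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# With 0/1 labels, column 1 of predict_proba is the estimated probability of class 1.
y_score = model.predict_proba(X_test)[:, 1]

# Binary case: one roc_curve call, no per-class dictionaries needed.
fpr, tpr, _ = roc_curve(y_test, y_score)
roc_auc = auc(fpr, tpr)
print("AUC:", roc_auc)
```

From here, plt.plot(fpr, tpr) draws the curve, and roc_auc_score(y_test, y_score) should give the same area as auc(fpr, tpr).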
