I am using Python 3.5.2 and sklearn 0.19.1.
I have a multiclass problem (3 classes) and I am using RandomForestClassifier. For one of the classes I have 19 unique predict_proba values:
{0.0,
0.6666666666666666,
0.6736189855024448,
0.6773290780865037,
0.7150826826468751,
0.7175236925236925,
0.7775446850962057,
0.8245648135911781,
0.8631035080004867,
0.8720525244880196,
0.8739595855873906,
0.8787152225755167,
0.9289844333343654,
0.954439314892936,
0.9606503912532541,
0.9771342285323964,
0.9883370916703461,
0.9957401423931763,
1.0}
I am computing roc_curve and I expect the ROC curve to have the same number of points as I have unique probability values. This is only true for 2 of the 3 classes!
When I look at the thresholds returned by the roc_curve function:
fpr, tpr, proba = roc_curve(....)
I see exactly the same values as in the list of probabilities, plus one new value, 2.0!
[2.,
1.,
0.99574014,
0.98833709,
0.97713423,
0.96065039,
0.95443931,
0.92898443,
0.87871522,
0.87395959,
0.87205252,
0.86310351,
0.82456481,
0.77754469,
0.71752369,
0.71508268,
0.67732908,
0.67361899,
0.66666667,
0. ]
Why is a new threshold of 2.0 returned? I didn't see anything related to that in the documentation.
Any idea? Am I missing something?
roc_curve is written so that the curve starts at a point with no false positives. If the ROC point for the highest score, (fpr[0], tpr[0]), has fpr[0] != 0, an extra threshold with the arbitrary value max(y_score)+1 is prepended, which adds the point (0, 0). The relevant code from the source:
    thresholds : array, shape = [n_thresholds]
        Decreasing thresholds on the decision function used to compute
        fpr and tpr. `thresholds[0]` represents no instances being predicted
        and is arbitrarily set to `max(y_score) + 1`.
and
    if tps.size == 0 or fps[0] != 0:
        # Add an extra threshold position if necessary
        tps = np.r_[0, tps]
        fps = np.r_[0, fps]
        thresholds = np.r_[thresholds[0] + 1, thresholds]
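So it seems that, in the case you showed, at least one sample that does not belong to this class was given a score of 1.0, i.e. it is a false positive even at the highest threshold. Since your maximum score is 1.0, the extra threshold is 1.0 + 1 = 2.0.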
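To make the condition concrete, here is a minimal pure-Python sketch of the same idea. It is not sklearn's implementation: the function name roc_points and the toy labels and scores are made up for illustration, and it only mirrors the fps[0] != 0 check quoted above for sklearn 0.19.1 (other versions may behave differently).

```python
def roc_points(y_true, y_score):
    """Simplified sketch of the quoted logic; assumes non-empty input.

    Returns cumulative false positives, true positives, and thresholds
    (raw counts, not rates). Hypothetical helper, not part of sklearn.
    """
    # Sort samples by decreasing score.
    pairs = sorted(zip(y_score, y_true), key=lambda p: -p[0])
    thresholds, tps, fps = [], [], []
    tp = fp = 0
    for i, (score, label) in enumerate(pairs):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Record one point per distinct score, at the last sample with it.
        is_last = i == len(pairs) - 1
        if is_last or pairs[i + 1][0] != score:
            thresholds.append(score)
            tps.append(tp)
            fps.append(fp)
    # Same check as in the quoted snippet: if there is already a false
    # positive at the highest score, prepend an extra threshold so the
    # curve starts at (0, 0).
    if fps[0] != 0:
        tps = [0] + tps
        fps = [0] + fps
        thresholds = [thresholds[0] + 1] + thresholds
    return fps, tps, thresholds


# A negative sample scored 1.0: the extra threshold 2.0 appears.
print(roc_points([0, 1, 1, 0], [1.0, 1.0, 0.7, 0.0])[2])  # [2.0, 1.0, 0.7, 0.0]

# No negative sample at the highest score: no extra threshold.
print(roc_points([1, 1, 0, 0], [1.0, 0.7, 0.4, 0.0])[2])  # [1.0, 0.7, 0.4, 0.0]
```

On your data, you could check for this directly by looking for samples whose true label is not the class in question but whose predict_proba for that class is 1.0.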