  • The points on a ROC curve are obtained by varying the threshold over a predicted score or probability, so if you only have the final "hard" predictions you can't plot a ROC curve. Commented Dec 7, 2019 at 17:38
  • @Erwan Can't I change the parameter manually and plot the result with some library that knows how to calculate false/true positives? Commented Dec 7, 2019 at 18:20
  • It's not about calculating the TP/FP. Your current output is a single set of predictions, so you get one value for the true positive rate and one for the false positive rate, i.e. a single point on the curve. You need several "series" of predictions, each corresponding to one point. That is possible if you get the probability instead of just the binary prediction, because each position of the threshold over the probabilities gives a different "series" of predictions. Commented Dec 7, 2019 at 18:48
  • @Erwan I do have control over the C parameter of the LinearSVC. Can't I just choose many values, train it for them, and see all the values that come out? Commented Dec 7, 2019 at 18:54
  • You could try, but it's unlikely to give you the same kind of variation as varying a threshold, so you might end up with a bunch of points in the same area instead of a ROC progression. If you really want a ROC curve and can't get probabilities, you could train a regression model instead of a classification model: in the training set, 0 and 1 would be treated as real numbers, so on the test set you would get predictions mostly between 0 and 1. These predicted values could be used as "probabilities". Commented Dec 7, 2019 at 19:28
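
To make the threshold argument in the comments above concrete, here is a minimal pure-Python sketch. The labels, the scores, and the helper name `roc_points` are all made up for illustration; in practice the scores would be whatever real-valued output your model gives (probabilities, or regression outputs as suggested above). It sweeps a threshold over the scores and records one (FPR, TPR) point per threshold, then repeats the exercise on the "hard" 0/1 predictions to show that they yield only one non-trivial point.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) pairs obtained by sweeping a threshold over scores."""
    pos = sum(y_true)            # number of actual positives
    neg = len(y_true) - pos      # number of actual negatives
    # Start above every score (nothing predicted positive), then use each
    # distinct score as a threshold, from strictest to loosest.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for t in thresholds:
        preds = [1 if s >= t else 0 for s in scores]
        tp = sum(1 for p, y in zip(preds, y_true) if p == 1 and y == 1)
        fp = sum(1 for p, y in zip(preds, y_true) if p == 1 and y == 0)
        points.append((fp / neg, tp / pos))
    return points


# Hypothetical toy data: true labels and real-valued scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

# Real-valued scores: every distinct threshold gives a new point.
curve = roc_points(y_true, scores)

# Hard predictions (scores cut once at 0.5): only the two trivial
# corners plus a single informative point remain.
hard = [1 if s >= 0.5 else 0 for s in scores]
single = roc_points(y_true, hard)

print(curve)
print(single)
```

With real-valued scores the points trace a progression from (0, 0) to (1, 1). With hard predictions the only point that carries information is the one corresponding to the fixed cut, which is why a single set of binary predictions cannot produce a curve.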