10 events
when | what | by | license | comment
May 2, 2019 at 20:54 comment added LSC Not to be political, but there is often a lack of statistical knowledge among those teaching "machine learning" and "big data" courses where statistical methods like logistic regression are employed and somewhat abused; that is, the statisticians who deeply understand the methodologies and how to employ them in new scenarios aren't the ones teaching these subjects. It's hard to imagine a case where accuracy is all that matters. I second reading Frank Harrell's blog posts about this, as referenced in the "answer" above.
Apr 26, 2019 at 17:12 comment added StatsSorceress Fair enough, Wayne! Personally, I've found that in coursework the objective was to minimize misclassification error, not to worry about precision/recall, so I think the comment still has some merit for those in a course-based setting who are wondering why the two separate steps are necessary if all we want to do is "get the class right".
Apr 26, 2019 at 12:10 comment added Wayne @StatsSorceress "... sometimes in machine learning classification ...". There should be a big emphasis on sometimes. It's hard to imagine a project where accuracy is the correct answer. In my experience, it always involves precision and recall of a minority class.
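Wayne's point about accuracy on imbalanced data can be illustrated with a small numeric sketch (the counts below are made up for illustration): a classifier that always predicts the majority class scores high accuracy while never finding the minority class.

```python
# Hypothetical confusion counts for a 95/5 class split where the model
# always predicts the majority (negative) class.
tn, fp, fn, tp = 95, 0, 5, 0

accuracy = (tp + tn) / (tp + tn + fp + fn)      # looks great on its own
recall = tp / (tp + fn) if (tp + fn) else 0.0   # minority class is never found
precision = tp / (tp + fp) if (tp + fp) else 0.0

print(accuracy, recall, precision)  # 0.95 0.0 0.0
```

This is why, on a minority-class problem, precision and recall (or a proper scoring rule on the predicted probabilities) tell you far more than raw accuracy does.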
Apr 26, 2019 at 11:10 history edited gung - Reinstate Monica CC BY-SA 4.0
added 32 characters in body
Apr 25, 2019 at 20:11 vote accept StatsSorceress
Apr 25, 2019 at 18:13 comment added gung - Reinstate Monica As I said, you very much can set up your own custom optimization that will train the model & select the threshold simultaneously. You just have to do it yourself & the final model is likely to be poorer by most standards.
Apr 25, 2019 at 16:29 comment added StatsSorceress Hmm. I read the accepted answer in the related question here, and I agree with it in theory, but sometimes in machine learning classification applications we don't care about the relative error types, we just care about "correct classification". In that case, could you train end-to-end as I describe?
Apr 25, 2019 at 16:02 comment added gung - Reinstate Monica You certainly could (@Sycorax's answer speaks to that possibility). But because that isn't what LR itself is, but rather some ad hoc augmentation, you would need to code up the full optimization scheme yourself. Note, BTW, that Frank Harrell has pointed out that this process will lead to what might be considered an inferior model by many standards.
Apr 25, 2019 at 15:55 comment added StatsSorceress Okay, I understand that part of the theory (thank you for that eloquent explanation!) but why can't we incorporate the classification aspect into the model? That is, why can't we find p, then find the threshold, and train the whole thing end-to-end to minimize some loss?
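The two-step procedure discussed in this thread (fit logistic regression to estimate p, then choose a decision threshold as a separate step) can be sketched as follows. This is a minimal illustration with numpy only; the synthetic 1-D data, the learning rate, and the threshold grid are all assumptions, not anything from the thread.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D data (assumed for illustration): class 1 has a higher mean.
n = 500
y = rng.integers(0, 2, n)
x = rng.normal(loc=y * 1.5, scale=1.0)

# Step 1: fit logistic regression p(y=1|x) = sigmoid(w*x + b)
# by gradient descent on the log-loss.
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))
    w -= 0.1 * np.mean((p - y) * x)
    b -= 0.1 * np.mean(p - y)

# Step 2: separately, sweep candidate thresholds on the fitted
# probabilities and keep the one that maximizes accuracy.
p = 1.0 / (1.0 + np.exp(-(w * x + b)))
thresholds = np.linspace(0.05, 0.95, 19)
accs = [np.mean((p >= t) == y) for t in thresholds]
best_t = thresholds[int(np.argmax(accs))]
print(best_t, max(accs))
```

Note that step 1 optimizes the log-loss (a proper scoring rule for p), while step 2 optimizes misclassification rate; merging them into a single end-to-end objective is exactly the "ad hoc augmentation" gung describes, and would have to be coded by hand.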
Apr 25, 2019 at 15:43 history answered gung - Reinstate Monica CC BY-SA 4.0