Logistic regression cost function

Question

In Aurélien Géron's book I found this passage:

This cost function makes sense because –log(t) grows very large when t approaches 0, so the cost will be large if the model estimates a probability close to 0 for a positive instance, and it will also be very large if the model estimates a probability close to 1 for a negative instance. On the other hand, – log(t) is close to 0 when t is close to 1, so the cost will be close to 0 if the estimated probability is close to 0 for a negative instance or close to 1 for a positive instance, which is precisely what we want.

What I don't get is: why will the cost be large if the model estimates a probability close to 0 for a positive instance, and also very large if it estimates a probability close to 1 for a negative instance?

TwinPenguins · Accepted Answer · 2018-11-10 07:21:35Z

The cost function of logistic regression, derived via maximum likelihood estimation, behaves as follows:

If y = 1 (positive): i) cost = 0 if the prediction is correct (i.e. $h_{\theta}(x) = 1$), ii) cost $\rightarrow \infty$ if $h_{\theta}(x)\rightarrow 0$.
If y = 0 (negative): i) cost = 0 if the prediction is correct (i.e. $h_{\theta}(x) = 0$), ii) cost $\rightarrow \infty$ if $(1-h_{\theta}(x))\rightarrow 0$, i.e. $h_{\theta}(x)\rightarrow 1$.
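
For reference, the two cases above match the standard per-instance log loss. It is written here in this answer's notation, where $h_{\theta}(x)$ is the estimated probability of the positive class and $\log$ is assumed to be the natural logarithm. In terms of the quoted passage, $t = h_{\theta}(x)$ for a positive instance and $t = 1 - h_{\theta}(x)$ for a negative one:

$$
\text{cost}(h_{\theta}(x), y) =
\begin{cases}
-\log(h_{\theta}(x)) & \text{if } y = 1 \\
-\log(1 - h_{\theta}(x)) & \text{if } y = 0
\end{cases}
$$

Both cases can be combined into a single expression, since exactly one of the two terms is non-zero for $y \in \{0, 1\}$:

$$ \text{cost}(h_{\theta}(x), y) = -y \log(h_{\theta}(x)) - (1 - y) \log(1 - h_{\theta}(x)) $$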

The intuition is that larger mistakes should get larger penalties. Further readings, 1,2,3,4.
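
A minimal Python sketch of that intuition, assuming the standard per-instance log loss with the natural logarithm. The function name `log_loss_single` and the sample probabilities are made up for illustration; `p` stands for the model's estimated probability of the positive class.

```python
import math

def log_loss_single(y, p):
    """Cost for one instance: -log(p) if y == 1, -log(1 - p) if y == 0.

    y is the true label (0 or 1); p is the estimated probability of the
    positive class and must lie strictly between 0 and 1.
    """
    return -math.log(p) if y == 1 else -math.log(1.0 - p)

# Illustrative (label, estimated probability) pairs, from a confident
# correct prediction to a confident wrong one, for each class.
samples = [(1, 0.99), (1, 0.5), (1, 0.01), (0, 0.01), (0, 0.5), (0, 0.99)]

for y, p in samples:
    print(f"y={y}  p={p:.2f}  cost={log_loss_single(y, p):.4f}")
```

The printed cost is near 0 for confident correct predictions and grows as the estimate moves towards the wrong label, for positive and negative instances alike.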

Yes, I get it. What I could not digest was how the cost function penalises the model whenever the predicted probability differs from the actual label.
– Akash Dubey, Commented Nov 10, 2018 at 6:54

Skiddles · Accepted Answer · 2018-11-09 21:25:37Z

Not trying to oversimplify the answer, but simply get a calculator to compute these manually and you can see this in action:

If t is close to 1, let's say 0.9999 for the example, then (using the natural log): $$ -\log(t) = -\log(0.9999) \approx 0.000100005 $$

Conversely,

If t is close to 0, let's say 0.0001 for the example, then: $$ -\log(t) = -\log(0.0001) \approx 9.21034 $$

So if the probability is high, the cost function returns a small number; if the probability is low, it returns a (relatively) large number.
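
The same calculator check can be done in a few lines of Python. This is only a sketch: it assumes the natural logarithm (which is what `math.log` computes with a single argument), and the helper name `neg_log` and the extra midpoint value 0.5 are added for illustration.

```python
import math

def neg_log(t):
    """Return -log(t) using the natural logarithm; t must be > 0."""
    return -math.log(t)

# Values of t from "close to 1" down to "close to 0".
for t in (0.9999, 0.5, 0.0001):
    print(f"t = {t:<6}  -log(t) = {neg_log(t):.6g}")
```

Trying values even closer to 0 shows the cost growing without bound, while values closer to 1 drive it towards 0.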

Perhaps I missed the point of your question, in which case, I apologize.
