
Reason for not shrinking the bias (intercept) term in regression

For a linear model $y=\beta_0+x\beta+\varepsilon$, the shrinkage penalty is always a function $P(\beta)$ of the slope coefficients only.

What is the reason that we do not shrink the bias (intercept) term $\beta_0$? Should we shrink the bias term in neural network models?
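To make the question concrete, here is a minimal numerical sketch (the data and names are illustrative, not from the question) of ridge regression in which the $L_2$ penalty is applied to the slopes but not to the intercept: the penalized normal equations use a penalty matrix $D=\operatorname{diag}(0,1,\dots,1)$, so the intercept column carries no shrinkage.

```python
import numpy as np

# Illustrative setup: design matrix with an explicit intercept column.
rng = np.random.default_rng(0)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # first column = intercept
beta_true = np.array([5.0, 1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

lam = 10.0

# Penalty matrix D = diag(0, 1, ..., 1): the intercept entry is NOT penalized.
D = np.eye(p + 1)
D[0, 0] = 0.0

# Solve the penalized normal equations (X'X + lam*D) beta = X'y.
beta_hat = np.linalg.solve(X.T @ X + lam * D, X.T @ y)

# For comparison: penalize everything, including beta_0.
beta_all = np.linalg.solve(X.T @ X + lam * np.eye(p + 1), X.T @ y)

print(beta_hat[0], beta_all[0])
```

Only the second fit pulls the estimated intercept toward zero; the first recovers the true intercept (here 5.0) essentially unbiased, which is the behavior standard implementations (e.g. glmnet, scikit-learn's `Ridge` with `fit_intercept=True`) give by default.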


