10
$\begingroup$

They look like the same thing to me but I'm not sure.

Update: in retrospect, this was not a very good question. OLS refers to fitting a line to data, and RSS is the cost function that OLS uses: it finds the parameters that give the smallest residual sum of squared errors. The "ordinary" in OLS refers to the fact that we are doing a plain linear fit.
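As a small illustration (toy data, not part of the original question): lm()'s OLS fit returns essentially the same coefficients as directly minimizing the RSS with a numerical optimizer.

set.seed(1)
x <- runif(50)
y <- 2 + 3 * x + rnorm(50, sd = 0.3)
rss <- function(b) sum((y - b[1] - b[2] * x)^2)  # RSS as a function of the parameters
opt <- optim(c(0, 0), rss)                       # numerically minimize the RSS
coef(lm(y ~ x))                                  # OLS estimate
opt$par                                          # essentially the same values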

$\endgroup$

4 Answers

16
$\begingroup$

Here is a definition from Wikipedia:

In statistics, the residual sum of squares (RSS) is the sum of the squares of residuals. It is a measure of the discrepancy between the data and an estimation model; Ordinary least squares (OLS) is a method for estimating the unknown parameters in a linear regression model, with the goal of minimizing the differences between the observed responses in some arbitrary dataset and the responses predicted by the linear approximation of the data.

So RSS is a measure of how well the model approximates the data, while OLS is a method for constructing a good model.
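As a minimal sketch (using R's built-in mtcars data, not part of the quoted definitions): OLS is what lm() does when fitting the model, and RSS is the quantity you can compute from the resulting residuals.

fit <- lm(mpg ~ wt, data = mtcars)  # OLS: the fitting method
rss <- sum(resid(fit)^2)            # RSS: the discrepancy measure the fit minimizes
rss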

$\endgroup$
1
  • $\begingroup$ You have no idea how helpful your answer is! $\endgroup$ Commented Mar 12, 2020 at 0:19
5
$\begingroup$

Ordinary least squares (OLS)

Ordinary least squares (OLS) is the workhorse of statistics. It gives a way of taking complicated outcomes and explaining behaviour (such as trends) using linearity. The simplest application of OLS is fitting a line.

Residuals

Residuals are the observable errors that remain after fitting the estimated coefficients. In a sense, the residuals are estimates of the (unobservable) errors.

Let's illustrate these ideas with R code.

First, fit an ordinary least squares line to the diamond dataset from the UsingR library:

library(UsingR)
data("diamond")
y <- diamond$price
x <- diamond$carat
n <- length(y)
olsline <- lm(y ~ x)
plot(x, y, main = "Ordinary least squares line",
     xlab = "Mass (carats)", ylab = "Price (SIN $)",
     bg = "lightblue", col = "black", cex = 2, pch = 21, frame = FALSE)
abline(olsline, lwd = 2)

[Figure: scatter plot of price (SIN $) vs. mass (carats) with the fitted ordinary least squares line]

Now let's calculate the residuals and, from them, the residual sum of squares. In R you can get the residuals directly with resid(olsline); for illustration, let's also compute them manually:

# The residuals from R's built-in method
e <- resid(olsline)
# Obtain the residuals manually: get the predicted Ys first
yhat <- predict(olsline)
# The residuals are y - yhat; check against R's built-in resid()
ce <- y - yhat
max(abs(e - ce))
# Do it again, hard-coding the calculation of yhat
max(abs(e - (y - coef(olsline)[1] - coef(olsline)[2] * x)))
# Residuals are the signed lengths of the red lines
plot(diamond$carat, diamond$price, main = "Residuals (actual Y - predicted Y)",
     xlab = "Mass (carats)", ylab = "Price (SIN $)",
     bg = "lightblue", col = "black", cex = 2, pch = 21, frame = FALSE)
abline(olsline, lwd = 2)
for (i in 1:n) lines(c(x[i], x[i]), c(y[i], yhat[i]), col = "red", lwd = 2)

[Figure: the same scatter plot with the fitted line and the residuals drawn as red vertical segments from each point to the line]
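As a small addition (not in the original answer), the residual sum of squares itself is then just the sum of these squared residuals:

rss <- sum(e^2)   # equivalently sum(resid(olsline)^2), or deviance(olsline)
rss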

Hopefully these visualizations make the relationship between RSS and OLS clear.

$\endgroup$
1
$\begingroup$

In a way, OLS is a method for estimating the regression line from the training data, while RSS is a quantity that measures how accurately the model fits, and it can be computed on both training and testing data.
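A minimal sketch of that idea (toy data and an arbitrary train/test split, purely illustrative):

set.seed(42)
d <- data.frame(x = runif(100))
d$y <- 1 + 2 * d$x + rnorm(100, sd = 0.5)
train <- 1:70
test  <- 71:100
fit <- lm(y ~ x, data = d[train, ])   # OLS fit on the training data
rss_train <- sum(resid(fit)^2)        # RSS on the training data
rss_test  <- sum((d$y[test] - predict(fit, newdata = d[test, ]))^2)  # RSS on held-out data
c(train = rss_train, test = rss_test)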

$\endgroup$
1
$\begingroup$

The mathematical quantity at the heart of both is the same, i.e.,

$$ \begin{equation} RSS(\beta) = \sum_{i = 1}^{N} (\hat{\large \epsilon_{i}})^2 = \sum_{i = 1}^{N} \; (y_{i} - \sum^{p}_{j = 0}x_{ij} \; \beta_{j})^2 \end{equation} $$

The difference,

RSS: it is a loss function.
OLS: it fits a linear regression by choosing the coefficients that minimize that loss function, and then uses them to predict the dependent variable.

$$ \begin{equation} \hat{Y} = \underbrace{ \sum_{j=0}^{p} X_j \hat{\beta_{j}} }_{\textsf{ Linear Equation}} \quad\textsf{where;}\hspace{1mm} X_{0} = 1 \end{equation} $$

Note: I am including the intercept (bias) term in the equation via $X_0 = 1$, so the model can be written as a single linear combination rather than as a separate affine term.

OLS first estimates the parameters ($\beta$) using linear algebra: set the first derivative of the RSS to zero and check the second-order condition. The result is

$$ \begin{equation} \hat{\beta} = (X^TX)^{-1}X^Ty \end{equation} $$

The predicted values are then

$$ \begin{equation} \hat{y} = X\hat{\beta} = X(X^TX)^{-1} X^Ty \end{equation} $$
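A quick sketch (toy data assumed, not from the original answer) showing that the closed-form estimate above matches what lm() computes:

set.seed(7)
n  <- 30
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - x2 + rnorm(n)
X  <- cbind(1, x1, x2)                     # column of ones gives X_0 = 1 (the intercept)
beta_hat <- solve(t(X) %*% X, t(X) %*% y)  # (X^T X)^{-1} X^T y
y_hat    <- X %*% beta_hat                 # fitted values
drop(beta_hat)                             # manual OLS estimate
coef(lm(y ~ x1 + x2))                      # lm()'s estimate: the same values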

Note that OLS differs from the broader term "least squares" (which covers both linear and non-linear models), so be careful when reading other resources.

$\endgroup$
1
  • $\begingroup$ Note: OLS is a closed-form method whereas gradient descent is an iterative method. $\endgroup$ Commented Jan 1 at 14:20
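To illustrate that comment (toy data assumed; a sketch, not part of the original thread): gradient descent on the RSS iterates toward the same solution the closed-form OLS formula gives in one step.

set.seed(3)
x <- rnorm(100)
y <- 4 + 0.5 * x + rnorm(100)
X <- cbind(1, x)
beta_cf <- drop(solve(t(X) %*% X, t(X) %*% y))     # closed form: (X^T X)^{-1} X^T y
beta_gd <- c(0, 0)                                 # iterative: start from zero
lr <- 0.001                                        # small step size so the iteration converges
for (i in 1:1000) {
  grad <- -2 * drop(t(X) %*% (y - X %*% beta_gd))  # gradient of the RSS
  beta_gd <- beta_gd - lr * grad
}
cbind(closed_form = beta_cf, gradient_descent = beta_gd)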
