10
$\begingroup$

They look like the same thing to me but I'm not sure.

Update: in retrospect, this was not a very good question. OLS refers to fitting a line to data, and RSS is the cost function that OLS uses: it finds the parameters that give the smallest residual sum of squared errors. The "ordinary" in OLS refers to the fact that we are doing a plain linear fit.
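As a small illustration (toy data, not part of the original question): lm()'s OLS fit returns essentially the same coefficients as directly minimizing the RSS with a numerical optimizer.

set.seed(1)
x <- runif(50)
y <- 2 + 3 * x + rnorm(50, sd = 0.3)
rss <- function(b) sum((y - b[1] - b[2] * x)^2)  # RSS as a function of the parameters
opt <- optim(c(0, 0), rss)                       # numerically minimize the RSS
coef(lm(y ~ x))                                  # OLS estimate
opt$par                                          # essentially the same values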

$\endgroup$

4 Answers

16
$\begingroup$

Here is a definition from Wikipedia:

In statistics, the residual sum of squares (RSS) is the sum of the squares of residuals. It is a measure of the discrepancy between the data and an estimation model; Ordinary least squares (OLS) is a method for estimating the unknown parameters in a linear regression model, with the goal of minimizing the differences between the observed responses in some arbitrary dataset and the responses predicted by the linear approximation of the data.

So RSS is a measure of how well the model approximates the data, while OLS is a method for constructing a good model.
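As a minimal sketch (using R's built-in mtcars data, not part of the quoted definitions): OLS is what lm() does when fitting the model, and RSS is the quantity you can compute from the resulting residuals.

fit <- lm(mpg ~ wt, data = mtcars)  # OLS: the fitting method
rss <- sum(resid(fit)^2)            # RSS: the discrepancy measure the fit minimizes
rss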

$\endgroup$
1
  • $\begingroup$ You have no idea how helpful your answer is! $\endgroup$ Commented Mar 12, 2020 at 0:19
5
$\begingroup$

Ordinary least squares (OLS)

Ordinary least squares (OLS) is the workhorse of statistics. It gives a way of taking complicated outcomes and explaining behaviour (such as trends) using linearity. The simplest application of OLS is fitting a line.

Residuals

Residuals are the observable errors that remain after fitting the estimated coefficients. In a sense, the residuals are estimates of the (unobservable) errors.

Let's illustrate these ideas with R code.

First, fit an ordinary least squares line to the diamond dataset from the UsingR library:

library(UsingR)
data("diamond")
y <- diamond$price
x <- diamond$carat
n <- length(y)
olsline <- lm(y ~ x)
plot(x, y, main = "Ordinary least squares line",
     xlab = "Mass (carats)", ylab = "Price (SIN $)",
     bg = "lightblue", col = "black", cex = 2, pch = 21, frame = FALSE)
abline(olsline, lwd = 2)

[Figure: scatter plot of price (SIN $) vs. mass (carats) with the fitted ordinary least squares line]

Now let's calculate the residuals and, from them, the residual sum of squares. In R you can get the residuals directly with resid(olsline); for illustration, let's also compute them manually:

# The residuals from R's built-in method
e <- resid(olsline)
# Obtain the residuals manually: get the predicted Ys first
yhat <- predict(olsline)
# The residuals are y - yhat; check against R's built-in resid()
ce <- y - yhat
max(abs(e - ce))
# Do it again, hard-coding the calculation of yhat
max(abs(e - (y - coef(olsline)[1] - coef(olsline)[2] * x)))
# Residuals are the signed lengths of the red lines
plot(diamond$carat, diamond$price, main = "Residuals (actual Y - predicted Y)",
     xlab = "Mass (carats)", ylab = "Price (SIN $)",
     bg = "lightblue", col = "black", cex = 2, pch = 21, frame = FALSE)
abline(olsline, lwd = 2)
for (i in 1:n) lines(c(x[i], x[i]), c(y[i], yhat[i]), col = "red", lwd = 2)

[Figure: the same scatter plot with the fitted line and the residuals drawn as red vertical segments from each point to the line]
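As a small addition (not in the original answer), the residual sum of squares itself is then just the sum of these squared residuals:

rss <- sum(e^2)   # equivalently sum(resid(olsline)^2), or deviance(olsline)
rss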

Hopefully these visualizations make the relationship between RSS and OLS clear.

$\endgroup$
1
$\begingroup$

In a way, OLS is a method for estimating the regression line from the training data, while RSS is a quantity that measures how accurately the model fits, and it can be computed on both training and testing data.
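A minimal sketch of that idea (toy data and an arbitrary train/test split, purely illustrative):

set.seed(42)
d <- data.frame(x = runif(100))
d$y <- 1 + 2 * d$x + rnorm(100, sd = 0.5)
train <- 1:70
test  <- 71:100
fit <- lm(y ~ x, data = d[train, ])   # OLS fit on the training data
rss_train <- sum(resid(fit)^2)        # RSS on the training data
rss_test  <- sum((d$y[test] - predict(fit, newdata = d[test, ]))^2)  # RSS on held-out data
c(train = rss_train, test = rss_test)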

$\endgroup$
1
$\begingroup$

The mathematical quantity at the heart of both is the same, i.e.,

$$ \begin{equation} RSS(\beta) = \sum_{i = 1}^{N} (\hat{\large \epsilon_{i}})^2 = \sum_{i = 1}^{N} \; (y_{i} - \sum^{p}_{j = 0}x_{ij} \; \beta_{j})^2 \end{equation} $$

The difference,

RSS: it is a loss function.
OLS: it fits a linear regression by choosing the coefficients that minimize that loss function, and then uses them to predict the dependent variable.

$$ \begin{equation} \hat{Y} = \underbrace{ \sum_{j=0}^{p} X_j \hat{\beta_{j}} }_{\textsf{ Linear Equation}} \quad\textsf{where;}\hspace{1mm} X_{0} = 1 \end{equation} $$

Note: I am including the intercept (bias) term in the equation via $X_0 = 1$, so the model can be written as a single linear combination rather than as a separate affine term.

OLS first estimates the parameters ($\beta$) using linear algebra: set the first derivative of the RSS to zero and check the second-order condition. The result is

$$ \begin{equation} \hat{\beta} = (X^TX)^{-1}X^Ty \end{equation} $$

The predicted values are then

$$ \begin{equation} \hat{y} = X\hat{\beta} = X(X^TX)^{-1} X^Ty \end{equation} $$
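A quick sketch (toy data assumed, not from the original answer) showing that the closed-form estimate above matches what lm() computes:

set.seed(7)
n  <- 30
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - x2 + rnorm(n)
X  <- cbind(1, x1, x2)                     # column of ones gives X_0 = 1 (the intercept)
beta_hat <- solve(t(X) %*% X, t(X) %*% y)  # (X^T X)^{-1} X^T y
y_hat    <- X %*% beta_hat                 # fitted values
drop(beta_hat)                             # manual OLS estimate
coef(lm(y ~ x1 + x2))                      # lm()'s estimate: the same values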

Note that OLS differs from the broader term "least squares" (which covers both linear and non-linear models), so be careful when reading other resources.

$\endgroup$
1
  • $\begingroup$ Note: OLS is a closed-form method whereas gradient descent is an iterative method. $\endgroup$ Commented Jan 1 at 14:20
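To illustrate that comment (toy data assumed; a sketch, not part of the original thread): gradient descent on the RSS iterates toward the same solution the closed-form OLS formula gives in one step.

set.seed(3)
x <- rnorm(100)
y <- 4 + 0.5 * x + rnorm(100)
X <- cbind(1, x)
beta_cf <- drop(solve(t(X) %*% X, t(X) %*% y))     # closed form: (X^T X)^{-1} X^T y
beta_gd <- c(0, 0)                                 # iterative: start from zero
lr <- 0.001                                        # small step size so the iteration converges
for (i in 1:1000) {
  grad <- -2 * drop(t(X) %*% (y - X %*% beta_gd))  # gradient of the RSS
  beta_gd <- beta_gd - lr * grad
}
cbind(closed_form = beta_cf, gradient_descent = beta_gd)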
