How to compare a log-log regression model with a support vector machine (SVM) model?

Question

I have developed a log-log model which gives me an RMSE of 0.1. I want to compare the results with an SVM model. In the SVM I didn't initially use the log-transformed variables. The RMSE from the non-transformed predictors is 3.9.

If I am to compare the two models, should I use the transformed variables in the SVM and then compare that RMSE with that of the linear model, or is there a way to back-transform the RMSE from the linear model to compare it with the SVM model?

Regards

Are you saying that you are trying to predict y, and that you trained a linear model with log(y) as the output and an SVM model with y as the output?
– Romain, Commented May 1, 2016 at 23:50
@Romain: Yes. For the linear model I transformed both the response and the predictors because of non-constant variance, but since those assumptions don't apply to SVM, I modeled it using the original predictors rather than the transformed ones.
– Raj, Commented May 2, 2016 at 0:09

Romain · Accepted Answer · 2016-05-02 13:19:01Z

Let's consider a classic ML problem: $X_{train}$ (the data for training), $y_{train}$ (the response for training), $X_{test}$ (the data for testing), $y_{test}$ (the response for testing).

You are using two models, linear regression ($LinReg$) and the $SVM$, and you train them in the following way:

Linear Regression:

Transform some variables: $X_{train,transform} = f(X_{train})$

Then train: $\log(y_{train}) = LinReg(X_{train,transform})$

SVM:

Train: $y_{train} = SVM(X_{train})$

To predict, you go through the same steps:

Linear Regression:

Transform using the previous transformation $f$: $X_{test,transform} = f(X_{test})$

Then get the predictions on the original scale: $\hat{y}_{test} = \exp(LinReg(X_{test,transform}))$

SVM:

Predict: $\hat{y}_{test} = SVM(X_{test})$

If you want to compare the two models, you can use either a log-scale or an original-scale metric, as long as both models are evaluated on the same scale. Without the log:

$RMSE^{SVM} = \|\hat{y}_{test} - y_{test}\|/\sqrt{n} = \|SVM(X_{test}) - y_{test}\|/\sqrt{n}$
$RMSE^{Reg} = \|\hat{y}_{test} - y_{test}\|/\sqrt{n} = \|\exp(LinReg(X_{test,transform})) - y_{test}\|/\sqrt{n}$

With the log:

$RMSE^{SVM} = \|\log(\hat{y}_{test}) - \log(y_{test})\|/\sqrt{n} = \|\log(SVM(X_{test})) - \log(y_{test})\|/\sqrt{n}$
$RMSE^{Reg} = \|\log(\hat{y}_{test}) - \log(y_{test})\|/\sqrt{n} = \|LinReg(X_{test,transform}) - \log(y_{test})\|/\sqrt{n}$

Here $n$ is the number of points in the testing set and $\|\cdot\|$ is the Euclidean norm.
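As a rough illustration of evaluating both models on the same scale, here is a minimal sketch in plain Python. The numbers are made up for illustration, and the names `svm_pred` and `linreg_log_pred` are hypothetical stand-ins for the outputs of an already fitted SVM (which predicts $y$) and an already fitted linear model (which predicts $\log(y)$).

```python
import math

def rmse(pred, actual):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / n)

# Hypothetical test-set values, invented for this example only.
y_test = [10.0, 20.0, 40.0]
svm_pred = [12.0, 18.0, 41.0]  # the SVM predicts y directly
linreg_log_pred = [math.log(11.0), math.log(19.0), math.log(44.0)]  # LinReg predicts log(y)

# Original scale: back-transform the linear model's output with exp.
linreg_pred = [math.exp(v) for v in linreg_log_pred]
rmse_svm = rmse(svm_pred, y_test)
rmse_reg = rmse(linreg_pred, y_test)

# Log scale: take the log of the SVM output; the LinReg output is already in logs.
log_y_test = [math.log(v) for v in y_test]
rmse_svm_log = rmse([math.log(v) for v in svm_pred], log_y_test)
rmse_reg_log = rmse(linreg_log_pred, log_y_test)

print(rmse_svm, rmse_reg)          # both on the original scale
print(rmse_svm_log, rmse_reg_log)  # both on the log scale
```

Only numbers within the same printed pair are comparable; an original-scale RMSE should not be set against a log-scale one.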

Finally, if you want, you can also recompute the training RMSE (with or without the log) by replacing $test$ with $train$ in the equations above. I hope this answers your question.

Just wanted to point out that the above is incorrect. Taking the log of a transformed value and an untransformed value does not make the two values comparable. See my answer below — kalidurge
– kalidurge, Commented Jan 4, 2019 at 22:31

Ferdi · Accepted Answer · 2018-10-22 16:22:10Z

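$\hat{y}$ and $y_{test}$ should be back-transformed ("un-logged") for the RMSE of the log-log regression model to be comparable to that of the SVM:

MSE (log-log model) $= \frac{1}{n}\sum_{i=1}^{n} \left(e^{\hat{y}_i} - e^{y_{i,test}}\right)^{2}$

RMSE (log-log model) $= \sqrt{MSE}$

Here $\hat{y}_i$ and $y_{i,test}$ are the predicted and observed responses on the log scale, and $n$ is the number of test points. Compute the RMSE for the SVM as normal.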
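To see why a log-scale RMSE cannot be read against an original-scale one, here is a small sketch with invented numbers. It assumes a hypothetical log-log model whose predictions are off by exactly 0.1 on the log scale for every test point; the variable names are illustrative only.

```python
import math

def rmse(pred, actual):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / n)

# Hypothetical observed responses on the log scale (y = 10 and y = 100).
log_y_test = [math.log(10.0), math.log(100.0)]
# Hypothetical predictions: each one is 0.1 too high on the log scale.
log_y_hat = [v + 0.1 for v in log_y_test]

# RMSE as reported by the log-log model (log scale).
rmse_log_scale = rmse(log_y_hat, log_y_test)

# Back-transform both predictions and observations with exp, then recompute.
rmse_original_scale = rmse(
    [math.exp(v) for v in log_y_hat],
    [math.exp(v) for v in log_y_test],
)

print(rmse_log_scale)       # error in log units
print(rmse_original_scale)  # error in the units of y, comparable to the SVM's RMSE
```

The same log-scale error translates into a much larger error in the units of $y$, and the size of that gap depends on the magnitude of the responses, so only the back-transformed value should be compared with the SVM's RMSE.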
