Can I utilize Ridge Regression to update coefficients of a Linear Regression model for a new dataset?

Question

I have fitted a Linear Regression Model using one dataset. Now, I have another smaller dataset that I want to refine the model with. Can I use Ridge regression to update the estimated coefficients for this new dataset? Or do you recommend a more appropriate approach?

Edit: After the useful answer by John Madden, I implemented his approach in Python as follows. Do you see any issue with how this code reflects that logic?

```python
import numpy as np
from sklearn.linear_model import Ridge

# Step 1: Compute coefficients on dataset (X1, y1)
beta1 = np.linalg.pinv(X1) @ y1
print("Original coefficients:", beta1)

# Step 2: Compute residuals on dataset (X2, y2)
y_hat2 = X2 @ beta1
residual2 = y2 - y_hat2

# Step 3: Perform ridge regression on residuals
ridge_reg = Ridge(alpha=alpha)
ridge_reg.fit(X2, residual2)
delta = ridge_reg.coef_

# Step 4: Adjust coefficients
beta2 = beta1 + delta
print("Adjusted coefficients:", beta2)
```

Do you mean that you want to know what the regression on both data sets combined would have been, had you fitted it on all the data when you had the chance?
– Dave, Commented Apr 16, 2024 at 21:14
I want to update the coefficients of the first linear regression model using dataset 2.
– Adham Enaya, Commented Apr 16, 2024 at 21:16
That is, using the coefficients from the first model as the starting point of the second regression.
– Adham Enaya, Commented Apr 16, 2024 at 21:20
@John The question does not refer to "transfer" explicitly: it uses the phrase "refine the model with."
– whuber ♦, Commented Apr 17, 2024 at 14:47

Nathan Wycoff · Accepted Answer · 2024-04-17 02:44:01Z

I think that what you mean is that you have some coefficients estimated from a first dataset $(\mathbf{X}_1,\mathbf{y}_1)$ denoted as $\hat{\beta}_1$. Then, for dataset $(\mathbf{X}_2, \mathbf{y}_2)$, you want to do something like ridge regression, but instead of penalizing the $\ell_2$ norm of $\hat\beta_2$, you want to penalize the deviation between $\hat\beta_2$ and $\hat\beta_1$.

Or mathematically, $$ \underset{\beta}{\min} \Vert \mathbf{y}_2-\mathbf{X}_2\beta\Vert_2^2 + \lambda\Vert\beta-\hat\beta_1\Vert_2^2 \, . $$

If that's right, you can accomplish this by substituting $\delta = \beta - \hat\beta_1$ and noting that:
$$
\begin{aligned}
&\underset{\beta}{\min} \Vert \mathbf{y}_2-\mathbf{X}_2\beta\Vert_2^2 + \lambda\Vert\beta-\hat\beta_1\Vert_2^2 \\
\iff{} &\underset{\delta}{\min} \Vert \mathbf{y}_2-\mathbf{X}_2(\hat\beta_1+\delta)\Vert_2^2 + \lambda\Vert\delta\Vert_2^2 \\
\iff{} &\underset{\delta}{\min} \Vert (\mathbf{y}_2-\mathbf{X}_2\hat\beta_1) -\mathbf{X}_2\delta\Vert_2^2 + \lambda\Vert\delta\Vert_2^2
\end{aligned}
$$
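As a worked check, assuming the standard closed form for ridge regression with no separate intercept (the answer itself does not write this out), the last problem is an ordinary ridge regression of the residuals on $\mathbf{X}_2$. Its solution and the resulting updated coefficients are:

$$
\hat\delta = (\mathbf{X}_2^\top\mathbf{X}_2 + \lambda\mathbf{I})^{-1}\mathbf{X}_2^\top(\mathbf{y}_2-\mathbf{X}_2\hat\beta_1),
\qquad
\hat\beta_2 = \hat\beta_1 + \hat\delta = (\mathbf{X}_2^\top\mathbf{X}_2 + \lambda\mathbf{I})^{-1}(\mathbf{X}_2^\top\mathbf{y}_2 + \lambda\hat\beta_1) .
$$

The second equality follows from writing $\hat\beta_1 = (\mathbf{X}_2^\top\mathbf{X}_2 + \lambda\mathbf{I})^{-1}(\mathbf{X}_2^\top\mathbf{X}_2 + \lambda\mathbf{I})\hat\beta_1$ and cancelling the $\mathbf{X}_2^\top\mathbf{X}_2\hat\beta_1$ terms. As $\lambda\to\infty$ the estimate stays at $\hat\beta_1$, and as $\lambda\to 0$ it approaches the least-squares fit on dataset 2 alone (when $\mathbf{X}_2^\top\mathbf{X}_2$ is invertible).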

This suggests the following procedure:

1. Compute an estimate of $\hat\beta_1$ on the dataset $(\mathbf{X}_1,\mathbf{y}_1)$ using a procedure of your choice.
2. Compute the residuals using the coefficients learnt from $(\mathbf{X}_1,\mathbf{y}_1)$ as $\tilde{\mathbf{y}}_2 = \mathbf{y}_2 - \mathbf{X}_2\hat\beta_1$.
3. Do a standard ridge regression on $\tilde{\mathbf{y}}_2$ against $\mathbf{X}_2$ to get deviation coefficients $\delta$.
4. Set $\hat\beta_2 = \hat\beta_1 + \delta$.
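The four steps can be sketched in Python roughly as follows. This is a minimal illustration on synthetic data, not a definitive implementation: the helper name `refine_coefficients`, the simulated datasets, and the value of `alpha` are all assumptions made for the example. One detail worth noting is `fit_intercept=False`: scikit-learn's `Ridge` fits a separate, unpenalized intercept by default, which would not be captured in $\delta$. If an intercept is wanted, one option under this setup is to include a column of ones in the design matrices, in which case it is penalized like any other coefficient.

```python
import numpy as np
from sklearn.linear_model import Ridge


def refine_coefficients(X2, y2, beta1, alpha):
    """Shrink the dataset-2 fit toward beta1 via ridge regression on residuals."""
    residual2 = y2 - X2 @ beta1                      # step 2: residuals under the old coefficients
    # No separate intercept, so every entry of delta is penalized as in the objective.
    ridge = Ridge(alpha=alpha, fit_intercept=False)
    ridge.fit(X2, residual2)                         # step 3: ridge regression on the residuals
    return beta1 + ridge.coef_                       # step 4: shift the old coefficients by delta


# Synthetic data (purely illustrative): a large first dataset and a small second one
rng = np.random.default_rng(0)
p = 3
X1 = rng.normal(size=(200, p))
y1 = X1 @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)
X2 = rng.normal(size=(20, p))
y2 = X2 @ np.array([1.2, -1.8, 0.5]) + 0.1 * rng.normal(size=20)

beta1 = np.linalg.lstsq(X1, y1, rcond=None)[0]       # step 1: least squares on dataset 1
alpha = 5.0                                          # penalty strength (lambda), chosen arbitrarily
beta2 = refine_coefficients(X2, y2, beta1, alpha)

# Direct solution of min ||y2 - X2 b||^2 + alpha * ||b - beta1||^2, for comparison
beta2_closed = np.linalg.solve(X2.T @ X2 + alpha * np.eye(p), X2.T @ y2 + alpha * beta1)
print("Residual-ridge estimate:", beta2)
print("Closed-form estimate:   ", beta2_closed)
```

The two printed vectors should agree up to numerical precision. In practice `alpha` would typically be chosen by cross-validation on dataset 2 rather than fixed by hand.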

For the properties of such a procedure in the context of Lasso rather than Ridge regression, see <Li 2022>. Regrettably, I don't know of a corresponding analysis for the Ridge regression case.

Thanks a lot. I think your solution is similar to this one: adapt-python.github.io/adapt/generated/…
– Adham Enaya, Commented Apr 16, 2024 at 21:48
Thanks. I have implemented your logic in Python; could you please have a look?
– Adham Enaya, Commented Apr 17, 2024 at 9:02
I have a follow-up question, please. My regression process is GLM-based. It is clear what I have to do in steps 1 and 2 of your proposed procedure, but I am not sure about step 3: should I use ordinary ridge regression, or a regularized version of my GLM?
– Adham Enaya, Commented Apr 24, 2024 at 12:20
@AdhamEnaya Good question. Ideally, we would use a regularized version of your GLM. The hard part is that the residuals from dataset 2 may no longer follow the appropriate distribution for your GLM. This paper discusses the situation: <arxiv.org/pdf/2105.14328.pdf>, but I think implementing it will require writing custom GLM code that allows for nonstandard regularization.
– Nathan Wycoff, Commented Apr 24, 2024 at 13:37
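To make the "custom GLM code" idea concrete, here is one possible sketch that skips the residual step and instead penalizes the distance to the old coefficients directly inside the GLM likelihood. It is not the method from the linked paper. The Poisson family with a log link is an assumption chosen for illustration (the thread does not say which GLM is in use), and the function name `fit_poisson_shrunk_toward`, the simulated data, and the reference coefficients are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def fit_poisson_shrunk_toward(X, y, beta_ref, lam):
    """Poisson GLM (log link) with penalty lam * ||beta - beta_ref||^2."""
    beta_ref = np.asarray(beta_ref, dtype=float)

    def objective(beta):
        eta = X @ beta                               # linear predictor
        mu = np.exp(eta)                             # Poisson mean under the log link
        diff = beta - beta_ref
        # Negative log-likelihood (up to a constant) plus the shrinkage penalty
        value = np.sum(mu - y * eta) + lam * diff @ diff
        grad = X.T @ (mu - y) + 2.0 * lam * diff
        return value, grad

    # Start from the old coefficients; jac=True means objective returns (value, gradient)
    result = minimize(objective, x0=beta_ref.copy(), jac=True, method="L-BFGS-B")
    return result.x


# Synthetic second dataset (illustrative): intercept column plus two features
rng = np.random.default_rng(1)
X2 = np.column_stack([np.ones(40), rng.normal(size=(40, 2))])
y2 = rng.poisson(np.exp(X2 @ np.array([0.8, 0.1, -0.2])))
beta1 = np.array([0.5, 0.3, -0.2])                   # hypothetical coefficients from dataset 1

beta2_weak = fit_poisson_shrunk_toward(X2, y2, beta1, lam=0.01)     # close to the plain GLM fit
beta2_strong = fit_poisson_shrunk_toward(X2, y2, beta1, lam=100.0)  # pulled toward beta1
print("Weak penalty:  ", beta2_weak)
print("Strong penalty:", beta2_strong)
```

Because the penalty is applied to the coefficients rather than to residuals, the response keeps its original distribution, which sidesteps the concern raised in the comment above. Note that in this sketch the intercept is penalized along with the other coefficients.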
