$\begingroup$

I have a question about a result in Section 2.1 (pages 23-24) of Grace Wahba's book, Spline Models for Observational Data.

The setup is as follows. We consider the data model

$$y_i = f\left(\frac i n \right) + \epsilon_i, \quad \text{for } i=1,\ldots, n.$$ We assume $f$ belongs to a Sobolev space with periodic boundary conditions. The author seeks an estimate of $f$ of the form

$$f_\lambda(t) = a_0 + \sum_{k=1}^{n/2 -1} \left( a_k \sqrt{2} \cos(2 \pi kt) + b_k\sqrt{2} \sin (2 \pi k t) \right) + a_{n/2} \cos (\pi n t)$$

minimizing the following penalized criterion

$$\frac 1n \sum_{i=1}^n \left(y_i - f \left( \frac i n \right)\right)^2 + \lambda \int_0^1 (f^{(m)} (u))^2 du$$ for $\lambda >0$ and a nonnegative integer $m$.

The author then derives optimal coefficient values. My question pertains to only two of them, so I will ignore the rest. Let $k = 1, \ldots, n/2 - 1$. Then:

$$a_k = \frac{\sqrt{2}}{n} \sum_{i=1}^n \cos\left(2 \pi k \frac i n\right) f \left( \frac i n\right)$$ $$b_k = \frac{\sqrt{2}}{n} \sum_{i=1}^n \sin\left(2 \pi k \frac i n\right) f \left( \frac i n\right).$$

Next, the author defines estimates of these coefficients by replacing $f$ with the data:

$$\hat{a}_k = \frac{\sqrt{2}}{n} \sum_{i=1}^n \cos\left(2 \pi k \frac i n\right)y_i$$ $$\hat{b}_k = \frac{\sqrt{2}}{n} \sum_{i=1}^n \sin\left(2 \pi k \frac i n\right) y_i.$$
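For concreteness, here is a minimal Python sketch of these two sums (my own R code is not shown here). The function name `fourier_coefficients` and the test signal are assumptions made for illustration only, not something taken from the book.

```python
import math

def fourier_coefficients(y, k):
    """Return (a_hat_k, b_hat_k) for data y_1..y_n observed at t_i = i/n."""
    n = len(y)
    scale = math.sqrt(2.0) / n
    # y[i - 1] holds y_i because Python lists are 0-indexed.
    a_hat = scale * sum(math.cos(2 * math.pi * k * i / n) * y[i - 1]
                        for i in range(1, n + 1))
    b_hat = scale * sum(math.sin(2 * math.pi * k * i / n) * y[i - 1]
                        for i in range(1, n + 1))
    return a_hat, b_hat

# Hypothetical noise-free signal: a single basis function sqrt(2) * cos(2*pi*3*t).
n = 16
y = [math.sqrt(2.0) * math.cos(2 * math.pi * 3 * i / n) for i in range(1, n + 1)]
a3, b3 = fourier_coefficients(y, 3)
a2, b2 = fourier_coefficients(y, 2)
```

With this input, `a3` should come out as 1 and the other three values as 0 up to rounding, which is what one would expect if the coefficients are defined relative to the basis functions $\sqrt{2}\cos(2\pi k t)$ and $\sqrt{2}\sin(2\pi k t)$.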

The author redefines the minimization criterion (in a step I don't fully understand myself), and then claims the minimizing values are of the form

$$a_k = \hat{a}_k / (1 + \lambda (2 \pi k)^{2m})$$ $$b_k = \hat{b}_k / (1 + \lambda (2 \pi k)^{2m})$$
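One tentative way to read this step, as a sketch and not a quote from the book: assume the functions $1$, $\sqrt{2}\cos(2\pi k t)$, $\sqrt{2}\sin(2\pi k t)$ and $\cos(\pi n t)$ are orthonormal under the discrete inner product $\frac 1n \sum_{i=1}^n g\left(\frac i n\right) h\left(\frac i n\right)$. Under that assumption, and using the orthonormality of $\sqrt{2}\cos(2\pi k t)$ and $\sqrt{2}\sin(2\pi k t)$ in $L^2[0,1]$ for the penalty, the part of the criterion involving $a_k, b_k$ for $k = 1, \ldots, n/2 - 1$ would be

$$\sum_{k=1}^{n/2-1} \left[ (\hat{a}_k - a_k)^2 + (\hat{b}_k - b_k)^2 + \lambda (2 \pi k)^{2m} \left(a_k^2 + b_k^2\right) \right].$$

Setting the derivative with respect to $a_k$ to zero gives

$$-2(\hat{a}_k - a_k) + 2 \lambda (2 \pi k)^{2m} a_k = 0 \quad \Longrightarrow \quad a_k = \frac{\hat{a}_k}{1 + \lambda (2 \pi k)^{2m}},$$

and the same argument applies to $b_k$.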

The estimated function with penalty $\lambda$ is then (ignoring the other terms, but otherwise written exactly as it appears in the text):

$$f_\lambda (t) \propto \sum_{k=1}^{n/2 -1}\left( \frac{\hat{a}_k}{1 + \lambda (2 \pi k)^{2m}} \cos(2 \pi kt) + \frac{\hat{b}_k}{1 + \lambda (2 \pi k)^{2m}}\sqrt{2} \sin (2 \pi k t) \right) .$$

Since this is a highly cited text with seemingly no errata anywhere, I wanted to ask: is there a factor of $\sqrt{2}$ missing in the final expression for $f_{\lambda}(t)$? Such a factor appears in the definition of the solution form, but disappears in the final expression.

I implemented this procedure in R and found that the extra $\sqrt{2}$ which appears missing is necessary for a good fit, and without it, the amplitude does not match the data.
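The check can be sketched in Python as follows. This is an illustrative reconstruction, not my original R code: the helper names `coef` and `fit`, the noise-free test signal, and the choice $\lambda = 0$ (so that no shrinkage hides the amplitude issue) are all assumptions. Only the $k = 1, \ldots, n/2 - 1$ terms are included, so the test signal has zero mean and no component at frequency $n/2$.

```python
import math

def coef(y, k):
    """Estimated coefficients (a_hat_k, b_hat_k) from data at t_i = i/n."""
    n = len(y)
    s = math.sqrt(2.0) / n
    a = s * sum(math.cos(2 * math.pi * k * i / n) * y[i - 1] for i in range(1, n + 1))
    b = s * sum(math.sin(2 * math.pi * k * i / n) * y[i - 1] for i in range(1, n + 1))
    return a, b

def fit(y, t, lam, m, basis_scale):
    """Sum over k = 1..n/2-1 of the shrunken terms, each multiplied by basis_scale."""
    n = len(y)
    total = 0.0
    for k in range(1, n // 2):
        a_hat, b_hat = coef(y, k)
        shrink = 1.0 + lam * (2 * math.pi * k) ** (2 * m)
        wave = a_hat * math.cos(2 * math.pi * k * t) + b_hat * math.sin(2 * math.pi * k * t)
        total += basis_scale * wave / shrink
    return total

n = 32
grid = [i / n for i in range(1, n + 1)]
# Hypothetical noise-free data with zero mean.
y = [math.sin(2 * math.pi * t) + 0.5 * math.cos(4 * math.pi * t) for t in grid]

with_sqrt2 = [fit(y, t, 0.0, 2, math.sqrt(2.0)) for t in grid]
without_sqrt2 = [fit(y, t, 0.0, 2, 1.0) for t in grid]

# Largest pointwise error when sqrt(2) multiplies both the cos and sin terms.
err_with = max(abs(u - v) for u, v in zip(with_sqrt2, y))
# Amplitude of the fit relative to the data when sqrt(2) is dropped from both.
ratio_without = max(abs(v) for v in without_sqrt2) / max(abs(v) for v in y)
```

In this setting, keeping $\sqrt{2}$ on both terms reproduces the data at the grid points up to rounding, while dropping it yields a fit whose amplitude is $1/\sqrt{2}$ times that of the data.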

$\endgroup$
