0
$\begingroup$

Let $f : \mathbb{R}^n \rightarrow \mathbb{R}$ continuously differentiable. We assume that there exists $L > 0$ such that \begin{equation} \|∇f(x)-∇f(x')\| \leq L\|x − x'\| \qquad \forall (x,x') \in \mathbb{R}^n \times \mathbb{R}^n. \end{equation}

Show that \begin{equation} |f(x + h) − f(x) − \langle ∇f(x), h\rangle | \leq \frac{L}{2} \|h\|^2 \qquad \forall (x, h) \in \mathbb{R}^n \times \mathbb{R}^n. \end{equation}

I do not understand part of the demonstration of which here is:

As $f$ is continuously differentiable, we have from Taylor's formula at zero order with residual in integral form \begin{equation} f (x + h) = f(x) + \int^1_0 \langle ∇f(x + th), h \rangle dt. \tag{1} \end{equation}


Me, when I apply Taylor's formula at zero order with residual in integral form, we get \begin{equation} f (x + h) = f(x) + \int_x^{x+h} ∇f(t) dt. \tag{2} \end{equation} What is the integration process that allows us to leave (2) to (1) ?

$\endgroup$
1
  • $\begingroup$ You forgot the dot product since your $t$ (and hence $\mathrm{d}t$) is $\mathbb{R}^n$-valued as it goes from $x$ to $x+h$. Now write $\vec{t}=\vec{x}+\tau\vec{h}$ for the integral. $\endgroup$ Commented Dec 23, 2020 at 15:27

2 Answers 2

0
$\begingroup$

By the fundamental theorem of calculus \begin{aligned} f(\mathbf{x}+\mathbf{h}) - f(\mathbf{x}) &= \int_0^1\nabla f(\mathbf{x}+t\mathbf{h})^T\mathbf{h}\,dt \\ &=\nabla f(\mathbf{x})^T\mathbf{h}+ \int_0^1\big(\nabla f(\mathbf{x}+t\mathbf{h}) - \nabla f(\mathbf{x})\big)^T\mathbf{h}\,dt \end{aligned} and therefore \begin{aligned} |f(\mathbf{x}+\mathbf{h}) - f(\mathbf{x})-\nabla f(\mathbf{x})^T\mathbf{h}| &=\Big|\int_0^1\big(\nabla f(\mathbf{x}+t\mathbf{h}) - \nabla f(\mathbf{x})\big)^T\mathbf{h}\,dt \Big|\\ &\leq \int_0^1\big|\big(\nabla f(\mathbf{x}+t\mathbf{h}) - \nabla f(\mathbf{x})\big)^T\mathbf{h}\big|\,dt \\ &\overset{\text{C.S.}}\leq \int_0^1\|\big(\nabla f(\mathbf{x}+t\mathbf{h}) - \nabla f(\mathbf{x})\big)\|\|\mathbf{h}\|\,dt \\ &\leq\int_0^1tL\|\mathbf{h}\|^2\,dt \\ &\leq \|\mathbf{h}\|^2 \end{aligned} where C.S. is Cauchy-Schwarz and the inequality following it is from the fact that $f$ is $L$-smooth.

For general knowledge, this inequality is known as the Descent Lemma, and it's very useful in convex analysis. Analysis of optimization algorithms strongly relies on the ability to quantify the change in function value when moving from some point $\mathbf{x}$ in a direction $\mathbf{h}$.

$\endgroup$
0
$\begingroup$

$\nabla f(t)$ isn't a number. The output of $f$ is a number, so $f(x+h)$ will surely be a number. That means we must have the thing we integrate be a number! The Taylor's theorem you use is perhaps a naive generalization of the one from single variate calculus, but does not hold in multivariate calculus. The form your text states is the correct generalization.

$\endgroup$
0

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.