Chain Rule to Compute Second Derivative

Question

I was going through Marsden's book, Elementary Classical Analysis, and came across the following exercise in Chapter 6. It reads as follows:

If $f: A \subset \mathbb{R}^n \to \mathbb{R}^m$ and $g: B \subset \mathbb{R}^m \to \mathbb{R}^p$, show that \begin{align*} D^2(g \circ f)(x_0)(x, y) &= D^2 g(f(x_0)) (Df(x_0) \cdot x, Df(x_0) \cdot y) \\ &+\; Dg(f(x_0)) \cdot D^2f(x_0)(x, y). \end{align*}
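As a sanity check that is not part of the exercise statement, take $n = m = p = 1$. Then $D^2 h(x_0)(x, y) = h''(x_0)\,xy$ for any twice differentiable $h$, and after cancelling $xy$ the identity should reduce to the familiar single-variable formula, in which the sum already appears from differentiating the product $g'(f(x))\,f'(x)$:

$$ (g \circ f)''(x_0) = g''(f(x_0))\, f'(x_0)^2 + g'(f(x_0))\, f''(x_0). $$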

I found this question concerning the same exercise, but my problem was not answered there. I know that I am supposed to apply the chain rule twice to compute this result. What I do not understand is why the result involves a sum at all. How is the use of the product rule justified here?

If I apply the chain rule once, I get $$ D(g \circ f)(x_0) = Dg(f(x_0)) \circ Df(x_0),$$ where $Df : A \to L(\mathbb{R}^n, \mathbb{R}^m)$ and $Dg : B \to L(\mathbb{R}^m, \mathbb{R}^p)$. Clearly this is the composition of two linear transformations. But neither the product rule (introduced in the text to differentiate $gf$, where $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ and $g: A \to \mathbb{R}$) nor the chain rule applies directly here.

I know that we can view this equation in terms of matrix multiplication for suitably-chosen bases. But how can I differentiate the composition of linear transformations as written above? Can I view the composition of these linear operations as a bilinear form, and apply the generalized product rule to differentiate this bilinear form?

I give this as an exercise in my manifold theory course, too. The lecture notes are up on my webpage if you'd like to see the set-up. The key idea is to have two frameworks for the derivative: (1) think of the derivative as a map of tangent bundles, and (2) think of it as a matrix that depends on points in $A$. Then express the second derivative (tangent bundle formulation) in terms of the classical Hessian. From there the result pops out of two applications of the chain rule. $D^2(g \circ f) = D(Dg \circ Df) = D^2g \circ D^2f$.
– Ryan Budney, Commented Apr 19, 2014 at 20:22

Community · Accepted Answer · 2017-04-13 12:20:47Z

2

The composition of linear functions is bilinear: $$R(S+T)=RS+RT,$$ $$(S+T)R=SR+TR,$$ $$\cdots$$ See Derivative Bilinear map.
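A sketch of how bilinearity leads to the formula in the exercise, assuming $f$ and $g$ are twice differentiable. The names $C$, $F$, and $G$ are introduced here for illustration only. Let $C(S, R) = R \circ S$ for $S \in L(\mathbb{R}^n, \mathbb{R}^m)$ and $R \in L(\mathbb{R}^m, \mathbb{R}^p)$, and write $F = Df$ and $G = Dg \circ f$. The chain rule gives the first line, the product rule for the bilinear map $C$ gives the second, and a second use of the chain rule on $G$ gives the third:

\begin{align*} D(g \circ f)(x) &= C\bigl(F(x), G(x)\bigr), \\ D\bigl[C(F, G)\bigr](x_0) \cdot x &= C\bigl(DF(x_0) \cdot x,\; G(x_0)\bigr) + C\bigl(F(x_0),\; DG(x_0) \cdot x\bigr), \\ DF(x_0) \cdot x &= D^2 f(x_0)(x, \cdot), \qquad DG(x_0) \cdot x = D^2 g(f(x_0))(Df(x_0) \cdot x, \cdot). \end{align*}

Evaluating the resulting linear map at $y$ then gives

\begin{align*} D^2(g \circ f)(x_0)(x, y) &= D^2 g(f(x_0)) (Df(x_0) \cdot x, Df(x_0) \cdot y) \\ &+\; Dg(f(x_0)) \cdot D^2 f(x_0)(x, y). \end{align*}

The two summands correspond to the two slots of $C$: one comes from varying the outer factor $Dg(f(x))$, the other from varying the inner factor $Df(x)$.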

edited Apr 13, 2017 at 12:20

CommunityBot


answered Apr 19, 2014 at 20:28

Martín-Blas Pérez Pinilla


I see, this shows that the composition function $C: L(U, V) \times L(V, W) \to L(U, W)$ is bilinear. Therefore we can think of the composition of two linear functions as a bilinear form, and differentiate it using the generalized product rule to obtain the desired result.

– void-pointer, Commented Apr 19, 2014 at 23:28
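The identity can also be checked numerically. The sketch below is only an illustration: the maps `f` and `g`, the base point, and the directions are arbitrary choices that do not come from the exercise, and the derivatives are coded by hand. It compares the right-hand side of the formula with a central finite-difference estimate of the mixed second directional derivative of $g \circ f$.

```python
import math

# Hypothetical example maps (not from the exercise): f: R^2 -> R^2, g: R^2 -> R.
def f(p):
    u, v = p
    return (u * v, math.sin(u) + v * v)

def g(q):
    a, b = q
    return a * a * b + math.exp(b)

def Df(p, x):
    """Df(p) applied to the vector x."""
    u, v = p
    return (v * x[0] + u * x[1], math.cos(u) * x[0] + 2.0 * v * x[1])

def D2f(p, x, y):
    """D^2 f(p)(x, y), computed component by component from the Hessians."""
    u, v = p
    return (x[0] * y[1] + x[1] * y[0],
            -math.sin(u) * x[0] * y[0] + 2.0 * x[1] * y[1])

def Dg(q, s):
    """Dg(q) applied to the vector s."""
    a, b = q
    return 2.0 * a * b * s[0] + (a * a + math.exp(b)) * s[1]

def D2g(q, s, t):
    """D^2 g(q)(s, t), using the Hessian entries 2b, 2a, exp(b)."""
    a, b = q
    return (2.0 * b * s[0] * t[0]
            + 2.0 * a * (s[0] * t[1] + s[1] * t[0])
            + math.exp(b) * s[1] * t[1])

def formula_rhs(p, x, y):
    """Right-hand side of the second-order chain rule."""
    q = f(p)
    return D2g(q, Df(p, x), Df(p, y)) + Dg(q, D2f(p, x, y))

def finite_difference_lhs(p, x, y, eps=1e-4):
    """Central-difference estimate of D^2(g o f)(p)(x, y)."""
    def h(s, t):
        point = (p[0] + s * x[0] + t * y[0], p[1] + s * x[1] + t * y[1])
        return g(f(point))
    return (h(eps, eps) - h(eps, -eps) - h(-eps, eps) + h(-eps, -eps)) / (4.0 * eps * eps)

x0 = (0.3, -0.7)   # arbitrary base point
x = (1.0, 2.0)     # arbitrary directions
y = (-0.5, 0.4)
lhs = finite_difference_lhs(x0, x, y)
rhs = formula_rhs(x0, x, y)
print(lhs, rhs)
```

The two printed numbers should agree up to finite-difference error. Dropping the `Dg(q, D2f(p, x, y))` term from `formula_rhs` breaks the agreement, which is one way to see that the sum in the formula is needed.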
