# Mathematical Analysis Notes: Higher-order derivatives


We give the natural extension of the theory of differentiability to higher-order derivatives. Organizing the notation turns out to be the main part of the work.

Let $U$ be an open subset of $\mathbb{R}^n$ and consider a differentiable mapping $f:U\to \mathbb{R}^p$. Its derivative is then a mapping $Df : U \to\mathcal{L}\left(\mathbb{R}^n,\mathbb{R}^p\right)\simeq \mathbb{R}^{pn}$. If, in turn, $Df$ is differentiable, then its derivative $D^2 f := D(Df )$, the second-order derivative of $f$, is a mapping $U \to\mathcal{L}\left(\mathbb{R}^n,\mathcal{L}\left(\mathbb{R}^n,\mathbb{R}^p\right)\right)\simeq \mathbb{R}^{pn^2}$. For higher-order derivatives the notation quickly becomes rather involved. It becomes more manageable if we recall some results from linear algebra.

Definition. We denote by $\mathcal{L}^k\left(\mathbb{R}^n,\mathbb{R}^p\right)$ the linear space of all $k$-linear mappings from the $k$-fold Cartesian product $\mathbb{R}^n\times\cdots\times\mathbb{R}^n$ taking values in $\mathbb{R}^p$. Thus $T\in\mathcal{L}^k\left(\mathbb{R}^n,\mathbb{R}^p\right)$ if and only if $T:\mathbb{R}^n\times\cdots\times\mathbb{R}^n\to\mathbb{R}^p$ and $T$ is linear in each of the $k$ variables varying in $\mathbb{R}^n$, when all the other variables are held fixed.

Lemma. There exists a natural isomorphism of linear spaces
$$\mathcal{L}\left(\mathbb{R}^n,\mathcal{L}\left(\mathbb{R}^n,\mathbb{R}^p\right)\right)\simeq\mathcal{L}^2\left(\mathbb{R}^n,\mathbb{R}^p\right),$$
given by $\mathcal{L}\left(\mathbb{R}^n,\mathcal{L}\left(\mathbb{R}^n,\mathbb{R}^p\right)\right)\ni S\leftrightarrow T\in\mathcal{L}^2\left(\mathbb{R}^n,\mathbb{R}^p\right)$ if and only if $S(h_1)h_2=T(h_1,h_2)$, for $h_1,h_2\in\mathbb{R}^n$.

More generally, there exists a natural isomorphism of linear spaces
$$\mathcal{L}\left(\mathbb{R}^n,\mathcal{L}\left(\mathbb{R}^n,\cdots,\mathcal{L}\left(\mathbb{R}^n,\mathbb{R}^p\right)\cdots\right)\right)\simeq\mathcal{L}^k\left(\mathbb{R}^n,\mathbb{R}^p\right)\quad(k\in\mathbb{N}^*\setminus\{1\}),$$
with $S\leftrightarrow T$ if and only if $\left(\cdots\left(\left(Sh_1\right)h_2\right)\cdots\right)h_k=T(h_1,h_2,\cdots,h_k)$, for $h_1,\cdots,h_k\in\mathbb{R}^n$.
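
As a sketch of this correspondence in the case $k=2$, $p=1$, take a matrix $A\in\mathbb{R}^{n\times n}$ (a symbol introduced here for illustration only) and the bilinear form $T(h_1,h_2)=h_1^{\top}Ah_2$. The associated $S$ sends $h_1$ to the linear form
$$S(h_1)=\left(h_2\mapsto h_1^{\top}Ah_2\right)\in\mathcal{L}\left(\mathbb{R}^n,\mathbb{R}\right),$$
so $S(h_1)$ is represented by the row vector $h_1^{\top}A$, and $S(h_1)h_2=T(h_1,h_2)$ as required. Both spaces have dimension $n^2$ in this case, in agreement with the count $pn^2$ for $p=1$.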

Corollary. For every $T\in\mathcal{L}^k\left(\mathbb{R}^n,\mathbb{R}^p\right)$ there exists $c>0$ such that, for all $h_1,\cdots,h_k\in\mathbb{R}^n$,
$$\left\|T\left(h_1,\cdots,h_k\right)\right\|\leq c\left\|h_1\right\|\cdots\left\|h_k\right\|.$$

Proposition. If $T\in\mathcal{L}^k\left(\mathbb{R}^n,\mathbb{R}^p\right)$, then $T$ is differentiable. Furthermore, for $\left(a_1,\cdots,a_k\right)$ and $\left(h_1,\cdots,h_k\right)\in\mathbb{R}^n\times\cdots\times\mathbb{R}^n$, the derivative $DT\left(a_1,\cdots,a_k\right)\in\mathcal{L}\left(\mathbb{R}^n\times\cdots\times\mathbb{R}^n,\mathbb{R}^p\right)$ is given by
$$DT\left(a_1,\cdots,a_k\right)\left(h_1,\cdots,h_k\right)=\sum\limits_{1\leq i\leq k}T\left(a_1,\cdots,a_{i-1},h_i,a_{i+1},\cdots,a_k\right).$$
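
For $k=2$ the formula can be checked directly. The following is a sketch of the verification, using only bilinearity and the estimate from the Corollary above. Expanding by bilinearity,
$$T(a_1+h_1,a_2+h_2)-T(a_1,a_2)=T(h_1,a_2)+T(a_1,h_2)+T(h_1,h_2).$$
The first two terms on the right are linear in $(h_1,h_2)$, and the remainder satisfies $\left\|T(h_1,h_2)\right\|\leq c\left\|h_1\right\|\left\|h_2\right\|\leq c\left\|(h_1,h_2)\right\|^2$, so it is $o\left(\left\|(h_1,h_2)\right\|\right)$. As an illustration, for the inner product $T(x,y)=\langle x,y\rangle$ on $\mathbb{R}^n$ this gives $DT(a_1,a_2)(h_1,h_2)=\langle h_1,a_2\rangle+\langle a_1,h_2\rangle$.
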

Definition. Let $f:U\to\mathbb{R}^p$, with $U$ open in $\mathbb{R}^n$. Then the notation $f\in C^0\left(U,\mathbb{R}^p\right)$ means that $f$ is continuous. By induction on $k\in\mathbb{N}^*$ we say that $f$ is a $k$ times continuously differentiable mapping, written $f\in C^k\left(U,\mathbb{R}^p\right)$, if $f\in C^{k-1}\left(U,\mathbb{R}^p\right)$ and if the derivative of order $k-1$
$$D^{k-1}f:U\to\mathcal{L}^{k-1}\left(\mathbb{R}^n,\mathbb{R}^p\right)$$
is a continuously differentiable mapping. We write $f\in C^{\infty}\left(U,\mathbb{R}^p\right)$ if $f\in C^k\left(U,\mathbb{R}^p\right)$, for all $k\in\mathbb{N}^*$. If we want to consider $f\in C^k$ for $k\in\mathbb{N}^*$ or $k=\infty$, we write $f\in C^k$ with $k\in\mathbb{N}^*_\infty$. In the case where $p = 1$, we write $C^k\left(U\right)$ instead of $C^k\left(U,\mathbb{R}^p\right)$.
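
A one-variable illustration (an example chosen here, not taken from the text above): the function $f:\mathbb{R}\to\mathbb{R}$ with $f(x)=x|x|$ is differentiable everywhere, with derivative
$$f'(x)=2|x|,$$
which is continuous but not differentiable at $0$. Hence $f\in C^1\left(\mathbb{R}\right)$ while $f\notin C^2\left(\mathbb{R}\right)$, so membership in $C^k$ is a genuinely stronger condition than membership in $C^{k-1}$, at least for $k=2$.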

Rephrasing the definition above, we obtain at once that $f\in C^k\left(U,\mathbb{R}^p\right)$ if $f\in C^{k-1}\left(U,\mathbb{R}^p\right)$ and if, for all $i_1,\cdots,i_{k-1}\in\{1,\cdots,n\}$, the partial derivative of order $k-1$
$$D_{i_{k-1}}\cdots D_{i_1}f$$
belongs to $C^1\left(U,\mathbb{R}^p\right)$. The latter condition is satisfied if all the partial derivatives of order $k$ satisfy
$$D_{i_k}D_{i_{k-1}}\cdots D_{i_1}f:=D_{i_k}\left(D_{i_{k-1}}\cdots D_{i_1}f\right)\in C^0\left(U,\mathbb{R}^p\right).$$
Clearly, $f\in C^k\left(U,\mathbb{R}^p\right)$ if and only if
$$Df\in C^{k-1}\left(U,\mathcal{L}\left(\mathbb{R}^n,\mathbb{R}^p\right)\right).$$
This can subsequently be used to prove, by induction on $k$, that the composition $g\circ f$ is a $C^k$ mapping if $f$ and $g$ are $C^k$ mappings. Indeed, by the chain rule, $D(g\circ f)(a)=Dg(f(a))\circ Df(a)$ for every $a$ in the domain of $f$; that is, $D(g\circ f)$ is obtained by composing the linear mappings $((Dg)\circ f)(a)$ and $Df(a)$ pointwise. Now $(Dg)\circ f$ is a $C^{k-1}$ mapping by the induction hypothesis, being the composition of the $C^k$ (hence $C^{k-1}$) mapping $f$ with the $C^{k-1}$ mapping $Dg$. Furthermore, $Df$ is a $C^{k-1}$ mapping, and composition of linear mappings, $(A,B)\mapsto A\circ B$, is bilinear and therefore infinitely differentiable. Hence $D(g\circ f)$ is a $C^{k-1}$ mapping, and so $g\circ f$ is a $C^k$ mapping.
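
For $k=2$ the induction step can be made explicit. As a sketch, assuming $f$ and $g$ are $C^2$ and identifying second derivatives with bilinear mappings as in the Lemma above, differentiating $a\mapsto Dg(f(a))\circ Df(a)$ with the product rule for the bilinear composition map gives
$$D^2(g\circ f)(a)(h_1,h_2)=D^2g(f(a))\left(Df(a)h_1,Df(a)h_2\right)+Dg(f(a))\,D^2f(a)(h_1,h_2).$$
Both terms on the right-hand side depend continuously on $a$, which is consistent with $g\circ f$ being a $C^2$ mapping.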

Theorem. If $U\subset\mathbb{R}^n$ is open and $f\in C^2\left(U,\mathbb{R}^p\right)$, then the bilinear mapping $D^2f(a)\in\mathcal{L}^2\left(\mathbb{R}^n,\mathbb{R}^p\right)$ satisfies
$$D^2f(a)(h_1,h_2)=(D_{h_1}D_{h_2}f)(a)\quad(a\in U,\ h_1,h_2\in\mathbb{R}^n).$$
Furthermore, $D^2 f (a)$ is symmetric, that is,
$$D^2f(a)(h_1,h_2)=D^2f(a)(h_2,h_1)\quad(a\in U,\ h_1,h_2\in\mathbb{R}^n).$$
More generally, let $f\in C^k\left(U,\mathbb{R}^p\right)$, for $k\in\mathbb{N}^*$. Then $D^k f(a)\in\mathcal{L}^k\left(\mathbb{R}^n,\mathbb{R}^p\right)$ satisfies
$$D^kf(a)(h_1,\cdots,h_k)=(D_{h_1}\cdots D_{h_k}f)(a)\quad(a\in U,\ h_i\in\mathbb{R}^n,\ 1\leq i\leq k).$$
Moreover, $D^k f (a)$ is symmetric, that is, for $a\in U$ and $h_i \in \mathbb{R}^n$ where $1 \leq i \leq k$,
$$D^kf(a)(h_1,h_2,\cdots,h_k)=D^kf(a)(h_{\sigma(1)},h_{\sigma(2)},\cdots,h_{\sigma(k)}),$$
for every $\sigma\in S_k$, the permutation group on $k$ elements, which consists of bijections of the set $\{1,2,\cdots,k\}$.
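
The case $k=2$ of the theorem can be checked numerically on an example. The following is a minimal sketch in Python, not part of the original notes: the test function $f(x,y)=x^3y+\sin(xy)$ and the helper names `grad_f`, `directional`, `iterated` and `hessian_form` are assumptions made purely for illustration. It approximates $(D_{h_1}D_{h_2}f)(a)$ by a central difference of the exact directional derivative $D_{h_2}f$ along $h_1$, and compares the result with $h_1^{\top}Hh_2$, where $H$ is the hand-computed Hessian matrix of $f$ at $a$.

```python
import math


def grad_f(x, y):
    """Hand-computed gradient of the illustrative function f(x, y) = x^3 y + sin(x y)."""
    c = math.cos(x * y)
    return (3 * x**2 * y + y * c, x**3 + x * c)


def directional(h):
    """Return the function (x, y) -> (D_h f)(x, y) = <grad f(x, y), h>."""
    def d_h(x, y):
        gx, gy = grad_f(x, y)
        return gx * h[0] + gy * h[1]
    return d_h


def iterated(a, h1, h2, eps=1e-5):
    """Approximate (D_{h1} D_{h2} f)(a) by a central difference of D_{h2} f along h1."""
    inner = directional(h2)
    plus = inner(a[0] + eps * h1[0], a[1] + eps * h1[1])
    minus = inner(a[0] - eps * h1[0], a[1] - eps * h1[1])
    return (plus - minus) / (2 * eps)


def hessian_form(a, h1, h2):
    """Exact value of D^2 f(a)(h1, h2) = h1^T H h2, with H the Hessian of f at a."""
    x, y = a
    s, c = math.sin(x * y), math.cos(x * y)
    fxx = 6 * x * y - y**2 * s
    fxy = 3 * x**2 + c - x * y * s  # equals fyx, as the theorem predicts
    fyy = -x**2 * s
    return (h1[0] * (fxx * h2[0] + fxy * h2[1])
            + h1[1] * (fxy * h2[0] + fyy * h2[1]))


if __name__ == "__main__":
    a, h1, h2 = (0.7, -0.4), (1.0, 2.0), (-0.5, 0.3)
    print("D_h1 D_h2 f(a)  ~", iterated(a, h1, h2))
    print("D_h2 D_h1 f(a)  ~", iterated(a, h2, h1))
    print("h1^T H h2       =", hessian_form(a, h1, h2))
```

The two iterated derivatives are computed by genuinely different difference quotients, so their agreement up to discretization error illustrates the symmetry of $D^2f(a)$ rather than following from the construction. A finite-difference check of this kind is only a sanity test on one example and does not replace the proof.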