5.2. Differentiation in Banach Spaces

We introduce the concept of differentiation in Banach spaces. Recall that Banach spaces are normed linear spaces that are complete.

5.2.1. Gateaux Differential

Definition 5.14 (Directional derivative)

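Let $X$ and $Y$ be Banach spaces. Let $f : X \to Y$ be a function with $S = \operatorname{dom} f$. The directional derivative of $f$ at $x \in \operatorname{int} S$ in the direction $h \in X$ where $h \neq 0$, denoted by $f'(x; h)$, is given by

$$f'(x; h) \triangleq \lim_{t \to 0^+} \frac{f(x + th) - f(x)}{t}$$

whenever the limit exists. This is also known as the Gateaux differential. By convention, $f'(x; 0_X) = 0_Y$. This is consistent with the definition above, since the difference quotient vanishes identically when $h = 0_X$.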

  • There is no single directional derivative at a point $x$.

  • The directional derivative depends on the direction $h$.

  • In one dimension, there are only two directions up to positive scaling ($h > 0$ and $h < 0$), giving the right-hand and left-hand derivatives at each $x$.

  • In two or more dimensions, there are infinitely many directions, hence infinitely many directional derivatives.

  • The directional derivative is a one dimensional calculation along the direction $h$.

  • It is usually easy to compute the directional derivative even when the space $X$ is infinite dimensional.
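Since the directional derivative is a one dimensional limit, it can be approximated by a one-sided difference quotient with a small step. The following is a minimal sketch for a scalar function; the helper name `one_sided_quotient`, the test function `square`, and the step size `t = 1e-6` are illustrative assumptions, not part of the text.

```python
def one_sided_quotient(f, x, h, t=1e-6):
    """Approximate f'(x; h) by (f(x + t*h) - f(x)) / t for a small t > 0."""
    return (f(x + t * h) - f(x)) / t


def square(x):
    return x * x


# In one dimension there are two directions up to positive scaling.
right = one_sided_quotient(square, 1.0, 1.0)   # direction h = +1
left = one_sided_quotient(square, 1.0, -1.0)   # direction h = -1
print(right, left)  # close to 2 and -2, up to O(t) truncation error
```

The two values differ because they are taken along opposite directions, which is why no single number can serve as "the" directional derivative at a point.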

Definition 5.15 (Gateaux differentiability)

Let $X$ and $Y$ be Banach spaces. Let $f : X \to Y$ be a function with $S = \operatorname{dom} f$. Let $U \subseteq S$ be an open set. We say that $f$ is Gateaux differentiable at $x \in U$ if the Gateaux differential $f'(x; h)$ exists for every direction $h \in X$.

Accordingly, we can define an operator $T_x : X \to Y$ (not necessarily linear) given by

$$T_x(h) \triangleq \lim_{t \to 0^+} \frac{f(x + th) - f(x)}{t} \quad \forall h \in X.$$

The operator $T_x$ is called the Gateaux derivative of $f$ at $x$.

Example 5.17 (Gateaux differential of exponential function)

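Let $f(x) = e^x$. Then, using $e^{th} - 1 = th + o(t)$,

$$f'(x; h) = \lim_{t \to 0^+} \frac{e^{x + th} - e^x}{t} = e^x \lim_{t \to 0^+} \frac{e^{th} - 1}{t} = e^x \lim_{t \to 0^+} \frac{th + o(t)}{t} = h e^x.$$

We note that the Gateaux derivative depends linearly on $h$.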
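As a quick numerical sanity check of this example, one can compare the difference quotient with the closed form $h e^x$ for a few shrinking step sizes. This is only a sketch; the point `x = 0.5`, the direction `h = 2.0`, and the step sizes are arbitrary choices made for illustration.

```python
import math

x, h = 0.5, 2.0
closed_form = h * math.exp(x)  # h e^x from the example

errors = []
for t in (1e-2, 1e-3, 1e-4):
    quotient = (math.exp(x + t * h) - math.exp(x)) / t
    errors.append(abs(quotient - closed_form))

print(errors)  # the gap shrinks roughly in proportion to t
```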

Theorem 5.7 (Gateaux differential nonnegative homogeneity)

The Gateaux differential of a function $f : X \to Y$ is nonnegative homogeneous in the sense that

$$f'(x; \alpha h) = \alpha f'(x; h)$$

for every $\alpha \in \mathbb{R}_+$ and every $h \in X$.

However, the Gateaux differential may not be additive. Thus, the Gateaux differential may fail to be linear.

Example 5.18 (Gateaux differential of absolute value function)

Let $f(x) = |x|$. Then, the Gateaux differentials are given by

$$f'(x; h) = \begin{cases} h \dfrac{x}{|x|}, & x \neq 0; \\ |h|, & x = 0. \end{cases}$$

We note that the Gateaux differential of $f$ exists everywhere. However, the Gateaux differential depends on $h$ in a nonlinear way at $x = 0$. At $x \neq 0$, the Gateaux differential depends linearly on $h$.
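The behaviour at $x = 0$ can be illustrated numerically: the differential of $|x|$ scales with nonnegative multiples of the direction but is not additive. The sketch below uses a hypothetical helper `one_sided_quotient` and an assumed step `t = 1e-6`.

```python
def one_sided_quotient(f, x, h, t=1e-6):
    """Approximate f'(x; h) by (f(x + t*h) - f(x)) / t for a small t > 0."""
    return (f(x + t * h) - f(x)) / t


d_pos = one_sided_quotient(abs, 0.0, 1.0)     # approximates f'(0; 1)
d_neg = one_sided_quotient(abs, 0.0, -1.0)    # approximates f'(0; -1)
d_scaled = one_sided_quotient(abs, 0.0, 3.0)  # approximates f'(0; 3)
d_zero = one_sided_quotient(abs, 0.0, 0.0)    # direction 1 + (-1) = 0

print(d_scaled, 3 * d_pos)    # nonnegative homogeneity: these agree
print(d_zero, d_pos + d_neg)  # additivity fails: about 0 versus about 2
```

Because $f'(0; 1) + f'(0; -1) \neq f'(0; 0)$, no linear map can reproduce these values, which matches the nonlinearity noted above.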

Example 5.19 (Gateaux differential of square function)

Let $f(x) = x^2$. Then, the Gateaux differential is given by

$$f'(x; h) = \lim_{t \to 0^+} \frac{f(x + th) - f(x)}{t} = \lim_{t \to 0^+} \frac{x^2 + t^2 h^2 + 2xth - x^2}{t} = 2xh.$$

We note that the Gateaux differential is linear w.r.t. $h$.

Example 5.20 (Gateaux differential of linear functional)

Let $f(x) = a^T x$ where $a \in \mathbb{R}^n$ is a given fixed vector. Then,

$$f'(x; h) = \lim_{t \to 0^+} \frac{a^T x + t a^T h - a^T x}{t} = a^T h.$$

We note that the Gateaux differential is linear w.r.t. $h$.

Example 5.21 (Gateaux differential of simple quadratic)

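Let $f(x) = x^T A x$ where $A \in \mathbb{S}^n$ is a given symmetric matrix. Then,

$$f'(x; h) = \lim_{t \to 0^+} \frac{(x + th)^T A (x + th) - x^T A x}{t} = \lim_{t \to 0^+} \frac{t^2 h^T A h + 2t h^T A x}{t} = 2 h^T A x = 2 x^T A h.$$

We note that the Gateaux differential is linear w.r.t. $h$.

In particular, if $f(x) = x^T x$, then $f'(x; h) = 2 h^T x = 2 x^T h$.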
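The closed form $2 h^T A x$ can be checked against a difference quotient on a small instance. The sketch below works in $\mathbb{R}^2$ with plain Python lists; the matrix `A`, the vectors `x` and `h`, the step `t`, and the helper names are all illustrative assumptions.

```python
def dot(u, v):
    """Standard inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [dot(row, x) for row in A]


def quad(A, x):
    """Quadratic form x^T A x."""
    return dot(x, matvec(A, x))


A = [[2.0, 1.0], [1.0, 3.0]]  # symmetric
x = [1.0, -2.0]
h = [0.5, 4.0]
t = 1e-6

x_t = [xi + t * hi for xi, hi in zip(x, h)]
numeric = (quad(A, x_t) - quad(A, x)) / t   # one-sided difference quotient
closed_form = 2.0 * dot(h, matvec(A, x))    # 2 h^T A x
print(numeric, closed_form)
```

The two numbers differ only by the discarded term $t\, h^T A h$ and floating point rounding.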

Theorem 5.8 (Gateaux differential of a constant function)

The Gateaux differential of a constant function is zero.

Theorem 5.9 (Gateaux differential sum rule)

The Gateaux differential distributes over sums.

Let $f, g : X \to Y$ both have Gateaux differentials at $x$ in the direction $h$. Then,

$$(f + g)'(x; h) = f'(x; h) + g'(x; h).$$

Also,

$$(f - g)'(x; h) = f'(x; h) - g'(x; h).$$

Theorem 5.10 (Gateaux differential product rule)

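Let $f, g : X \to Y$ both be Gateaux differentiable at $x \in \operatorname{int} (\operatorname{dom} f \cap \operatorname{dom} g)$. Let $h$ be their (pointwise) product function given by

$$h(x) = f(x) g(x)$$

with $\operatorname{dom} h = \operatorname{dom} f \cap \operatorname{dom} g$. Then,

$$h'(x; h) = (fg)'(x; h) = f'(x; h) g(x) + g'(x; h) f(x),$$

where, inside $h'(x; h)$, the first $h$ is the product function and the second $h$ is the direction.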

Theorem 5.11 (Gateaux differential chain rule)

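Let $f : X \to Y$ and $g : Y \to Z$ be functions. Let $h : X \to Z$ be the composition of $f$ and $g$ given by $h = g \circ f$. Let $U \subseteq \operatorname{dom} h$ be an open set. Let $x \in U$. Assume that $f$ is Gateaux differentiable at $x$ and $g$ is Gateaux differentiable at $f(x)$. Then,

$$h'(x; h) = g'(f(x); f'(x; h)) \quad \forall h \in X.$$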

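We recall the little-o notation. We say that a quantity $q$ is $o(t)$ if

$$\lim_{t \to 0^+} \frac{q}{t} = 0.$$

For vector valued functions, a quantity $q$ is $o(t)$ if

$$\lim_{t \to 0^+} \frac{\| q \|}{t} = 0$$

or, equivalently,

$$\lim_{t \to 0^+} \frac{q}{t} = \mathbf{0}.$$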

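Proof. If $f$ is Gateaux differentiable at $x$, then

$$f'(x; h) = \lim_{t \to 0^+} \frac{f(x + th) - f(x)}{t} \quad \forall h \in X.$$

In terms of little-o notation,

$$f(x + th) = f(x) + t f'(x; h) + o(t).$$

Similarly, if $g$ is Gateaux differentiable at $y$, then

$$g(y + su) = g(y) + s g'(y; u) + o(s).$$

Now,

$$
\begin{aligned}
h'(x; h) &= (g \circ f)'(x; h) \\
&= \lim_{t \to 0^+} \frac{g(f(x + th)) - g(f(x))}{t} \\
&= \lim_{t \to 0^+} \frac{g(f(x) + t f'(x; h) + o(t)) - g(f(x))}{t} \\
&= \lim_{t \to 0^+} \frac{g(f(x) + t (f'(x; h) + t^{-1} o(t))) - g(f(x))}{t} \\
&= \lim_{t \to 0^+} \frac{g(f(x)) + t g'(f(x); f'(x; h) + t^{-1} o(t)) + o(t) - g(f(x))}{t} \\
&= \lim_{t \to 0^+} \frac{t g'(f(x); f'(x; h) + t^{-1} o(t)) + o(t)}{t} \\
&= \lim_{t \to 0^+} \left[ g'(f(x); f'(x; h) + t^{-1} o(t)) + t^{-1} o(t) \right] \\
&= g'(f(x); f'(x; h)).
\end{aligned}
$$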

Example 5.22 (Chain rule for square of inner product)

Consider the function $h(x) = (x^T x)^2$.

  1. Define $g(t) = t^2$.

  2. Define $f(x) = x^T x$.

  3. Then $h = g \circ f$.

  4. We have $f'(x; h) = 2 h^T x$.

  5. We have $g'(y; u) = 2yu$.

  6. Thus,

    $$g'(f(x); f'(x; h)) = 2 f(x) f'(x; h) = 2 (x^T x)(2 h^T x) = 4 (h^T x)(x^T x).$$

We can compute the same thing using the product rule.

  1. We note that $h(x) = f(x) f(x)$.

  2. Applying the product rule:

    $$h'(x; h) = f'(x; h) f(x) + f'(x; h) f(x) = 2 f'(x; h) f(x) = 2 (2 h^T x)(x^T x) = 4 (h^T x)(x^T x).$$
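Both routes give $4 (h^T x)(x^T x)$, and this can be compared with a difference quotient of $(x^T x)^2$ on a concrete instance. The following sketch assumes $X = \mathbb{R}^2$; the vectors, the step `t`, and the names `h_fun` and `direction` (used to avoid the clash between the function $h$ and the direction $h$) are illustrative choices.

```python
def dot(u, v):
    """Standard inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def f(x):
    return dot(x, x)      # f(x) = x^T x


def g(y):
    return y * y          # g(y) = y^2


def h_fun(x):
    return g(f(x))        # h = g o f


x = [1.0, 2.0]
direction = [3.0, -1.0]
t = 1e-6

x_t = [xi + t * di for xi, di in zip(x, direction)]
numeric = (h_fun(x_t) - h_fun(x)) / t              # difference quotient of h

inner = 2.0 * dot(direction, x)                    # f'(x; h) = 2 h^T x
chain = 2.0 * f(x) * inner                         # g'(y; u) = 2 y u at y = f(x), u = inner
closed_form = 4.0 * dot(direction, x) * dot(x, x)  # 4 (h^T x)(x^T x)
print(numeric, chain, closed_form)
```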

5.2.2. Fréchet Derivative

Definition 5.16 (Fréchet differentiability)

Let $X$ and $Y$ be Banach spaces. Let $f : X \to Y$ be a function with $S = \operatorname{dom} f$. Let $U \subseteq S$ be an open set. We say that $f$ is Fréchet differentiable at $x \in U$ if there is a bounded and linear operator $T_x : X \to Y$ given by

$$T_x(h) = \lim_{t \to 0^+} \frac{f(x + th) - f(x)}{t} \quad \forall h \in X.$$

The operator $T_x$ is called the Fréchet derivative of $f$ at $x$.

We note that $T_x$ depends on $x$.

Remark 5.1 (Fréchet differentiability alternate forms)

By definition, if $f$ is Fréchet differentiable at $x$, then it is Gateaux differentiable at $x$. Since $T_x$ is linear, we can write it as

$$T_x(h) = A h$$

emphasizing the fact that the essential part of $T_x$ doesn't depend on $h$. $A$ may still depend on $x$.

Using the little-o notation, we can write

$$f(x + th) = f(x) + t T_x(h) + o(t) = f(x) + t A h + o(t).$$

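If we set $th = y$ for a fixed direction $h \neq 0$, then $t \to 0^+$ if and only if $y \to 0$ along the ray through $h$. In particular, $\| y \|_X = t \| h \|_X$, so a quantity is $o(t)$ if and only if it is $o(\| y \|_X)$. Passing from rays to the general limit $y \to 0$ additionally requires the $o(t)$ remainder to be uniform over directions $h$ of unit norm. Now,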

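$$
\begin{aligned}
& f(x + y) = f(x) + A y + o(t) \\
\iff & f(x + y) - f(x) - A y = o(t) = o(\| y \|_X) \\
\iff & \lim_{\| y \|_X \to 0} \frac{\| f(x + y) - f(x) - A y \|_Y}{\| y \|_X} = 0 \\
\iff & \lim_{y \to 0} \frac{\| f(x + y) - f(x) - A y \|_Y}{\| y \|_X} = 0 \\
\iff & \lim_{y \to 0} \frac{\| f(x + y) - f(x) - T_x(y) \|_Y}{\| y \|_X} = 0.
\end{aligned}
$$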

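Therefore $f : X \to Y$ is Fréchet differentiable at $x \in U$ if and only if there is a bounded linear operator $T_x : X \to Y$ such that

$$\lim_{y \to 0} \frac{\| f(x + y) - f(x) - T_x(y) \|_Y}{\| y \|_X} = 0.$$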

It is worthwhile to compare this definition to the definition of differentiability of $f : \mathbb{R}^n \to \mathbb{R}^m$ in Definition 5.1. If we put $z = x + y$, we can rewrite the condition as

$$\lim_{z \to x} \frac{\| f(z) - f(x) - T_x(z - x) \|_Y}{\| z - x \|_X} = 0.$$

Thus, $T_x$ plays the same role as the Jacobian matrix $Df(x)$ in (5.1).
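The limit condition can be observed numerically for $f(x) = x^T x$, whose derivative from the earlier example is $T_x(y) = 2 x^T y$. In this case the remainder equals $\| y \|^2$, so the ratio should shrink like $\| y \|$. The sketch below assumes $X = \mathbb{R}^2$ with the Euclidean norm; the point `x`, the unit direction `unit`, and the list `scales` are arbitrary illustrative choices.

```python
import math


def dot(u, v):
    """Standard inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def f(x):
    return dot(x, x)  # f(x) = x^T x


x = [1.0, 2.0]
unit = [0.6, 0.8]            # a direction of Euclidean norm 1
scales = (1e-1, 1e-2, 1e-3)  # norms of the perturbation y

ratios = []
for s in scales:
    y = [s * u for u in unit]
    x_plus_y = [xi + yi for xi, yi in zip(x, y)]
    remainder = f(x_plus_y) - f(x) - 2.0 * dot(x, y)  # f(x+y) - f(x) - T_x(y)
    ratios.append(abs(remainder) / math.sqrt(dot(y, y)))

print(ratios)  # each ratio is close to the corresponding norm of y
```

A single direction is only a spot check; the definition requires the ratio to vanish as $y \to 0$ in any manner.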

Theorem 5.12 (Existence of Fréchet derivative)

The Fréchet derivative of a function f exists at a point x=a if and only if all Gateaux differentials of f at x are continuous functions of x at x=a.

Theorem 5.13 (Uniqueness of Fréchet derivative)

If the Fréchet derivative of a function f exists at a point x=a then it is unique.

5.2.3. Gradient

Definition 5.17 (Gradient)

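Let $V$ be a Hilbert space. Let $f : V \to \mathbb{R}$ be a real valued function. Let $S = \operatorname{dom} f$ and $U \subseteq S$ be an open set. Assume that $f$ is Fréchet differentiable at $x \in U$. Then, the Fréchet derivative $T_x : V \to \mathbb{R}$ is a bounded linear functional.

The gradient of $f$ at $x$, denoted by $\nabla f(x)$, is the vector $\nabla f(x) \in V$ satisfying

$$\langle h, \nabla f(x) \rangle = T_x(h) \quad \forall h \in V.$$

Such a vector exists and is unique by the Riesz representation theorem.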
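The defining identity can be illustrated in $V = \mathbb{R}^2$ with the standard inner product. The sketch below approximates the gradient of $f(x) = (x^T x)^2$ coordinate by coordinate with one-sided difference quotients, then compares $\langle h, \nabla f(x) \rangle$ with the directional difference quotient along $h$. The function, the vectors, the step size, and the helper names are illustrative assumptions.

```python
def dot(u, v):
    """Standard inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def f(x):
    return dot(x, x) ** 2  # f(x) = (x^T x)^2


def one_sided_quotient(func, x, h, t=1e-6):
    """Approximate func'(x; h) by (func(x + t*h) - func(x)) / t."""
    x_t = [xi + t * hi for xi, hi in zip(x, h)]
    return (func(x_t) - func(x)) / t


x = [1.0, 2.0]
h = [3.0, -1.0]
basis = [[1.0, 0.0], [0.0, 1.0]]

# Approximate gradient: directional quotients along the coordinate directions.
grad = [one_sided_quotient(f, x, e) for e in basis]

lhs = dot(h, grad)                # <h, grad f(x)>
rhs = one_sided_quotient(f, x, h) # approximates T_x(h)
print(grad, lhs, rhs)
```

For this function the exact gradient is $4 (x^T x) x$, so the printed vector should be close to $(20, 40)$ and both scalars close to $20$.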