Differentiability and Convex Functions
Contents
9.9. Differentiability and Convex Functions#
9.9.1. First Order Conditions#
Let us look at the special case of
real valued functions over
Theorem 9.98 (First order characterization of convexity)
Let
Then
holds true for all
Proof. To prove (9.4),
we first show that a differentiable real function
holds true for all
Assume that
Let
.Since
is convex, hence for all .By convexity of
, we have:If we divide by
on both sides, we obtain:Taking the limit as
, we obtain:
For the converse, assume that
holds true for all
Recall that in
the only convex sets are intervals. Thus, is an open interval.Choose any
such that .Choose
.Let
.By hypothesis, we have:
and
Multiplying the first inequality with
and second with and adding them yields:Thus,
is convex.
We now prove for the general case with
Note that, by chain rule (Example 5.11):
Assume
Let
such that .Let
be the restriction of on the line passing through as described above.Due to Theorem 9.70,
is convex.By the argument for real functions above:
holds true for all
.In particular, with
and , we have:But
.Also,
and .Thus, we get:
as desired.
For the converse, assume that this inequality holds
for all
Pick some
with .Let
be the restriction of on the line passing through as described above.Pick
.Then,
and are in .Consider
and .Note that
.By hypothesis, we have:
But
.Thus, we get:
This holds for every
.But then,
is convex by previous argument for real functions.Since this is valid for every restriction of
to a line passing through its domain, hence by Theorem 9.70 is convex.
9.9.2. Second Order Conditions#
For functions which are twice differentiable, convexity can be expressed in terms of the positive-semidefiniteness of their Hessian matrices.
We start with a result on convexity of real functions on open intervals.
Theorem 9.99 (Convexity characterization for twice differentiable real functions on open intervals)
Let
Then,
Proof. Assume that
Then,
is nondecreasing on .For any
with and , let .We have
; i.e. . Consequently,Since
and , we haveWe wish to eliminate
from these inequalities.Multiplying the two inequalities by
and respectively, and adding them together, we obtain:But
.Thus,
.This inequality is valid for the case where
also.Thus,
is convex over .
For the converse, assume that
Then, since
is continuous in , hence is negative in some subinterval .Choose
such that . Choose some .Following an argument parallel to above, we have
Thus, there exist
where the inequality (9.1) is not valid.Consequently,
is non-convex.
We continue further with
real valued functions over
Theorem 9.100 (Second order characterization of convexity in Euclidean spaces)
Let
Then,
Proof. The convexity of
We first note that if
Consequently, for any
Let
.Let
be an arbitrary (nonzero) direction.Let
be a line passing through in the direction .Consider the open real interval
. Since is an open line segment in , hence is indeed an open interval in .Consider the parameterized restriction of
on the open interval as:A simple calculation shows that
where
.By Theorem 9.99,
is convex for each and nonzero if and only if for every and .Thus,
is convex if and only if .
For real functions, the Hessian is simply the
second derivative
Corollary 9.6 (Second order characterization of concavity)
Let
Then,
Example 9.40 (Convexity of a quadratic function)
Let
As shown in Example 5.13,
the Hessian of
Thus,
In fact
Example 9.41 (Identity is convex and concave)
Let
We have
Example 9.42 (Exponential is convex)
Let
with
We have
For any
Example 9.43 (Powers)
Let
with
Now,
We have
.For
, . is convex for .For
, . Thus, . is convex for .For
, . Thus, . is concave on .
Example 9.44 (Reciprocal powers)
Let
with
Now,
We have
.For
, . is convex for .
Example 9.45 (Logarithm is concave)
Let
with
Now,
for all .Thus,
is concave for all .
Example 9.46 (Negative entropy is convex)
Let
with
Now,
for all .Thus,
is convex for all .
Example 9.47 (Quadratic over linear form is convex)
Let
with
From Example 5.16, the Hessian is:
Recall that for any
is positive semi-definite.
For
Thus,
Example 9.48 (Log sum exponential is convex)
Let
with
From Example 5.14, we have
where
To show that
Now for any
If we define vectors
But this is exactly the expression above.
Thus,
Hence,
Example 9.49 (Log determinant function is concave)
Let
with
Let any line in
where
Consider the restriction of
to the interval of values where
Without any loss of generality, we can assume that
Recall that:
for square matrices. for symmetric matrices with being their eigen values.If
are eigen values of , then the eigen values of are .
Now
Let
be the eigen values of .Then,
are eigen values of .Thus,
.
Thus,
Note that
Differentiating
Differentiating again, we get:
Since