10.2. Projection on Convex Sets#
We will assume throughout that the ambient space is $\mathbb{R}^n$ equipped with an inner product $\langle \cdot, \cdot \rangle$ and the norm $\| \cdot \|$ induced by it.

We are interested in mapping a point $x \in \mathbb{R}^n$ to a nearest point in a given set $C \subseteq \mathbb{R}^n$.

Main references for this section are [6, 9].
10.2.1. Projection Theorem#
Theorem 10.6 (Projection theorem)

Let $C$ be a nonempty, closed and convex subset of $\mathbb{R}^n$. Then for every $x \in \mathbb{R}^n$, there exists a unique vector in $C$ that minimizes $\|z - x\|$ over all $z \in C$.

Proof. We fix some $x \in \mathbb{R}^n$ and pick an arbitrary point $w \in C$.

Consider the function

$$g(z) = \frac{1}{2} \|z - x\|^2.$$

Minimizing $\|z - x\|$ over all $z \in C$ is equivalent to minimizing $g$ over the set

$$D = \{ z \in C \mid \|z - x\| \leq \|w - x\| \}.$$

We note that $D$ is a compact set and $g$ is a l.s.c., closed and coercive function. By Weierstrass' theorem Corollary 8.2, the set of minimizers for $g$ over $D$ is nonempty and compact.

We now show that the minimizer is unique. $g$ is a strictly convex function because its Hessian matrix is the identity matrix, which is positive definite. Hence, the minimizer is unique due to Theorem 10.2.
10.2.2. Orthogonal Projection#
Definition 10.6 (Orthogonal Projection Mapping)

Let $C$ be a nonempty, closed and convex subset of $\mathbb{R}^n$. The orthogonal projection mapping $P_C : \mathbb{R}^n \to \mathbb{R}^n$ is defined by

$$P_C(x) \triangleq \underset{z \in C}{\operatorname{argmin}} \|z - x\| \quad \forall x \in \mathbb{R}^n.$$

This mapping is well defined since the projection is unique for a nonempty, closed and convex set $C$ due to Theorem 10.6.

The vector $P_C(x)$ is called the projection of $x$ on $C$.
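A minimal numeric sketch of this definition, using the box $C = [0, 1]^3$, whose projection operator is the componentwise clip (a standard fact); sampling other points of $C$ confirms that the clipped point is nearest:

```python
import numpy as np

# Projection onto the box C = [0, 1]^n (a nonempty, closed, convex set):
# clipping each coordinate gives the unique nearest point.
def proj_box(x, lo=0.0, hi=1.0):
    return np.clip(x, lo, hi)

x = np.array([1.5, -0.3, 0.4])
z = proj_box(x)                      # the projection P_C(x)
assert np.allclose(z, [1.0, 0.0, 0.4])

# Any other point of C is at least as far from x as z is.
rng = np.random.default_rng(0)
others = rng.uniform(0.0, 1.0, size=(1000, 3))
dists = np.linalg.norm(others - x, axis=1)
assert np.all(dists >= np.linalg.norm(z - x) - 1e-12)
```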
10.2.3. Characterization#
Theorem 10.7 (Orthogonal projection characterization)

Let $C$ be a nonempty, closed and convex subset of $\mathbb{R}^n$. For every $x \in \mathbb{R}^n$, a vector $z \in C$ is the projection of $x$ on $C$ if and only if

$$\langle y - z, x - z \rangle \leq 0 \quad \forall y \in C.$$

This result is also known as the second projection theorem [6].

Proof. Assume that for some $z \in C$, the inequality $\langle y - z, x - z \rangle \leq 0$ holds true for every $y \in C$.

For any $y \in C$,

$$\|y - x\|^2 = \|(y - z) - (x - z)\|^2 = \|y - z\|^2 + \|x - z\|^2 - 2 \langle y - z, x - z \rangle \geq \|x - z\|^2.$$

Thus, $\|y - x\| \geq \|x - z\|$ for every $y \in C$. Thus, $z$ is indeed the projection of $x$ on $C$.

Conversely, assume that $z = P_C(x)$.

Let $y \in C$ be arbitrary. For any $t \in [0, 1]$, define $y_t = t y + (1 - t) z$. Since $C$ is convex, $y_t \in C$. Then, we have

$$\|x - y_t\|^2 = \|x - z - t(y - z)\|^2 = \|x - z\|^2 - 2 t \langle x - z, y - z \rangle + t^2 \|y - z\|^2.$$

Viewing $\|x - y_t\|^2$ as a function of $t$ and differentiating w.r.t. $t$, we have

$$\frac{d}{d t} \|x - y_t\|^2 \bigg|_{t = 0} = -2 \langle x - z, y - z \rangle.$$

Since $t = 0$ minimizes $\|x - y_t\|^2$ over $t \in [0, 1]$, we must have

$$-2 \langle x - z, y - z \rangle \geq 0.$$

Thus, we require that

$$\langle y - z, x - z \rangle \leq 0$$

must hold true for every $y \in C$.
Following is an alternative proof based on results from Constrained Optimization I. This proof is specific to the case where the norm is the Euclidean norm on $\mathbb{R}^n$.

Proof. Define a function $f : \mathbb{R}^n \to \mathbb{R}$ as

$$f(y) = \frac{1}{2} \|y - x\|^2.$$

Then, the projection problem can be cast as an optimization problem

$$\min_{y \in C} f(y).$$

Note that the gradient of $f$ is $\nabla f(y) = y - x$.

By Theorem 10.47, $z$ is an optimal solution of this problem if and only if $z \in C$ and

$$\langle \nabla f(z), y - z \rangle \geq 0 \quad \forall y \in C.$$

In other words,

$$\langle z - x, y - z \rangle \geq 0 \quad \forall y \in C.$$

We can simplify this as

$$\langle y - z, x - z \rangle \leq 0 \quad \forall y \in C.$$
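The characterization can be sanity-checked numerically. A small sketch, again using the box $C = [0, 1]^3$ with clipping as the projection:

```python
import numpy as np

# Check the second projection theorem: <y - z, x - z> <= 0 for z = P_C(x),
# with C = [0, 1]^3 and projection by componentwise clipping.
rng = np.random.default_rng(1)
x = np.array([2.0, -1.0, 0.5])
z = np.clip(x, 0.0, 1.0)                     # z = P_C(x) = [1, 0, 0.5]

ys = rng.uniform(0.0, 1.0, size=(2000, 3))   # sample points y in C
inner = (ys - z) @ (x - z)
assert np.all(inner <= 1e-12)                # characterization holds
```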
Theorem 10.8 (Orthogonal projection on an affine subspace)

Let $C$ be an affine subspace of $\mathbb{R}^n$ and let $S$ be the linear subspace parallel to $C$. Then for every $x \in \mathbb{R}^n$, a vector $z \in C$ is the projection of $x$ on $C$ if and only if

$$x - z \in S^{\perp}.$$

Proof. Since $C$ is an affine subspace, it is a nonempty, closed and convex set; hence the projection exists and is unique.

By Theorem 10.7, $z$ is the projection of $x$ on $C$ if and only if for every $y \in C$, we have

$$\langle y - z, x - z \rangle \leq 0.$$

But $y \in C$ if and only if $y - z \in S$. Hence the condition is equivalent to

$$\langle w, x - z \rangle \leq 0 \quad \forall w \in S.$$

But then, it must be an equality since $w$ and $-w$ both belong to $S$. Thus, we have

$$\langle w, x - z \rangle = 0 \quad \forall w \in S.$$

In other words, $x - z \in S^{\perp}$.
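For the null space $S$ of a full-row-rank matrix $A$, the projection is realized by the matrix $I - A^T (A A^T)^{-1} A$ (a standard fact). A short numeric check that $x - z \perp S$:

```python
import numpy as np

# Projection onto the null space S of A: P = I - A^T (A A^T)^{-1} A.
# Verify that x - P x is orthogonal to S.
rng = np.random.default_rng(2)
A = rng.standard_normal((2, 5))          # full row rank with probability 1
P = np.eye(5) - A.T @ np.linalg.solve(A @ A.T, A)

x = rng.standard_normal(5)
z = P @ x                                # projection of x on S
assert np.allclose(A @ z, 0)             # z lies in S
# x - z must be orthogonal to every member of S; test on a basis of S.
ns = np.linalg.svd(A)[2][2:].T           # columns span null(A)
assert np.allclose(ns.T @ (x - z), 0)
```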
10.2.4. Distance Function#
Recall that the distance of a point $x \in \mathbb{R}^n$ from a set $C$ is defined as

$$d_C(x) \triangleq \inf_{z \in C} \|x - z\|.$$

Theorem 10.9 (Distance function for nonempty, closed and convex set)

Let $C$ be a nonempty, closed and convex subset of $\mathbb{R}^n$. Then

$$d_C(x) = \|x - P_C(x)\| \quad \forall x \in \mathbb{R}^n.$$

Proof. By Theorem 10.6, there exists a unique point $P_C(x) \in C$ which is nearest to $x$. Hence

$$d_C(x) = \|x - P_C(x)\|$$

must hold true.
Theorem 10.10 (Distance function for nonempty, closed and convex set is convex)

Let $C$ be a nonempty, closed and convex subset of $\mathbb{R}^n$. Then the distance function $d_C$ is convex.

Proof. Assume, for contradiction, that $d_C$ is not convex.

Then, there exist $x, y \in \mathbb{R}^n$ and $t \in (0, 1)$ such that

$$d_C(t x + (1 - t) y) > t \, d_C(x) + (1 - t) \, d_C(y).$$

Let $u = P_C(x)$ and $v = P_C(y)$. By definition, $u, v \in C$. Then,

$$t \, d_C(x) + (1 - t) \, d_C(y) = t \|x - u\| + (1 - t) \|y - v\|.$$

Since $C$ is convex, hence $t u + (1 - t) v \in C$. Since $P_C(t x + (1 - t) y)$ minimizes the distance of $C$ from the point $t x + (1 - t) y$, hence

$$d_C(t x + (1 - t) y) \leq \|t x + (1 - t) y - (t u + (1 - t) v)\| = \|t (x - u) + (1 - t)(y - v)\|.$$

Rewriting,

$$\|t (x - u) + (1 - t)(y - v)\| \leq t \|x - u\| + (1 - t) \|y - v\|$$

due to triangle inequality.

But, this leads to the contradiction

$$t \, d_C(x) + (1 - t) \, d_C(y) < d_C(t x + (1 - t) y) \leq t \, d_C(x) + (1 - t) \, d_C(y).$$

Hence, $d_C$ must be convex.
10.2.5. Nonexpansiveness#
Definition 10.7 (Nonexpansiveness property)

Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a mapping. $T$ is called nonexpansive if

$$\|T(x) - T(y)\| \leq \|x - y\| \quad \forall x, y \in \mathbb{R}^n.$$

In other words, the distance between mapped points in the range of $T$ is always less than or equal to the distance between the original points.

Theorem 10.11 (Nonexpansive operators are Lipschitz continuous)

Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a nonexpansive operator. Then $T$ is Lipschitz continuous, hence uniformly continuous.

Proof. Recall from Definition 3.45 that $T$ is Lipschitz continuous if there exists $L > 0$ such that

$$\|T(x) - T(y)\| \leq L \|x - y\|$$

for every $x, y \in \mathbb{R}^n$. For a nonexpansive operator, such an $L = 1$ works. Hence $T$ is Lipschitz continuous and therefore uniformly continuous.
Definition 10.8 (Firm nonexpansiveness property)

Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a mapping. $T$ is called firmly nonexpansive if

$$\langle T(x) - T(y), x - y \rangle \geq \|T(x) - T(y)\|^2$$

holds true for every $x, y \in \mathbb{R}^n$.

Theorem 10.12 (A firmly nonexpansive operator is nonexpansive)

Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a firmly nonexpansive operator. Then $T$ is nonexpansive.

Proof. For every $x, y \in \mathbb{R}^n$, we have

$$\|T(x) - T(y)\|^2 \leq \langle T(x) - T(y), x - y \rangle.$$

Applying the Cauchy-Schwarz inequality on the R.H.S., we get

$$\|T(x) - T(y)\|^2 \leq \|T(x) - T(y)\| \|x - y\|.$$

Canceling terms, we get:

$$\|T(x) - T(y)\| \leq \|x - y\|,$$

which is the nonexpansive property.
Theorem 10.13 (Orthogonal projection is nonexpansive)

Let $C$ be a nonempty, closed and convex subset of $\mathbb{R}^n$. Then the orthogonal projection mapping $P_C$ is nonexpansive. In other words,

$$\|P_C(x) - P_C(y)\| \leq \|x - y\| \quad \forall x, y \in \mathbb{R}^n.$$

Proof. Let $u = P_C(x)$ and $v = P_C(y)$.

By Theorem 10.7,

$$\langle w - u, x - u \rangle \leq 0 \quad \forall w \in C.$$

In particular, $v \in C$. Hence,

$$\langle v - u, x - u \rangle \leq 0.$$

Similarly, starting with $v = P_C(y)$, we obtain

$$\langle u - v, y - v \rangle \leq 0.$$

Adding these two inequalities, we obtain

$$\langle v - u, x - u - y + v \rangle \leq 0.$$

By rearranging the terms, we get

$$\|v - u\|^2 \leq \langle v - u, y - x \rangle.$$

Applying the Cauchy-Schwarz inequality on the R.H.S., we obtain

$$\|v - u\|^2 \leq \|v - u\| \|y - x\|.$$

Thus, $\|P_C(x) - P_C(y)\| \leq \|x - y\|$; i.e., $P_C$ is nonexpansive. Since $P_C$ is nonexpansive, hence it is continuous also.
Theorem 10.14 (Orthogonal projection is firmly nonexpansive)

Let $C$ be a nonempty, closed and convex subset of $\mathbb{R}^n$. Then the orthogonal projection mapping $P_C$ is firmly nonexpansive. In other words,

$$\langle P_C(x) - P_C(y), x - y \rangle \geq \|P_C(x) - P_C(y)\|^2$$

holds true for every $x, y \in \mathbb{R}^n$.

Proof. Recall from Theorem 10.7 that for any $u \in \mathbb{R}^n$ and $w \in C$,

$$\langle w - P_C(u), u - P_C(u) \rangle \leq 0.$$

Substituting $u = x$ and $w = P_C(y)$, we obtain

$$\langle P_C(y) - P_C(x), x - P_C(x) \rangle \leq 0.$$

Substituting $u = y$ and $w = P_C(x)$, we obtain

$$\langle P_C(x) - P_C(y), y - P_C(y) \rangle \leq 0.$$

Adding the two inequalities gives us

$$\langle P_C(x) - P_C(y), (y - x) + (P_C(x) - P_C(y)) \rangle \leq 0,$$

which rearranges to

$$\langle P_C(x) - P_C(y), x - y \rangle \geq \|P_C(x) - P_C(y)\|^2,$$

as desired.
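Both properties can be verified numerically for a concrete projection operator. A sketch with $C = [0, 1]^4$ and clipping as the projection:

```python
import numpy as np

# Numeric check of (firm) nonexpansiveness of P_C for C = [0, 1]^4:
#   <P x - P y, x - y> >= ||P x - P y||^2   (firm nonexpansiveness)
#   ||P x - P y||     <= ||x - y||          (nonexpansiveness)
rng = np.random.default_rng(3)
P = lambda v: np.clip(v, 0.0, 1.0)

for _ in range(1000):
    x = rng.standard_normal(4) * 3
    y = rng.standard_normal(4) * 3
    d = P(x) - P(y)
    assert d @ (x - y) >= d @ d - 1e-12          # firmly nonexpansive
    assert np.linalg.norm(d) <= np.linalg.norm(x - y) + 1e-12
```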
10.2.6. Squared Distance Function#
Definition 10.9 (Squared distance function to a nonempty set)

Let $C$ be a nonempty subset of $\mathbb{R}^n$. The squared distance function to $C$ is defined as

$$\varphi_C(x) \triangleq \frac{1}{2} d_C^2(x).$$

We also define

$$\psi_C(x) \triangleq \frac{1}{2} \left( \|x\|^2 - d_C^2(x) \right).$$

Theorem 10.15 (Expression for $\psi_C$)

Let $C$ be a nonempty subset of $\mathbb{R}^n$. Then

$$\psi_C(x) = \sup_{y \in C} \left[ \langle x, y \rangle - \frac{1}{2} \|y\|^2 \right] \quad \forall x \in \mathbb{R}^n.$$

Proof. We proceed as follows.

Expanding on the definition of $d_C^2$:

$$d_C^2(x) = \inf_{y \in C} \|x - y\|^2 = \inf_{y \in C} \left( \|x\|^2 - 2 \langle x, y \rangle + \|y\|^2 \right) = \|x\|^2 - 2 \sup_{y \in C} \left[ \langle x, y \rangle - \frac{1}{2} \|y\|^2 \right].$$

Thus,

$$\frac{1}{2} \left( \|x\|^2 - d_C^2(x) \right) = \sup_{y \in C} \left[ \langle x, y \rangle - \frac{1}{2} \|y\|^2 \right].$$

Thus, $\psi_C(x) = \sup_{y \in C} \left[ \langle x, y \rangle - \frac{1}{2} \|y\|^2 \right]$.
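For a closed and convex $C$, the supremum above is attained at $y = P_C(x)$, since $\langle x, y \rangle - \frac{1}{2}\|y\|^2 = \frac{1}{2}\|x\|^2 - \frac{1}{2}\|x - y\|^2$. A numeric sketch with the box $C = [0, 1]^3$:

```python
import numpy as np

# psi_C(x) = (||x||^2 - d_C(x)^2) / 2 should equal the supremum of
# <x, y> - ||y||^2 / 2 over y in C, attained at y = P_C(x).
rng = np.random.default_rng(4)
x = rng.standard_normal(3) * 2
z = np.clip(x, 0.0, 1.0)                 # P_C(x) for the box C = [0, 1]^3

psi = 0.5 * (x @ x - np.linalg.norm(x - z) ** 2)
ys = rng.uniform(0.0, 1.0, size=(5000, 3))
vals = ys @ x - 0.5 * np.sum(ys * ys, axis=1)
assert np.all(vals <= psi + 1e-12)           # psi is an upper bound on C
assert np.isclose(x @ z - 0.5 * (z @ z), psi)  # attained at y = P_C(x)
```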
Theorem 10.16 ($\psi_C$ is convex)

Let $C$ be a nonempty subset of $\mathbb{R}^n$. Then $\psi_C$ is convex.

Beauty of this result is the fact that $\psi_C$ is convex irrespective of whether $C$ is convex or not.

Proof. We proceed as follows.

For every $y \in C$, the function $g_y : \mathbb{R}^n \to \mathbb{R}$, given by

$$g_y(x) = \langle x, y \rangle - \frac{1}{2} \|y\|^2,$$

is an affine function. $g_y$ is convex for every $y \in C$ due to Theorem 9.68. Now,

$$\psi_C(x) = \sup_{y \in C} g_y(x).$$

Thus, $\psi_C$ is a pointwise supremum of convex functions. Thus, by Theorem 9.114, $\psi_C$ is convex.

Theorem 10.17 (Squared distance function for nonempty, closed and convex sets)

Let $C$ be a nonempty, closed and convex subset of $\mathbb{R}^n$. Then $\varphi_C$ is convex.

This follows directly from Theorem 10.10: $d_C$ is a nonnegative convex function, and the square of a nonnegative convex function is convex.
10.2.7. Gradients and Subgradients#
Theorem 10.18 (Gradient of the squared distance function)

Let $C$ be a nonempty, closed and convex subset of $\mathbb{R}^n$. Then the squared distance function $\varphi_C$ is differentiable and

$$\nabla \varphi_C(x) = x - P_C(x) \quad \forall x \in \mathbb{R}^n.$$

Proof. We proceed as follows.

Let $x \in \mathbb{R}^n$. Let $z = P_C(x)$. Consider the function

$$g(d) = \varphi_C(x + d) - \varphi_C(x) - \langle d, x - z \rangle.$$

If

$$\lim_{d \to 0} \frac{g(d)}{\|d\|} = 0,$$

then $x - z$ is indeed the gradient of $\varphi_C$ at $x$. By definition of orthogonal projection, for any $d \in \mathbb{R}^n$,

$$\varphi_C(x + d) \leq \frac{1}{2} \|x + d - z\|^2$$

as $P_C(x + d)$ is the nearest point to $x + d$ in $C$; $z$ is just another point in $C$. Thus, for any $d$,

$$g(d) \leq \frac{1}{2} \|x + d - z\|^2 - \frac{1}{2} \|x - z\|^2 - \langle d, x - z \rangle.$$

Recall that for a norm induced by the inner product,

$$\|a + b\|^2 = \|a\|^2 + 2 \langle a, b \rangle + \|b\|^2.$$

Thus,

$$\frac{1}{2} \|x + d - z\|^2 = \frac{1}{2} \|x - z\|^2 + \langle d, x - z \rangle + \frac{1}{2} \|d\|^2.$$

Putting it back and simplifying, we obtain

$$g(d) \leq \frac{1}{2} \|d\|^2.$$

Proceeding similarly, we also have

$$g(-d) \leq \frac{1}{2} \|d\|^2.$$

Since $\varphi_C$ is convex, hence $g$ is also convex (it is $\varphi_C$ plus an affine function of $d$). Thus,

$$0 = g(0) \leq \frac{1}{2} g(d) + \frac{1}{2} g(-d).$$

Thus,

$$g(d) \geq -g(-d) \geq -\frac{1}{2} \|d\|^2.$$

Combining, we have

$$-\frac{1}{2} \|d\|^2 \leq g(d) \leq \frac{1}{2} \|d\|^2.$$

Or, in terms of absolute values,

$$|g(d)| \leq \frac{1}{2} \|d\|^2.$$

Then,

$$\lim_{d \to 0} \frac{|g(d)|}{\|d\|} \leq \lim_{d \to 0} \frac{\|d\|}{2} = 0.$$

Thus,

$$\lim_{d \to 0} \frac{g(d)}{\|d\|} = 0.$$

Thus, $x - z$ is indeed the gradient of $\varphi_C$ at $x$.
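A finite-difference check of this gradient formula, using the box $C = [0, 1]^3$ with clipping as the projection:

```python
import numpy as np

# Finite-difference check that the gradient of
# phi_C(x) = ||x - P_C(x)||^2 / 2 equals x - P_C(x), for C = [0, 1]^3.
P = lambda v: np.clip(v, 0.0, 1.0)
phi = lambda v: 0.5 * np.linalg.norm(v - P(v)) ** 2

x = np.array([1.7, -0.6, 0.3])
g = x - P(x)                             # claimed gradient

eps = 1e-6
num = np.array([
    (phi(x + eps * e) - phi(x - eps * e)) / (2 * eps)   # central differences
    for e in np.eye(3)
])
assert np.allclose(num, g, atol=1e-5)
```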
Theorem 10.19 (Gradient of $\psi_C$)

Let $C$ be a nonempty, closed and convex subset of $\mathbb{R}^n$. Then $\psi_C$ is differentiable and

$$\nabla \psi_C(x) = P_C(x) \quad \forall x \in \mathbb{R}^n.$$

Proof. We have

$$\psi_C(x) = \frac{1}{2} \|x\|^2 - \varphi_C(x).$$

Hence,

$$\nabla \psi_C(x) = x - \nabla \varphi_C(x) = x - (x - P_C(x)) = P_C(x).$$

Remark 10.5 (Distance function and square distance function relation)

We note that

$$d_C(x) = \sqrt{d_C^2(x)} = \sqrt{2 \varphi_C(x)}.$$
Theorem 10.20 (Subdifferential of the distance function)

Let $C$ be a nonempty, closed and convex subset of $\mathbb{R}^n$. Then the subdifferential of $d_C$ at $x \in \mathbb{R}^n$ is given by

$$\partial d_C(x) = \begin{cases} \left\{ \dfrac{x - P_C(x)}{d_C(x)} \right\}, & x \notin C; \\ N_C(x) \cap B[0, 1], & x \in C; \end{cases}$$

where $N_C(x)$ denotes the normal cone of $C$ at $x$ and $B[0, 1]$ the closed unit ball.

Since the subdifferential is a singleton for every $x \notin C$, the function $d_C$ is differentiable at every $x \notin C$.

Proof. We can get the subdifferentials for $x \notin C$ via the chain rule of differentiation.

Recall that $d_C = \sqrt{h}$ where $h = d_C^2 = 2 \varphi_C$. Thus, by subdifferential chain rule (Theorem 9.229):

$$\partial d_C(x) = \frac{\partial h(x)}{2 \sqrt{h(x)}}.$$

We used the fact that $d_C(x)$ is nonnegative. Since $h$ is differentiable, hence $\partial h(x) = \{ \nabla h(x) \} = \{ 2 (x - P_C(x)) \}$. If $x \notin C$, then $d_C(x) > 0$. Thus, for $x \notin C$,

$$\partial d_C(x) = \left\{ \frac{2 (x - P_C(x))}{2 d_C(x)} \right\} = \left\{ \frac{x - P_C(x)}{d_C(x)} \right\}.$$

For $x \in C$, $d_C(x) = 0$. We need to show that $\partial d_C(x) = N_C(x) \cap B[0, 1]$ in this case. Consider any $u \in \partial d_C(x)$. Then, by subgradient inequality

$$d_C(y) \geq d_C(x) + \langle u, y - x \rangle = \langle u, y - x \rangle \quad \forall y \in \mathbb{R}^n$$

since $d_C(x) = 0$. Then, in particular, for any $y \in C$, we have $d_C(y) = 0$ and hence $\langle u, y - x \rangle \leq 0$; i.e., $u \in N_C(x)$. Also, taking $y = x + u$ gives $\|u\|^2 \leq d_C(x + u) \leq \|u\|$, so $\|u\| \leq 1$.

Thus, $u \in N_C(x) \cap B[0, 1]$.
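A finite-difference check of the gradient formula at a point outside $C$, again with the box $C = [0, 1]^3$ and clipping as the projection:

```python
import numpy as np

# For x outside C, d_C is differentiable with gradient (x - P_C(x)) / d_C(x).
# Finite-difference check with C = [0, 1]^3.
P = lambda v: np.clip(v, 0.0, 1.0)
d = lambda v: np.linalg.norm(v - P(v))

x = np.array([2.0, -1.0, 0.5])           # x lies outside C
g = (x - P(x)) / d(x)                    # claimed gradient of d_C at x

eps = 1e-6
num = np.array([
    (d(x + eps * e) - d(x - eps * e)) / (2 * eps)   # central differences
    for e in np.eye(3)
])
assert np.allclose(num, g, atol=1e-5)
```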
10.2.8. Conjugates#
Theorem 10.21 (Conjugate of norm squared plus indicator)

Let $C$ be a nonempty subset of $\mathbb{R}^n$. Let $f : \mathbb{R}^n \to (-\infty, \infty]$ be given by

$$f(x) = \frac{1}{2} \|x\|^2 + \delta_C(x)$$

where $\delta_C$ is the indicator function of $C$. Then, its conjugate is given by:

$$f^*(y) = \frac{1}{2} \left( \|y\|^2 - d_C^2(y) \right) = \psi_C(y).$$

Further, if $C$ is nonempty, closed and convex, then $f^{**} = f$; in other words, $\psi_C^* = f$.

Proof. Recall from Definition 9.78 that

$$f^*(y) = \sup_{x \in \mathbb{R}^n} \left[ \langle x, y \rangle - f(x) \right].$$

Expanding, we obtain

$$f^*(y) = \sup_{x \in C} \left[ \langle x, y \rangle - \frac{1}{2} \|x\|^2 \right] = \psi_C(y) = \frac{1}{2} \left( \|y\|^2 - d_C^2(y) \right).$$

The last result is due to Theorem 10.15.

If $C$ is nonempty, closed and convex, then $f$ is a proper, closed and convex function; hence $f^{**} = f$.
10.2.9. Smoothness#
Recall from Definition 9.80 that a function $f$ is $L$-smooth if

$$\|\nabla f(x) - \nabla f(y)\|_* \leq L \|x - y\| \quad \forall x, y.$$

Since the norm in this section is induced by the inner product, hence it is self dual. Thus, the smoothness criteria becomes:

$$\|\nabla f(x) - \nabla f(y)\| \leq L \|x - y\| \quad \forall x, y.$$
Theorem 10.22 (Smoothness of the squared distance function)

The squared distance function $\varphi_C$ is 1-smooth.

Proof. Recall the definition of $\varphi_C$ from Definition 10.9. By Theorem 10.18, $\nabla \varphi_C(x) = x - P_C(x)$.

Hence,

$$\|\nabla \varphi_C(x) - \nabla \varphi_C(y)\|^2 = \|(x - y) - (P_C(x) - P_C(y))\|^2 = \|x - y\|^2 - 2 \langle x - y, P_C(x) - P_C(y) \rangle + \|P_C(x) - P_C(y)\|^2.$$

Thus, by the firm nonexpansiveness of $P_C$ (Theorem 10.14),

$$\|\nabla \varphi_C(x) - \nabla \varphi_C(y)\|^2 \leq \|x - y\|^2 - 2 \|P_C(x) - P_C(y)\|^2 + \|P_C(x) - P_C(y)\|^2 \leq \|x - y\|^2.$$

Thus,

$$\|\nabla \varphi_C(x) - \nabla \varphi_C(y)\| \leq \|x - y\|;$$

i.e., $\varphi_C$ is 1-smooth.
Theorem 10.23 (Smoothness of the $\psi_C$ function)

The function $\psi_C$ is 1-smooth.

Proof. We recall from Theorem 10.19 that $\nabla \psi_C(x) = P_C(x)$.

Hence,

$$\|\nabla \psi_C(x) - \nabla \psi_C(y)\| = \|P_C(x) - P_C(y)\| \leq \|x - y\|$$

due to the nonexpansiveness property (Theorem 10.13).
10.2.10. POCS Problems#
In this section, we present some example optimization problems which can be converted into an equivalent projection on a convex set problem.
10.2.10.1. Equality Constrained Quadratic Programming#
Quadratic programming problems are discussed extensively in Quadratic Programming. Here we discuss a specific form of minimizing a quadratic function subject to linear equality constraints.
Example 10.1 (Equality constrained quadratic programming)
We consider the quadratic programming problem

$$\begin{aligned} & \text{minimize} & & \frac{1}{2} \|x\|^2 + c^T x \\ & \text{subject to} & & A x = 0 \end{aligned}$$

where $x \in \mathbb{R}^n$ is the optimization variable, $c \in \mathbb{R}^n$ is a given vector, and $A$ is an $m \times n$ matrix of rank $m$. Assume that $m < n$.

We proceed towards converting this problem into a projection on a convex set problem as follows.

By adding a constant term $\frac{1}{2} \|c\|^2$ to the objective function, we obtain an equivalent problem with the objective

$$\frac{1}{2} \|x\|^2 + c^T x + \frac{1}{2} \|c\|^2 = \frac{1}{2} \|x + c\|^2.$$

The set $C = \{ x \mid A x = 0 \}$ is the null space of the matrix $A$, which is a linear subspace, hence a nonempty, closed and convex set.

Minimizing $\frac{1}{2} \|x + c\|^2$ is equivalent to minimizing $\|x + c\|$ subject to $x \in C$. $\|x + c\|$ is $\|x - (-c)\|$, the distance between $x$ and the point $-c$.

Thus, we are minimizing the distance of the point $-c$ among $x \in C$. This is nothing but the distance of $-c$ from the set $C$.

Since $C$ is nonempty, closed and convex, hence there is a unique $x^*$ which minimizes the distance due to the projection theorem. Thus, the solution is the projection of the vector $-c$ on the subspace $C$.

By Theorem 10.8, $x^*$ is the unique projection of $-c$ on $C$ if and only if $(-c) - x^* \in C^{\perp}$. In other words,

$$x^* + c \in \mathcal{R}(A^T), \quad \text{i.e.,} \quad x^* + c = A^T \lambda \text{ for some } \lambda \in \mathbb{R}^m, \text{ with } A x^* = 0.$$

A closed form solution to this problem does exist, given by

$$x^* = -\left( I - A^T (A A^T)^{-1} A \right) c.$$

It is indeed the unique solution to this quadratic programming problem.
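A numeric sketch of the closed-form solution on random data, verifying feasibility and the optimality condition $x^* + c \in \mathcal{R}(A^T)$:

```python
import numpy as np

# Closed-form solution of  min 0.5 ||x||^2 + c^T x  s.t.  A x = 0:
# x* = -(I - A^T (A A^T)^{-1} A) c, the projection of -c onto null(A).
rng = np.random.default_rng(5)
m, n = 3, 6
A = rng.standard_normal((m, n))          # rank m with probability 1
c = rng.standard_normal(n)

x_star = -(c - A.T @ np.linalg.solve(A @ A.T, A @ c))

assert np.allclose(A @ x_star, 0)        # feasibility: x* in null(A)
# Optimality: x* + c lies in the row space of A, i.e. x* + c = A^T lam.
lam = np.linalg.solve(A @ A.T, A @ (x_star + c))
assert np.allclose(A.T @ lam, x_star + c)
```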