Basic Duality
Contents
10.4. Basic Duality#
10.4.1. The Geometry of the Space#
As we concern ourselves with the optimization of proper functions
of type
The
part forms the horizontal axes for the product space .The
part forms the vertical axis.We write the vectors in
as where and .A hyperplane in
is associated with a nonzero normal vector of the form where and .Since the normal vector must be nonzero, hence either
or or both must be nonzero.If a specific point
, then
10.4.1.1. Different Types of Hyperplanes#
Definition 10.10 (Vertical, horizontal and nonvertical hyperplanes)
Let
The hyperplane is called horizontal if
.The hyperplane is called vertical if
.The hyperplane is called nonvertical if
.
Let us see how these definitions can be interpreted.
10.4.1.2. Horizontal Hyperplanes#
Example 10.2 (Horizontal hyperplanes)
Consider the case where
The hyperplane description reduces to
It simplifies to
since
must be nonzero.Along the
axes, the points in set can take any value but along the vertical axis, they must take a fixed value given by .We can see that
is a hyperplane which is parallel to .For the specific case where
, .Hence they are called horizontal hyperplanes.
Note that
intersects with the vertical axis at the point .
10.4.1.3. Vertical Hyperplanes#
Example 10.3 (Vertical hyperplanes)
Now consider the case where
The hyperplane description reduces to
The set
describes a hyperplane of . is constructed by allowing to slide along the vertical axis as any value is allowed in the last coordinate (vertical axis).Hence this is called a vertical hyperplane.
10.4.1.4. Nonvertical Hyperplanes#
Observation 10.3 (Intersection of nonvertical hyperplane with vertical axis)
If a hyperplane
Indeed the vertical axis is identified with the set of points
The hyperplane is given by
Then the point of intersection is given by
whereIn particular, assume that
.Then by definition of
,Hence
10.4.1.5. Vertical Lines#
Definition 10.11 (Vertical line)
A vertical line in
A vertical hyperplane is a union of vertical lines.
A vertical halfspace is also a union of vertical lines.
If
is a proper function, then its epigraph doesn’t contain a vertical line.
It enables us to wonder if the epigraph of a proper function is contained entirely in a closed halfspace corresponding to some nonvertical hyperplane.
10.4.1.6. Nonvertical Supporting Hyperplanes#
Theorem 10.30 (Convex sets and nonvertical supporting hyperplanes)
Let
is contained in a closed halfspace corresponding to a nonvertical hyperplane; i.e., there exists a vector , a nonzero scalar , and a scalar such thatIf a point
doesn’t belong to , there exists a nonvertical hyperplane strongly separating and .
Proof. (1) We prove this claim by contradiction.
Assume that every hyperplane such that
is contained in one of its closed half-spaces is vertical.Then every hyperplane such that
is contained in one of its closed half spaces is vertical.By Theorem 9.170,
is the intersection of all closed half-spaces that contain it.Hence, we have
where
is an index set for the family of the hyperplanes that contain in its closed half spaces above. Since the hyperplanes are vertical, hence for ever .Let
.Then the vertical line
also belongs to as it satisfies the expression above.Hence, the vector
is a direction of recession for . In fact it belongs to its lineality space. In other words, is also a direction of recession for .By Theorem 9.150,
since
is a convex set.By Theorem 9.182, the recession cones of
and are equal.Hence
and are also directions of recession for .Hence for every
, the vertical line also belongs to and hence to .This contradicts the hypothesis that
doesn’t contain a vertical line.
(2) This follows from strict separation theorem.
Since
, hence, due to Theorem 9.169, there exists a hyperplane strongly separating and .If this hyperplane is nonvertical, then there is nothing more to do.
We now consider the case where this strongly separating hyperplane is vertical.
Let this separating hyperplane be given by
such that
Our goal is to combine this vertical hyperplane with a suitably constructed nonvertical hyperplane so that the combined nonvertical hyperplane also strongly separates
and .By part (1), there exists a nonvertical hyperplane such that
is totally contained in one of its closed half spaces.Hence there exists
with and such thatMultiply this inequality with some
and add with the previous inequality to getSince
, it is possible to select a small enough such thatThus, for the small enough
, we haveLetting
, and , we haveThis describes the strongly separating nonvertical hyperplane between
and .
Definition 10.12 (Upper closed halfspace)
Let
If you are in the upper closed halfspace and keep going up, you will stay in the upper closed halfspace. If you go down, you will hit the hyperplane.
Definition 10.13 (Lower closed halfspace)
Let
If you are in the lower closed halfspace and keep going down, you will stay in the lower closed halfspace. If you go up, you will hit the hyperplane.
10.4.2. Min Common/Max Crossing Duality#
We introduce two simple optimization problems.
Consider a set
in .Assume that
intersects with the vertical axis; i.e., .The min common point problem attempts to find the point
whose component along vertical axis is the minimum.Now consider the set of nonvertical hyperplanes such that
lies in their upper closed halfspaces.Each such hyperplane intersects with the vertical axis.
Such points are called the crossing points between the nonvertical hyperplane and the vertical axis.
Consider the set of all such crossing points.
The max crossing point problem attempts to find the point whose vertical component is the largest (highest).
Let
be the minimum common point.Let
be the maximum crossing point.Let
and denote the component along the vertical axis for and respectively.In general,
lies above . We shall show this later formally.We call
as minimum common level and as maximum crossing level.Then
. This is known as weak duality.The gap
which is a nonnegative quantity is known as the duality gap.Under certain conditions
. In other words, if the set meets specific conditions, then the optimal value for the min common point problem (the primal problem) as well as the max crossing point problem (the dual problem) are identical.When
, then the duality gap is . This is known as strong duality.When strong duality holds, then the min common point problem and the max crossing point problem are equivalent in the sense that they have the same solution.
We are now ready to define these problems formally.
10.4.2.1. Min Common Problem#
Definition 10.14 (Min common problem)
Given a set
Its optimal value is denoted by
10.4.2.2. Max Crossing Problem#
Recall that a nonvertical hyperplane
has a normal
where
The corresponding upper closed half halfspace is given by
since
If the set
Hence, the maximum crossing level
The problem of maximizing the crossing level over all
nonvertical hyperplanes is to maximize over all
Definition 10.15 (Max crossing problem)
Given a set
where
The function
Theorem 10.31 (Concavity and upper semicontinuity of
The cost function
Proof. Concavity
For each
, the function is an affine function.Thus
is an infimum of a family of affine functions over .Hence
is concave.
Upper semicontinuity
We note that
.Hence
is a supremum of affine functions.Affine functions are closed (Theorem 9.86).
Pointwise supremum of closed functions is closed (Theorem 3.93).
Hence
is a closed function.By Theorem 3.119,
is lower semicontinuous since it is closed.Hence
is upper semicontinuous.
10.4.2.3. Weak Duality#
Theorem 10.32 (Weak duality theorem)
For the min common and max crossing problems we have
Proof. Recall that
Then
Thus we have
Taking the supremum over
10.4.2.4. Strong Duality#
We are now interested in conditions under which
there is no duality gap and
Remark 10.6 (Optimal point of min common problem is a closure point)
Assume that the min common problem is feasible and let
We have
.Thus for every
there exists such that .Thus for every
, there exists a point such that .Hence
is a closure point of .
Observation 10.4 (Closed and convex
If
Since
is closed, hence since is a closure point of .Hence the optimal value of min common problem is attained at
.Let
be the nonvertical supporting hyperplane at .Then intersection of
with the vertical axis is at .Hence
.But by weak duality
.Hence
.Clearly the optimal value of max crossing problem is attained at
for the hyperplane .
This is the most favorable case where strong duality holds as well as the optimal values of both problems are attained. We next provide a result that provides a necessary and sufficient condition for the strong duality however does not address the issue of attainment of the optimal values.
10.4.2.5. First Min Common / Max Crossing Theorem#
Theorem 10.33 (Min common/max crossing theorem I)
Consider the min common and max crossing problems. Assume the following:
; i.e., the min common problem is feasible.The set
is convex.
Then we have
The set
Proof. We first consider the trivial case where
By weak duality (Theorem 10.32),
.Hence
.Hence
for every .The conclusion follows trivially.
We now consider the general case where
Since
and is a closure point of , hence is also a closure point of .We first claim that the set
doesn’t contain any vertical lines.For contradiction, assume that
contains a vertical line.The set
also contains a vertical line then.Then
is a direction of recession of .Hence
is also a direction of recession of .Since
is a closure point of , hence it is also a closure point of .Hence, there exists a sequence
of converging to .Since
is a direction of recession of , hence the sequence belongs to .Hence its limit
.By definition of
, for every , there exists a point such that .Hence there exists a sequence
of with for all so that .This contradicts the assumption that
since .
We next show that the vector
does not belong to for any .Assume for contradiction that for some
, the vector .Then there is a sequence
of converging to .By definition of
, for every , there exists a point such that .Hence there exists a sequence
of with for all so that .This contradicts the assumption that
since .
Since
does not contain any vertical lines and the vector doesn’t belong to , hence, due to Theorem 10.30, there exists a nonvertical hyperplane strongly separating and for every .This hyperplane crosses the vertical axis at a unique vector
which must lie between and ; i.e., .By definition of max crossing problem,
.Hence we have
for every .Since
can be arbitrarily small, it follows that .
Conversely, we assume that the strong duality holds.
Let
be any sequence of such that .By definition of
,Taking the limit on R.H.S. to
, we obtainTaking the supremum on the L.H.S. over
, we have
This result doesn’t guarantee the attainment of either the min common optimal point or the max crossing optimal point. The next result includes additional conditions which ensures that the optimal point of max crossing problem is attained.
10.4.2.6. Second Min Common / Max Crossing Theorem#
Theorem 10.34 (Min common/max crossing theorem II)
Consider the min common and max crossing problems. Assume the following:
.The set
is convex.
The set
contains the origin in its relative interior.
Then
has the form
where
Furthermore,
Note that
Proof. We first show that the strong duality holds and
Condition (3) implies that there exists some
.By definition of
, there exists some such that .Hence
.Then, by condition (1),
; i.e., the min crossing level is a real number.The vertical axis
belongs to . is the optimal min common value.Hence
is not a relative interior point of .Accordingly,
.By Theorem 9.163, there exists a hyperplane that separates
and the point properly; i.e., it contains the point , contains in one of its half-spaces and doesn’t contain fully.Hence, there exists a vector
such thatand
See Remark 9.10.
Since for every
, the set contains the half-line , it follows from the first inequality that must hold true. leads to a contradiction.Assume that
.Then the first inequality reduces to
The linear functional
attains its minimum over the set at which is a relative interior point of by condition (3).Since
is convex (projection of convex on ), and the linear functional (a concave function) attains its minimum at a relative interior point, hence, due to Theorem 10.5, the function must be constant over .Hence we have
But this contradicts the second (strict) inequality of the proper separation result since
cannot be contained entirely inside the hyperplane.We arrive at a contradiction.
Hence
and the separating hyperplane is nonvertical.
By appropriate normalization, if necessary, we can assume that
.The proper separation inequalities simplify to
and
We now note that
Since by weak duality, we always have
, hence all the inequalities must be equalities in the previous relation.Thus we have
Hence
is nonempty as .We can also write
asSince
is concave and upper semicontinuous, hence its superlevel sets are closed and convex.Hence
being a superlevel set of is convex and closed.
We next show that
TBD
10.4.3. Minimax and Maximin Problems#
We consider functions of the form
Definition 10.16 (Minimax problem)
A minimax problem takes the form
Definition 10.17 (Maximin problem)
A maximin problem takes the form
We next provide a few examples.
10.4.3.1. Zero Sum Games#
Example 10.4 (Zero sum games)
We consider a two player game with the following design.
Player A can choose one out of
possible moves.Player B can choose one out of
possible moves.Both players make their moves simultaneously.
is a payoff matrix.If move
is selected by player A and move is selected by player B, then A gives the specified amount to B.Note that
can be zero, positive or negative.The players use mixed strategy.
Player A selects a probability distribution
over her possible moves.Player B selects a probability distribution
over her possible moves.Since the probability of selecting move
by player A and move by player B is , hence the expected amount to be paid by A to B isEach player adopts a worst case viewpoint, whereby she optimizes her choice against the worst possible selection by the other player.
Player A must minimize
so that she has to pay as low as possible to B.Suppose A selects the strategy
.The amount she has to pay to B for B’s selection of strategy
is .The maximum amount she has to pay across all possible strategies chosen by B is
.By selecting
, her goal is to minimize the maximum payoff.
Player B must maximize
.Suppose B selects a strategy
.Then the payoff she gets by a strategy
of A is .The minimum she can get for her choice of
is .By selecting
her goal is to maximize the minimum payoff.
Here
, the unit simplex of .Similarly,
, the unit simplex of . .The worst case pay off for player A is
The worst case pay off for player B is
Under suitable conditions, we can guarantee that
10.4.3.2. Lagrangian Functions#
Example 10.5 (Lagrangian functions and duality theory)
Consider the optimization problem of the form
where
We construct the Lagrangian function
with
We construct the primal problem as below:
Choose any
.If any of the constraints
is violated, then and hence . We can easily achieve this by taking .If none of the constraints are invalidated, then by picking
, we have .Hence the problem is equivalent to the original problem.
We can now construct the dual problem as below:
Thus the primal problem is a minimax problem
and the dual problem is a maximin problem
with the Lagrangian playing the role of
Under suitable conditions, the two problems have equal optimal value.
10.4.3.3. Minimax Equality#
In the following, we will explore conditions under which
and the infimum and the supremum are attained.
The key result is the saddle point theorem
which guarantees the equality in (10.4)
as well as the attainment of the infimum/supremum assuming
convexity/concavity on
The compactness assumptions are restrictive.
For example, in the duality theory, the
Lagrange multipliers
The minimax theorem gives conditions guaranteeing the minimax equality (10.4), although it need not guarantee the attainment of the infimum and the supremum.
10.4.3.4. Minimax Inequality#
Observation 10.5 (Minimax inequality)
We always have
In other words, the optimal value of the maximin problem is less than or equal to the optimum value of the minimax problem.
Proof. We note that for every
Taking the supremum on the L.H.S., we obtain the desired inequality.
Thus, in order to show (10.4), it is sufficient to show that the reverse inequality holds.
10.4.3.5. Saddle Points#
Definition 10.18 (Saddle point)
A pair of vectors
Remark 10.7 (Saddle point inequalities)
The pair
This equality further implies that
In short
Combined with the minimax inequality (10.5), it shows that if a saddle point exists, then the minimax equality (10.4) holds.
Theorem 10.35 (Saddle point = minimax equality)
A pair
while
Proof. Suppose that
From the minimax problem we obtain
From the maximin problem we obtain
Combining these inequalities, we have
If the minimax equality (10.4) holds, then equality holds throughout above giving us
Hence
is a saddle point of as per Remark 10.7.
Conversely, assume that
From Remark 10.7, we have
The minimax inequality (10.5) gives us
Thus, all the inequalities in the previous relationship must be equalities. We have
Hence the minimax equality holds.
It also implies that
is an optimal solution of the minimax problem and is the optimal solution of the maximin problem.
Observation 10.6 (Saddle points as a Cartesian product)
When the set of saddle points is nonempty, it can be written
as a Cartesian product
In other words,
If the minimax equality (10.4) does not hold, then there is no saddle point even if the minimax and maximin problems have optimal solutions.
10.4.4. Min Common/Max Crossing Framework for Minimax#
Recall that the minimax problem is given by
We add a linear perturbation to
We can see that
The linear perturbation term impacts the optimum value of the
minimax problem.
We shall show that if
10.4.4.1. Framework Definition#
Definition 10.19 (Min common / max crossing framework for minimax problem)
We define the set
where
Recall that min common value is given by
Note that for
, contains all the points such that .In particular,
and if then .Hence
is given bySince
is an epigraph, hence the sets and are identical in the min common/max crossing framework.If
is convex, then is convex as desired in several results of the min common/max crossing framework.
The corresponding max crossing problem is given by:
We note that
(10.7)#Its optimal value is denoted by
; i.e.,
10.4.4.2. Connection#
Observation 10.7 (Connection between minimax equality and min common/max crossing framework)
By putting the definition of
in the expression for , we obtainIn particular, for every
, if we set in this relation, we can see thatFrom the weak duality principle, we have
.We have established that
.We can now see that
This is nothing but the minimax inequality (10.5).
We can see that if the minimax equality (10.4) holds, then all inequalities in the previous relation turn into equalities and we have
.In other words, if the minimax equality holds, then the optimal values of the min common and max crossing problems are equal.
10.4.4.3. Convexity of #
Lemma 10.2 (Convexity of
Let
Proof. Recall that
Fix some
.Consider the function
.Clearly,
is convex for each by hypothesis.Taking the pointwise supremum over
, the functionis also convex (over
and ).We now have
Since partial minimization preserves convexity, hence
is convex.
10.4.4.4. Minimax Equality Strong Duality Equivalence Conditions#
Lemma 10.3 (Closedness and convexity of
Let
where
Furthermore, we have
Proof. We shall show the following one by one.
. . .
(1)
We have already established in (10.7) that
By using the definition of
, we further established thatBy rearranging the order of infimum operations, we have
For any
we haveThis in turn implies that
(2)
Let
be given byBy hypothesis
is closed and convex.Let
.Fix some
.Since
is a closed and convex function, hence is a closed and convex set.Since
, hence the point .For some
, consider the point .Clearly
.By definition of
, is finite for all , is nonempty and is closed.Since
is finite for all , the epigraph doesn’t contain any vertical lines.Hence, due to Theorem 10.30, there exists a nonvertical hyperplane that strongly separates the point
from .Hence there exists a normal vector
and a scalar such thatSubstituting
, we getRearranging, we get
Letting
, we haveWe further note that
Taking infimum over
on the L.H.S., we obtainas desired.
(3)
Take any
.Fix an arbitrary
.Consider a sequence
such that .Since
, hence for every .Again applying Theorem 10.30, for each
, there exists a nonvertical hyperplane strongly separating from .Hence for each
, there exists a normal vector such thatRearranging, we have
Thus, we have
Taking the limit on the R.H.S. as
, we can see thatFinally, taking the infimum over
, we can see that
By Observation 10.7,
if minimax equality holds, then
Thus the minimax equality holds.
10.4.5. Minimax Theorems#
10.4.5.1. First Theorem#
Theorem 10.36 (Minimax theorem I)
Let
Then, the minimax equality (10.4) holds; i.e.,
if and only if the function
for every sequence
Proof. The proof consists of establishing the correspondence between the conditions in this result and the conditions in the first min common/max crossing theorem (Theorem 10.33).
We choose the set
as described in Definition 10.19, to be the epigraph of the function .We have shown in Definition 10.19 that
.By hypothesis,
.Hence the first assumption of Theorem 10.33 is satisfied.
Following Lemma 10.2,
is convex.Hence
is also convex.Hence the second assumption of Theorem 10.33 is satisfied.
Finally, the condition
is equivalent to the condition of Theorem 10.33 that for every sequence
of with .holds true.
We have
.Hence
Following Theorem 10.33, this condition holds if and only if
.Since
is closed and convex, hence following Lemma 10.3, this condition holds if and only if minimax equality holds.
10.4.5.2. Second Theorem#
We can adapt the argument of first minimax theorem to include conditions on the lines of the second min common/max crossing theorem Theorem 10.34.
Theorem 10.37 (Minimax theorem II)
Let
and that
and the supremum over
10.4.6. Saddle Point Theorems#
From the first minimax theorem (Theorem 10.36), we can see that minimax equality is satisfied if and only if the function
is lower semicontinuous at .If
is closed, then it will be lower semicontinuous.The proof of Lemma 10.2 shows that
can be written as a partial minimization ofwhere
is given byThe results in Partial Minimization and Closedness can be used to guarantee the closedness of
.These results can also guarantee that the infimum of
over is attained.In particular
will be finite and hence guaranteeing that the optimal value of the minimax problem is finite.
Definition 10.20 (Auxiliary functions for the minimax problem)
Let
For each
For each
The function
The function
Following remarks are in order.
If
is closed and convex for each , then is closed and convex due to Theorem 9.114.If
is closed and convex for each , then is closed and convex due to Theorem 9.114.By definition,
for every . can be proper only if for some .Equivalently, for the optimal minimax value
must hold for
to be proper.Similarly,
for every . can be proper only if for some .This is possible only if
Equivalently, for the optimal maximin value
must hold true.
The set of minimizers of
are the optimal points for the minimax problem: .The set of minimizers of
are the optimal points of the maximin problem: .
10.4.6.1. Minimax Equality and Attainment of Minimax Solution#
In the following results, we provide conditions under which the minimax equality is attained and the the set of optimal solutions for the minimax problem is nonempty.
Theorem 10.38 (Compact sublevel sets of
Assume the following:
is closed and convex for every . is closed and convex for every .The optimal minimax value is less than infinity
The sublevel sets of
are compact.
Then the minimax equality (10.4) holds and
the set of optimal points for the minimax problem
Proof. .
Under the assumptions above, the function
is proper, closed and convex.By definition
for every .Note that
.Since
is proper, hence there exists such that .Hence
is also proper. is also closed and convex.Fix some
.Consider the function
. is closed and convex for every by hypothesis.Hence
is closed and convex for every .Taking the pointwise supremum over
, is closed and convex.
The sets
are the sublevel sets of
which are compact for every by hypothesis.We can easily select a scalar for which the sublevel set of
is also nonempty since is proper.Hence, due to Theorem 9.120, the function
which is a partial minimization of over is proper, closed and convex.Since
is closed, hence it is lower semicontinuous.In particular,
is l.s.c. at .Hence due to Theorem 10.36, the minimax equality holds.
Since
is the set of minimizers of , is proper and closed, and the sublevel sets of are compact, hence is nonempty and compact due to Weierstrass’ theorem (Theorem 8.9).
Theorem 10.39 (Recession and constancy space of
Assume the following:
is closed and convex for every . is closed and convex for every .The optimal minimax value is less than infinity
The recession cone and constancy space of the function
are equal.
Then the minimax equality (10.4) holds and
the set of optimal points for the minimax problem
Theorem 10.40 (
Assume the following:
is closed and convex for every . is closed and convex for every .The optimal minimax value is less than infinity
The function
has the form
where
is a proper, closed, and convex function on and is specified by linear inequalities; i.e,where
are matrices and is a vector.Every common direction of recession of
and is a direction along which is constant.
Then the minimax equality (10.4) holds and
the set of optimal points for the minimax problem
Theorem 10.41 (Quadratic form
Assume the following:
and .The function
has a quadratic formwhere
and are symmetric matrices, is a matrix, and and are vectors. . is a set of the formwhere
are symmetric positive semidefinite matrices, are vectors and are scalars. is closed and convex for every . is closed and convex for every .The optimal minimax value satisfies
Then the minimax equality (10.4) holds and
the set of optimal points for the minimax problem
10.4.6.2. Existence of Saddle Points#
Theorem 10.42 (Compact sublevel sets of
Assume the following:
is closed and convex for every . is closed and convex for every .Assume that either
or
holds true.
The sublevel sets of
and are compact.
Then the set of saddle points of
Proof. First assume that
By applying Theorem 10.38, minimax equality holds and
is nonempty.Hence
is finite.Due to minimax equality:
We reverse the roles of
and and the sign of and apply Theorem 10.38 again to show that is nonempty and compact set.The set of saddle points is a Cartesian product of
and .Since both
and are nonempty and compact, hence is also nonempty and compact.
Now assume that
Then
.we can then reverse the role of
and in the preceding argument.
Theorem 10.43 (Recession cones and constancy spaces of
Assume the following:
is closed and convex for every . is closed and convex for every .Assume that either
or
holds true.
The recession cones and constancy spaces of
and are equal to each other:
Then the set of saddle points of
10.4.6.3. Saddle Point Theorem#
The analysis in this section culminates into the following result.
Theorem 10.44 (Saddle point theorem)
Assume that
and are compact. is compact and there exists a vector and a scalar such that the sublevel setis nonempty and compact.
is compact and there exists a vector and a scalar such that the superlevel setis nonempty and compact.
There exist vectors
and and a scalar such that the setsare nonempty and compact.