Mathematical Optimization
Contents
8.1. Mathematical Optimization#
Main references for this section are [9, 17].
This section provides a general overview of optimization problems. The notion of objective or cost functions and constraint functions is introduced. Standard form for optimization problems is introduced. Several examples are discussed for equivalent forms of optimization problems. The standard form focuses on minimizing the objective function over a set of equality and inequality constraints. Maximization problems can also be converted into minimization problems by changing the sign of the objective function.
We introduce the notion of the domain of an optimization problem, feasible points or solutions, feasible set of solutions, the optimal value of the minimization problem, optimal points or solutions, optimal set etc.. We discuss when the problem may be infeasible or feasible but unbounded. We discuss situations where the problem may be feasible and have a global minimum/optimal value and yet a feasible solution may not exist.
We then discuss general requirements for the feasibility of the optimization problem and existence of an optimal solution. We introduce the notion of coercive functions which provide the Weierstrass type result for the existence of optimal solutions.
8.1.1. Optimization Problems#
Let 
In the most general form, an optimization problem can
be considered as minimizing or maximizing the value
of a real valued function 
- We call - The variable - A point - Our goal is to find an 
Definition 8.1 (Globally optimal value and points)
Let 
- The maximum value of - The minimum value of - We allow the maximum and minimum values to take - The function - A point - The maximum and minimum values of - The set of global minimizers is given by 
- The set of global maximizers is given by 
In a more restricted setting, an optimization problem can
be considered as minimizing or maximizing the value
of a real valued function 
- In this case, the feasible set is restricted to - Minimizers and maximizers are searched within this subset - This problem can be converted into the earlier general form by considering the restriction 
In several problems, it may not be possible to establish whether a point is globally optimal. However, it is easier to establish if a point is optimal in a neighborhood around it. Such points are called locally optimal points or extreme values.
Definition 8.2 (Local optimal points)
Let 
More specifically,
The point 
Here 
Remark 8.1 (Local optimal points for open domain)
Let 
- If - But since - Let - Then, - Thus, - Thus, - Thus, the local optimality condition simplifies to looking for an open ball of radius 
Based on this discussion, we can adjust the conditions for local optimality.
8.1.1.1. Standard Form for Mathematical Optimization Problems#
Definition 8.3 (Optimization problem standard form)
Let 
is known as an optimization problem in its standard form.
- The functions - The corresponding inequalities - The functions - The corresponding equations - If there are no constraints ( - The set - is called the domain of the optimization problem. 
- A point - The optimization problem is called feasible if there exists at least one feasible point. 
- If there are no feasible points in the domain - The set of feasible points for an optimization problem is called the feasible set or the constraint set. We shall denote the feasible set by - It is the intersection of the domain of - Thus, if the problem is infeasible, then - The optimum value - In other words, - We allow - If the problem is infeasible, then - If - We say that - The set of all optimal points is known as the optimal set denoted by - In other words, 
- If an optimal point exists in - If - In particular, if the problem is unbounded below, then - If the feasible set - A feasible point - The set of all - We say that a feasible point - In other words, - If - We say that a constraint is redundant if removing it does not change the feasible set. 
- If we choose an orthonormal basis - Thus, determining - Since there are 
In the sequel, we will be presenting a variety of optimization problems. We shall show how those problems can be converted to the standard form described above.
Example 8.1 (Box constraints standard form)
Let 
where 
This problem can be transformed into an equivalent problem:
This form has 
Then, the problem becomes
This is the optimization problem in standard form with 
Example 8.2 (Maximization problem in standard form)
Consider the problem
If we replace 
Definition 8.4 (Feasibility problem)
Let 
is called a feasibility problem.
We can convert this problem into the standard problem by introducing a cost function which is identically 0:
- The domain reduces to 
- The optimal value - Otherwise, the optimal value is - Every feasible point is also an optimal point. 
- Every optimization problem in standard form can be converted into a feasibility problem by replacing the objective function with the function which is identically 0 everywhere. 
- In that case, the problem reduces to checking whether the inequality and equality constraints are consistent or not. In other words, whether there are some points which satisfy all the inequality and equality constraints. 
8.1.2. Equivalent Problems#
Definition 8.5 (Equivalent optimization problems)
We provide an informal definition of equivalent optimization problems.
Two optimization problems are called equivalent if it is possible to find the solution of one problem from the other problem and the vice versa.
The primary reason for transforming an optimization problem to an equivalent one is that it may be easier to solve an equivalent problem. If we can transform a problem into a form which has a well known closed form solution or has a well known algorithm to solve the problem, then we can solve the original problem by first solving the equivalent problem and then transforming the solution of the equivalent problem to the solution of the original problem.
- Transform the problem to an equivalent problem with a known algorithm for solution. 
- Solve the equivalent problem. 
- Transform the solution of the equivalent problem to the solution of the original problem. 
In this sense, the computational complexity of solving an optimization problem cannot be greater than the complexity of solving the equivalent problem plus the complexity of transforming the solution of the equivalent problem to the solution of the original problem.
There are several ways to obtain an equivalent optimization problem from an existing optimization problem.
8.1.2.1. Scaling#
Proposition 8.1 (Equivalent problems by scaling of objective or constraint functions)
Consider the problem given by
where 
This problem is obtained from the optimization problem in standard from in (8.1) by scaling the objective function and the inequality constraint functions by positive constants and the equality constraints by nonzero constants.
- The feasible sets for both problems are identical. 
- The optimal sets for both problems are identical. 
- The optimal values for both problems are not identical however (unless 
8.1.2.2. Change of Variables#
8.1.2.3. Transformation with Monotone Functions#
Proposition 8.3 (Transformation of objective and constraint functions)
For the problem (8.1), we introduce the following:
- Let - Let - Let - Let - Let 
Now, consider the problem
with the variable 
Example 8.3 (Least norms and least norms squared)
Consider the unconstrained least norm problem
with 
Then, 
Then, the least norm minimization problem is equivalent to the problem
which is the least norm squared problem.
We note that while 
8.1.2.4. Slack Variables#
Proposition 8.4 (Slack variables)
For the inequality constraints in the problem (8.1),
we can introduce the variables 
where the optimization variables are 
The variables 
- This form has - It has - If - If - The slack variables measure how much slack does an inequality constraint function have at a feasible point. 
- If - If 
8.1.2.5. Epigraph Polar Form#
Proposition 8.5 (Epigraph polar form)
The problem (8.1) is equivalent to the optimization problem below:
with the optimization variable 
- Note that - At the optimal point, - If - Similarly, if - The two problems are indeed equivalent. 
- This problem has - The objective function for the epigraph form is a linear function of 
8.1.3. First Order Conditions#
In this subsection, we focus on objective functions of type
Theorem 8.1 (First order optimality criterion for local optimal points)
Let 
Then, 
- Let - Consider the one-dimensional function - where - We note that - Since - Thus, - Thus, - Since, this is true for every 
We mention that gradient being zero is a necessary condition;
i.e., if a point is an optimal point then the gradient of 
Definition 8.6 (Stationary point)
Let 
Thus, the locally optimal points are necessarily stationary points.
8.1.4. Second Order Conditions#
Recall the discussion on semidefinite and definite matrices in Eigen Values.
For a twice continuously differentiable function 
We have following necessary conditions for local optimality.
- If - If 
We have following sufficient conditions for local optimality.
- If - If 
Theorem 8.2 (Necessary second order optimality conditions)
Let 
Then, the following hold.
- If - If 
Proof. Assume 
- Then, there exists an open ball - Let - For - By definition, - Hence, for any - By linear approximation theorem (Theorem 5.5), there exists a vector - Since - Also, by definition of - Thus, 
- Since - Thus, for any - By the continuity of the Hessian, and the fact that - Thus, 
The argument for second statement is similar. We can apply the
same argument on 
Theorem 8.3 (Sufficient second order optimality conditions)
Let 
Then, the following hold.
- If - If 
Proof. Let 
- Since the Hessian is continuous ( - By linear approximation theorem (Theorem 5.5), for any - We note that - Thus, - Thus, - Thus, 
The proof for the second statement is similar.
Theorem 8.4 (Hessian based sufficient global optimality condition)
Let 
In other words, if the Hessian of a function is continuous and always positive semidefinite, then all its stationary points are also global minimum points.
Proof. We are given that 
- Let - By linear approximation theorem (Theorem 5.5), there exists a vector - By hypothesis, - Thus, - Thus, 
This result is a special case for the global optimality of convex optimization problems. If a convex function is twice differentiable, then its Hessian is positive semidefinite Theorem 9.100.
8.1.4.1. Saddle Points#
Definition 8.7 (Saddle point)
Let 
A stationary point 
Theorem 8.5 (Sufficient condition for saddle points)
Let 
If 
Proof. We are given that 
- Then, - Recall from Theorem 4.133 that a matrix is indefinite if and only if at least one eigenvalue is positive and at least one eigenvalue is negative. 
- Let - Since - Accordingly, - By quadratic approximation theorem (Theorem 5.6), 
- Putting - Here - Thus, - For every - Choose - Then, - Thus, - Thus, there exists - Thus, - A similar argument with a negative eigenvalue and corresponding normalized eigenvector shows that - Thus, 
8.1.5. Minimization of Proper Functions#
Let 
- The set - Any - If there is at least one feasible point (i.e., - Otherwise, the problem is infeasible. 
- Let - We allow - If the minimization problem is infeasible, then - If - If there is some - Alternatively, - If - It is possible that - In other words, the set 
A basic question of optimization is whether an optimal solution exists.
Remark 8.2 (The set of optimal points)
The set of optimal points for the problem of minimization of a proper function
where 
This comes directly from the fact that
In other words, the optimal set is the intersection of the feasible set
Thus, for an optimal solution to exist
- The level set - The feasible set - The intersection of 
8.1.5.1. Coercive Functions#
Definition 8.8 (Coercive function)
A proper function 
If 
One way to think about coercive functions is that they grow rapidly at the extremes of the domain on which they are defined.
Theorem 8.6 (Level sets of coercive functions)
Let 
In other words, all nonempty level sets of a coercive function are bounded.
Proof. For contradiction, assume that there exists 
- Then, there exists a sequence - Since - But, by definition, - Hence - We have a contradiction. 
- Hence, 
Theorem 8.7 (Sublevel sets of coercive functions)
Let 
In other words, all nonempty sublevel sets of a coercive function are bounded.
Proof. For contradiction, assume that there exists 
- Then, there exists a sequence - Since - But, by definition, - Hence - We have a contradiction. 
- Hence, 
8.1.5.2. Weierstrass’ Theorem#
In this subsection, we examine the problem of unconstrained minimization of a proper closed function. Recall from Definition 3.64 that a function is called closed if all its sublevel sets are closed.
Theorem 8.8 (Unconstrained minimization of a proper closed function)
Let 
Let 
In words, the minimal value 
Proof. We show that 
- Since - Thus, there exists - Hence, 
Let 
We first show that 
- let - Then, - Then, - Hence, - Thus, - Thus, 
We now show that 
- Let - We claim that - Thus, - Now, for contradiction, assume that - Let - Let - Then, - Hence, - But then, - A contradiction. 
 
- Thus, the only allowed value for - Thus, - Thus, - Thus, 
We show that 
- If - Otherwise, - Since - Hence - Thus, - Thus, 
Theorem 8.9 (Weierstrass’ theorem)
Let 
Let 
Assume that one of the following conditions are true.
- There exists a scalar 
Then, 
Here is this result in a simplified language. The theorem addresses the problem of unconstrained minimization of a function.
- If - If - If 
Since the set of minimizers is nonempty and compact, it is nonempty, closed and bounded.
Proof. If there exists 
Assume that condition (1) holds.
- Since - Consider a sequence - Since - Let - Such a sequence exists since - Thus, for every 
 
- Since - Since - Since - Also, note that since - Since - Since - Thus, - Since - Thus, - Thus, the set - Since 
Assume that condition (2) holds.
- The sublevel set - Since - Consider the restriction of - Then, - Then, - Since - For any - For any 
 
- Thus, - Applying condition (1) on - We also note that the minimizers of - Thus, the set of minimizers of 
Assume that condition (3) holds.
- Since - Let - By Theorem 8.7, the nonempty sublevel sets of - Then, by applying condition (2), the set of minimizers of 
Corollary 8.1 (Minimizing a proper closed function over a closed set)
Let 
Further, assume that one of the following conditions are true.
- There exists a scalar 
Then, the set of minimizers of 
Proof. We define a restriction 
- Since - Since - Hence, - Also, note that the set - Since - Hence, all the sublevel sets of - The set of minimizers of - If 
Thus, applying Theorem 8.9,
the set of minimizers of 
Corollary 8.2 (Minimizing a real valued l.s.c. function over a closed set)
Let 
Further, assume that one of the following conditions are true.
- There exists a scalar 
Then, the set of minimizers of 
Proof. We define a restriction 
- Since - Since - Hence, - Since - Also, note that the set - The set of minimizers of - If 
Thus, applying Theorem 8.9,
the set of minimizers of 
8.1.6. Some Simple Problems and Technical Results#
In this subsection, we provide some technical results on some simple optimization problems. These results are useful building blocks for bigger problems.
Theorem 8.10 (Nonnegativity of affine functional on nonnegative orthant)
Consider the affine function 
Then, 
Proof. Assume that 
For the converse, assume that 
- If - Consider some - Assume that - Let - Then, - By increasing - Thus, 
Theorem 8.11 (Minimization of linear functional over unit ball)
For any 
is 
Proof. If 
Now consider the case where 
- By Cauchy Schwartz inequality - for any - Thus, we have established a lower bound: 
- We now show that the lower bound is indeed attained. 
- Let - Then - Now