10.1. Convex Optimization
Main references for this section are [6, 9, 17].
10.1.1. Convex Optimization Problems
(Convex optimization problem general form)
Let \(\VV\) be an \(n\)-dimensional real vector space. Let \(f : \VV \to \RR\) be a convex function with \(S = \dom f\). Let \(C \subseteq S \subseteq \VV\) be a convex set.
A mathematical optimization problem of the form
\[\begin{split} & \text{minimize } & & f(\bx) \\ & \text{subject to } & & \bx \in C \end{split}\]
is known as a convex optimization problem.
In other words, a convex optimization problem minimizes a convex function over a convex set. We can write the optimal value of the minimization problem as
\[ p^* = \inf \{ f(\bx) \ST \bx \in C \}. \]
The set of optimal points for the minimization problem is
\[ X_{\text{opt}} = \{ \bx \in C \ST f(\bx) = p^* \}. \]
Note
\(f\) is the objective function (or cost function) being minimized.
\(\bx\) is the optimization variable.
\(C\) is the feasible set. Any \(\bx \in C\) is a feasible point.
The problem is feasible if \(C \neq \EmptySet\).
If \(C = \EmptySet\), then the problem is infeasible.
If the problem is infeasible then \(p^* = \infty\).
If the problem is feasible then \(p^* < \infty\).
If \(p^* = -\infty\), then the problem is unbounded below.
We say that \(\bx^*\) is an optimal point for the minimization problem if \(f(\bx^*) = p^*\).
If \(X_{\text{opt}}\) is nonempty, then we say that the optimal value \(p^*\) is attained at some point.
It is possible that \(p^*\) is finite and yet the optimal set is empty.
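For example, consider minimizing the convex function \(f(x) = e^x\) over \(C = \RR\). The problem is feasible and
\[ p^* = \inf_{x \in \RR} e^x = 0, \]
yet \(e^x > 0\) for every \(x \in \RR\). Hence \(X_{\text{opt}} = \EmptySet\) even though \(p^*\) is finite.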
(Closedness of the feasible set)
Some authors prefer to define convex optimization as the minimization of a convex function over a closed and convex set.
We have not insisted that \(C\) be a closed set in the general definition of convex optimization problem.
The closedness of \(C\) is an additional property which can provide additional guarantees on the existence of an optimal point. We shall introduce it whenever needed in the discussion below.
Recall from Theorem 9.170 that a closed and convex set \(C\) is the intersection of all the halfspaces that contain it. Let \(\{ A_i \}_{i \in I}\) be the collection of halfspaces that contain \(C\).
Then, each half space can be written as
\[ A_i = \{ \bx \in \VV \ST \langle \bx, \ba_i \rangle \leq b_i \}. \]Thus, \(\bx \in C\) is equivalent to \(\langle \bx, \ba_i \rangle \leq b_i\) for every \(i \in I\).
This gives us the motivation to consider the feasible set in the form of intersection of sublevel sets of convex functions.
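For instance, with the standard inner product on \(\RR^n\), the closed unit ball has exactly such a description:
\[ B[\mathbf{0}, 1] = \bigcap_{\| \ba \| = 1} \{ \bx \ST \langle \bx, \ba \rangle \leq 1 \}, \]
with one halfspace for each unit-norm direction \(\ba\). Each constraint \(\langle \bx, \ba \rangle \leq 1\) is a sublevel set of a (convex) linear functional.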
The majority of convex optimization problems can be transformed into a functional form in which the constraints are expressed as sublevel sets of convex functions and level sets of affine functions.
10.1.1.1. Convex Optimization Standard Form
(Convex optimization problem standard form)
Let \(\VV\) be an \(n\)-dimensional real vector space. A mathematical optimization problem of the form
\[\begin{split} & \text{minimize } & & f_0(\bx) \\ & \text{subject to } & & f_i(\bx) \leq 0, \quad i=1,\dots,m \\ & & & h_j(\bx) = 0, \quad j=1,\dots,p \end{split}\]
with optimization variable \(\bx \in \VV\) is called a convex optimization problem in standard form if
The objective function \(f_0: \VV \to \RR\) is a convex function.
The inequality constraint functions \(f_i: \VV \to \RR\) are convex functions for \(i=1,\dots,m\).
The equality constraint functions \(h_j: \VV \to \RR\) are affine functions for \(j=1,\dots,p\).
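To make the standard form concrete, here is a minimal sketch in Python using the cvxpy modeling package (the choice of package and the data A, b are illustrative assumptions, not part of the text): a least squares objective with nonnegativity constraints and one affine equality constraint.

```python
import cvxpy as cp
import numpy as np

# illustrative data (assumption): a small least-squares instance
np.random.seed(0)
A = np.random.randn(20, 5)
b = np.random.randn(20)

x = cp.Variable(5)                 # optimization variable x in R^5
f0 = cp.sum_squares(A @ x - b)     # convex objective f_0
constraints = [
    -x <= 0,                       # f_i(x) = -x_i <= 0: convex inequality constraints
    cp.sum(x) == 1,                # h_1(x) = <1, x> - 1 = 0: affine equality constraint
]
problem = cp.Problem(cp.Minimize(f0), constraints)
problem.solve()
print(problem.value, x.value)      # optimal value p^* and an optimal point
```

Each constraint line is either a sublevel set \(f_i(\bx) \leq 0\) of a convex function or a zero level set \(h_j(\bx) = 0\) of an affine function, matching the standard form above.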
(Discussion on the convex optimization standard form)
The domain of the problem is given by
\[ \DDD = \dom f_0 \cap \bigcap_{i=1}^m \dom f_i \cap \bigcap_{j=1}^p \dom h_j. \]\(\dom h_j = \VV\) for every \(j=1,\dots,p\) since \(h_j\) are affine functions.
Thus,
\[ \DDD = \dom f_0 \cap \bigcap_{i=1}^m \dom f_i. \]By definition \(\dom f_i\) are convex for \(i=0,\dots,m\). Hence \(\DDD\) is convex.
The feasible set \(C\) is given by
\[ C = \dom f_0 \cap \bigcap_{i=1}^m f_i^{-1}(-\infty, 0] \cap \bigcap_{j=1}^p h_j^{-1}(0). \]The sets \(f_i^{-1}(-\infty, 0]\) are sublevel sets of convex functions; hence they are convex sets.
By Theorem 4.209, the level sets of affine functions are affine sets.
Thus, \(h_j^{-1}(0)\) is an affine set. Hence, it is a convex set.
Thus, \(C\) is an intersection of convex sets.
Thus, \(C\) is a convex set.
We note that we can rewrite the standard form as
\[\begin{split} & \text{minimize } & & f_0(\bx) \\ & \text{subject to } & & \bx \in C \end{split}\]to match with Definition 10.1.
(Adding closedness of objective and constraint functions)
Suppose we add an additional requirement that the functions \(f_i\) for \(i=0,\dots,m\) are closed.
Recall from Definition 3.64 that a function is closed if all its sublevel sets are closed.
Note, however, that the domain of a closed function need not itself be closed. If we further assume that the domains \(\dom f_i\) are closed for \(i=0,\dots,m\), then \(\DDD\) is a closed set.
Then, \(f_i^{-1}(-\infty, 0]\) are closed sets since \(f_i\) are closed functions.
Since \(\VV\) is finite dimensional, affine sets are closed. Hence \(h_j^{-1}(0)\) is closed for every \(j=1,\dots,p\).
Thus, \(C\) is an intersection of closed and convex sets.
Thus, \(C\) is a closed and convex set.
Recall from Theorem 3.92 that a function is closed if and only if it has a closed epigraph.
Also, recall from Theorem 3.119 that a function is closed if and only if it is l.s.c. (lower semicontinuous).
Further, recall from Theorem 9.172 that every convex function has a closure which is l.s.c. (hence closed) given by its lower semicontinuous hull and the closure of a convex function is also convex.
Thus, if any of the \(f_i\) for \(i=0,\dots,m\) is convex but not a closed function, we can replace it by its lower semicontinuous hull to convert it into a closed convex function. In this way, we can bring such problems to the standard form for convex optimization.
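For example, let \(f : \RR \to \RERL\) be the indicator function of the open interval \((0,1)\), taking the value \(0\) on \((0,1)\) and \(\infty\) elsewhere. This function is proper convex but not closed; its lower semicontinuous hull is the indicator of the closed interval \([0,1]\), a closed convex function with the same infimum over \(\RR\).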
(Comparison of two forms)
We have seen two forms of describing convex optimization problems.
Definition 10.1 describes it as minimizing a convex function over a convex set.
Definition 10.2 describes it in the form of minimizing a convex objective function with convex inequality constraints and affine equality constraints.
As seen in the comments above, the second form can be converted into the first form.
The first form is more general. It is very useful in establishing the theoretical properties of convex optimization problems.
The second form is more useful from a practical perspective. Almost all real-life convex optimization problems can be reduced to this form. It is also easier to describe algorithms for solving convex optimization problems in terms of this form.
In the sequel, we will liberally use either form for proving theoretical results and developing algorithms.
(Concave function maximization problem)
Let \(\VV\) be an \(n\)-dimensional real vector space. A mathematical optimization problem of the form
\[\begin{split} & \text{maximize } & & f_0(\bx) \\ & \text{subject to } & & f_i(\bx) \leq 0, \quad i=1,\dots,m \\ & & & h_j(\bx) = 0, \quad j=1,\dots,p \end{split}\]
with optimization variable \(\bx \in \VV\) is called a concave maximization problem if
The objective function \(f_0: \VV \to \RR\) is a concave function.
The inequality constraint functions \(f_i: \VV \to \RR\) are convex functions for \(i=1,\dots,m\).
The equality constraint functions \(h_j: \VV \to \RR\) are affine functions for \(j=1,\dots,p\).
(Concave maximization as a convex optimization problem)
We note that maximizing \(f_0\) is the same as minimizing \(-f_0\). Further, if \(f_0\) is concave, then \(-f_0\) is convex. Thus, (10.2) is equivalent to the problem
\[\begin{split} & \text{minimize } & & -f_0(\bx) \\ & \text{subject to } & & f_i(\bx) \leq 0, \quad i=1,\dots,m \\ & & & h_j(\bx) = 0, \quad j=1,\dots,p \end{split}\]
which is a convex optimization problem in standard form. With an abuse of notation, we shall call a concave function maximization program also a convex optimization program.
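A small numerical sketch of this equivalence (assuming SciPy is available; the concave objective is a toy example of our own choosing):

```python
from scipy.optimize import minimize_scalar

# a toy concave objective (illustrative assumption): f0(x) = -(x - 2)^2 + 3
def f0(x):
    return -(x - 2.0) ** 2 + 3.0

# maximizing the concave f0 is the same as minimizing the convex -f0
res = minimize_scalar(lambda x: -f0(x))
print(res.x, f0(res.x))  # maximizer near 2.0, maximum value near 3.0
```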
10.1.1.2. Proper Convex Functions
It is useful to extend these definitions to proper convex functions as objective functions and/or inequality constraint functions.
(Optimization problem general form for proper convex functions)
Let \(\VV\) be an \(n\)-dimensional real vector space. Let \(f : \VV \to \RERL\) be a proper convex function with \(S = \dom f\). Let \(X \subseteq \VV\) be a convex set.
A mathematical optimization problem of the form
\[\begin{split} & \text{minimize } & & f(\bx) \\ & \text{subject to } & & \bx \in X \end{split}\]
is known as a convex optimization problem for proper convex functions.
In other words, this problem minimizes a proper convex function over a convex set. We can write the optimal value of the minimization problem as
\[ p^* = \inf \{ f(\bx) \ST \bx \in X \}. \]
The set of feasible points is given by \(C = X \cap \dom f\) which is a convex set.
The set of optimal points for the minimization problem is
\[ X_{\text{opt}} = \{ \bx \in C \ST f(\bx) = p^* \}. \]
(Convex optimization problem standard form for proper convex functions)
Let \(\VV\) be an \(n\)-dimensional real vector space. A mathematical optimization problem of the form
\[\begin{split} & \text{minimize } & & f_0(\bx) \\ & \text{subject to } & & f_i(\bx) \leq 0, \quad i=1,\dots,m \\ & & & h_j(\bx) = 0, \quad j=1,\dots,p \end{split}\]
with optimization variable \(\bx \in \VV\) is called a convex optimization problem in standard form for proper convex functions if
The objective function \(f_0: \VV \to \RERL\) is a proper convex function.
The inequality constraint functions \(f_i: \VV \to \RERL\) are proper convex functions for \(i=1,\dots,m\).
The equality constraint functions \(h_j: \VV \to \RR\) are affine functions for \(j=1,\dots,p\).
The reader can check that the definitions of problem domain, feasible set, optimal value, optimal solutions/points, etc. can be easily extended by using the effective domains of \(f_0, \dots, f_m\).
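For example, a constraint of the form \(\bx \in X\) can itself be folded into a proper convex objective using the indicator function of \(X\):
\[ \delta_X(\bx) = \begin{cases} 0, & \bx \in X \\ \infty, & \text{otherwise}, \end{cases} \qquad \inf_{\bx \in X} f(\bx) = \inf_{\bx \in \VV} \left( f(\bx) + \delta_X(\bx) \right). \]
Since \(X\) is convex, \(\delta_X\) is convex, and \(f + \delta_X\) is a proper convex function whenever \(X \cap \dom f \neq \EmptySet\).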
10.1.1.3. Local and Global Optima
(Local minimum is a global minimum in convex optimization)
Let \(f : \VV \to \RR\) be a convex function. Let \(C \subseteq \dom f\) be a convex set. Consider the problem of minimizing \(f\) over \(C\). Let \(\bx^*\) be locally optimal for \(f\) over \(C\). Then, \(\bx^*\) is globally optimal for \(f\) over \(C\).
In other words,
\[ f(\bx) \geq f(\bx^*) \quad \text{for every } \bx \in C. \]
Proof. Since \(\bx^*\) is a local minimum of \(f\) over \(C\), there exists \(r > 0\) such that
\[ f(\bx^*) \leq f(\bx) \quad \text{for every } \bx \in B[\bx^*, r] \cap C. \]In other words, \(f(\bx) \geq f(\bx^*)\) for every \(\bx \in B[\bx^*, r] \cap C\).
Let \(\by \in C\) be any point such that \(\by \neq \bx^*\).
We need to show that \(f(\by) \geq f(\bx^*)\).
Let \(t \in (0,1]\) be such that \(\bz = \bx^* + t(\by - \bx^*) \in B[\bx^*, r]\); for instance, \(t = \min\left\{1, \frac{r}{\| \by - \bx^* \|}\right\}\) works.
Since \(C\) is convex, \(\bz \in C\) as \(\bx^*, \by \in C\) and \(\bz\) is their convex combination.
Thus, \(\bz \in B[\bx^*, r] \cap C\).
By the local optimality condition
\[\begin{split} f(\bx^*) &\leq f(\bz) \\ &= f(\bx^* + t(\by - \bx^*)) \\ &= f( (1 - t) \bx^* + t \by) \\ &\leq (1 -t )f(\bx^*) + t f(\by). \end{split}\]We used the fact that \(f\) is convex.
Cancelling and rearranging the terms, we get \(t f(\bx^*) \leq t f(\by)\).
Since \(t > 0\), this reduces to \(f(\bx^*) \leq f(\by)\).
Thus, \(\bx^*\) is indeed globally optimal.
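The following small numerical sketch (assuming NumPy and SciPy; the objective is an arbitrary smooth convex function chosen for illustration) shows the theorem in action: a local descent method started from widely different points always reports the same minimizer and optimal value.

```python
import numpy as np
from scipy.optimize import minimize

# an arbitrary smooth convex objective (illustrative assumption)
def f(x):
    return np.sum(x ** 2) + np.exp(x[0])

# a local method started anywhere converges to the same (global) minimum
for x0 in [np.array([5.0, -3.0]), np.array([-4.0, 4.0]), np.zeros(2)]:
    res = minimize(f, x0)
    print(np.round(res.x, 4), np.round(res.fun, 4))
```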
(Local minimum is a global minimum for proper convex functions)
Let \(f : \VV \to \RERL\) be a proper convex function. Let \(X\) be a convex subset of \(\VV\). Consider the problem of minimizing \(f\) over \(X\). Let \(\bx^*\) be locally optimal for \(f\) over \(X\). Then, \(\bx^*\) is globally optimal for \(f\) over \(X\).
In other words,
\[ f(\bx) \geq f(\bx^*) \quad \text{for every } \bx \in X. \]
Proof. We note that the feasible set is \(C = X \cap \dom f\). Since both \(X\) and \(\dom f\) are convex, \(C\) is convex.
If \(C = \EmptySet\), then there are no feasible points; hence, there are no local or global optima.
We now consider the case where \(C \neq \EmptySet\).
Following the argument in Theorem 10.1, if \(\bx^*\) is locally optimal for \(f\) over \(C\), then it is globally optimal for \(f\) over \(C\).
Consequently, it is globally optimal for \(f\) over \(X\) as \(f(\bx) = \infty > f(\bx^*)\) for every \(\bx \in X \setminus \dom f\).
The argument can be modified to show that if \(f\) is strictly convex, then a locally optimal point for \(f\) is a strictly globally optimal point.
(Local minimum is strict global minimum for strictly convex functions)
Let \(f : \VV \to \RR\) be a strictly convex function. Let \(C \subseteq \dom f\) be a convex set. Let \(\bx^*\) be locally optimal for \(f\) over \(C\). Then, \(\bx^*\) is strictly globally optimal for \(f\) over \(C\).
In other words,
\[ f(\bx) > f(\bx^*) \quad \text{for every } \bx \in C \text{ with } \bx \neq \bx^*. \]
Proof. Since \(\bx^*\) is a local minimum of \(f\) over \(C\), there exists \(r > 0\) such that \(f(\bx) \geq f(\bx^*)\) for every \(\bx \in B[\bx^*, r] \cap C\).
Let \(\by \in C\) be any point such that \(\by \neq \bx^*\).
Let \(t \in (0,1)\) be such that \(\bz = \bx^* + t(\by - \bx^*) \in B[\bx^*, r]\); for instance, \(t = \min\left\{\frac{1}{2}, \frac{r}{\| \by - \bx^* \|}\right\}\) works.
Since \(C\) is convex, \(\bz \in C\) as \(\bx^*, \by \in C\) and \(\bz\) is their convex combination.
Thus, \(\bz \in B[\bx^*, r] \cap C\).
Note that \(\bz = (1-t) \bx^* + t \by\). Thus, \(\bz\) is distinct from \(\bx^*\) and \(\by\) and lies in the line segment between them.
By the local optimality condition
\[\begin{split} f(\bx^*) &\leq f(\bz) \\ &= f( (1 - t) \bx^* + t \by) \\ &< (1 -t )f(\bx^*) + t f(\by). \end{split}\]We used the fact that \(f\) is strictly convex.
Cancelling and rearranging the terms, we get \(t f(\bx^*) < t f(\by)\).
Since \(t > 0\), this reduces to \(f(\bx^*) < f(\by)\).
Thus, \(\bx^*\) is indeed strictly globally optimal.
(Local minimum is strict global minimum for proper strictly convex functions)
Let \(f : \VV \to \RERL\) be a proper strictly convex function. Let \(X\) be a convex subset of \(\VV\). Let \(\bx^*\) be locally optimal for \(f\) over \(X\). Then, \(\bx^*\) is strictly globally optimal for \(f\) over \(X\).
In other words,
\[ f(\bx) > f(\bx^*) \quad \text{for every } \bx \in X \text{ with } \bx \neq \bx^*. \]
10.1.1.4. Optimal Sets
The optimal set of a convex optimization problem is also convex.
(Optimal set is convex for a convex optimization problem)
Let \(f : \VV \to \RR\) be a convex function. Let \(C \subseteq \dom f\) be a convex set. Let the optimal value for the minimization of \(f\) over \(C\) be given by
\[ p^* = \inf \{ f(\bx) \ST \bx \in C \}. \]Let the optimal set (the set of optimal points) be given by
\[ X_{\text{opt}} = \{ \bx \in C \ST f(\bx) = p^* \}. \]
Then, \(X_{\text{opt}}\) is a convex set.
Proof. If \(X_{\text{opt}}\) is empty, then it is convex trivially. Now consider the case where \(X_{\text{opt}}\) is nonempty.
Let \(\bx, \by \in X_{\text{opt}}\).
Then, \(\bx, \by \in C\).
Let \(t \in [0, 1]\).
Let \(\bz = t \bx + (1- t) \by\).
Since \(C\) is convex, \(\bz \in C\). Thus, \(\bz \in \dom f\).
Since \(f\) is convex, hence
\[ f(\bz) \leq t f(\bx) + (1-t)f(\by) = t p^* + (1-t)p^* = p^*. \]But \(p^* = \inf \{ f(\bx) \ST \bx \in C \}\).
Thus, \(f(\bz) \geq p^*\).
Combining the two inequalities, we get \(p^* = f(\bz)\).
Thus, \(\bz \in X_{\text{opt}}\).
Thus, for every \(\bx, \by \in X_{\text{opt}}\) and every \(t \in [0,1]\), \(\bz = t \bx + (1-t) \by \in X_{\text{opt}}\).
Thus, \( X_{\text{opt}}\) is indeed convex.
(Optimal points for minimization of strictly convex functions)
Let \(f : \VV \to \RR\) be a strictly convex function. Let \(C \subseteq \dom f\) be a convex set. Then, the minimization problem
\[\begin{split} & \text{minimize } & & f(\bx) \\ & \text{subject to } & & \bx \in C \end{split}\]
has at most one optimal point.
Proof. Let the optimal value for the minimization of \(f\) over \(C\) be given by
\[ p^* = \inf \{ f(\bx) \ST \bx \in C \}. \]Let the optimal set (the set of optimal points) be given by
\[ X_{\text{opt}} = \{ \bx \in C \ST f(\bx) = p^* \}. \]
By the previous theorem, \(X_{\text{opt}}\) is a convex set.
If \(X_{\text{opt}}\) is empty or a singleton, there is nothing more to prove.
For contradiction, assume that there are two distinct points \(\bx, \by \in X_{\text{opt}}\).
We have \(p^* = f(\bx) = f(\by)\).
Let \(\bz = \frac{1}{2} \bx + \frac{1}{2} \by\).
Thus, it is a convex combination of \(\bx\) and \(\by\).
By convexity of \(X_{\text{opt}}\), \(\bz \in X_{\text{opt}}\). Thus, \(f(\bz) = p^*\).
By strict convexity of \(f\)
\[ f(\bz) < \frac{1}{2} f(\bx) + \frac{1}{2} f(\by) = p^*. \]This contradicts the fact that \(p^*\) is the optimal value for the minimization problem.
Thus, \(X_{\text{opt}}\) must be either empty or a singleton.
Thus, the minimization problem has at most one optimal point.
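The contrast with merely convex objectives can be seen in a small sketch (the functions are toy examples of our own choosing): a convex but not strictly convex function may have infinitely many optimal points, while a strictly convex function has at most one.

```python
import numpy as np

# convex but NOT strictly convex: f(x) = (x1 + x2 - 1)^2;
# every point on the line x1 + x2 = 1 attains the optimal value 0,
# so the optimal set is an (infinite) convex set
f = lambda x: (x[0] + x[1] - 1.0) ** 2
for x in [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.5, 0.5])]:
    print(f(x))  # prints 0.0 three times

# a strictly convex function such as g(x) = ||x||^2 has at most one
# optimal point; here the unique minimizer is x = 0
g = lambda x: np.sum(x ** 2)
print(g(np.zeros(2)))  # unique optimal value 0.0 at x = 0
```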
10.1.2. Concave Objective Functions
(Minimization of concave function and relative interior)
Let \(\VV\) be an \(n\)-dimensional real normed linear space. Let \(f : \VV \to \RR\) be a concave function with \(S = \dom f\). Let \(C \subseteq S\) be a convex set. Consider the problem of minimizing \(f\) over \(C\). Let the optimal value be given by
\[ p^* = \inf \{ f(\bx) \ST \bx \in C \}. \]Let the set of optimal points for the minimization problem be given by
\[ X_{\text{opt}} = \{ \bx \in C \ST f(\bx) = p^* \}. \]If \(X_{\text{opt}}\) contains a relative interior point of \(C\), then \(f\) must be constant over \(C\); i.e.,
\[ f(\bx) = p^* \quad \text{for every } \bx \in C. \]
Proof. Assume that \(\bx \in \relint C \cap X_{\text{opt}}\).
Let \(\by \in C\) be any vector distinct from \(\bx\).
Since \(\bx \in \relint C\), by Theorem 9.144 there exists \(s > 1\) such that \(\bz = \bx + (s - 1) (\bx - \by) \in C\). The point \(\bz\) lies behind \(\bx\) on the line from \(\by\) through \(\bx\) and belongs to \(C\).
Letting \(t = \frac{1}{s}\) we can rewrite it as
\[ \bx = t \bz + (1-t) \by. \]Thus, \(\bx\) is a convex combination of \(\bz\) and \(\by\).
By concavity of \(f\), we have
\[ f(\bx) \geq t f(\bz) + (1-t) f(\by). \]Since \(\bx\) is an optimal point over \(C\) and \(\bz, \by \in C\), hence \(f(\bx) \leq f(\bz)\) and \(f(\bx) \leq f(\by)\).
Thus,
\[ f(\bx) \geq t f(\bz) + (1-t) f(\by) \geq t f(\bx) + (1-t) f(\bx) = f(\bx). \]This can be true only if \(f(\bx) = t f(\bz) + (1-t) f(\by)\) which in turn implies that \(f(\bx) = f(\bz) = f(\by)\).
Thus, \(f\) must be constant over \(C\).
(Linear functions attain minimum only at boundary points)
Let \(\VV\) be an \(n\)-dimensional real normed linear space. Let \(C \subseteq \VV\) be a convex set and let \(f : \VV \to \RR\) be a linear functional that is not constant over \(C\). Then, \(f\) cannot attain a minimum at any relative interior point of \(C\).
In other words, \(f\) can attain a minimum only at the relative boundary of \(C\) (if it attains a minimum over \(C\) at all).
Proof. We note that every linear functional is also concave.
Let \(\bx \in \relint C\).
Then, there exists \(r > 0\) such that \(B(\bx, r) \cap \affine C \subseteq C\).
Since \(f\) is linear and not constant over \(C\), it cannot be constant over \(B(\bx, r) \cap \affine C\); otherwise, by linearity, \(f\) would be constant over all of \(\affine C\) and hence over \(C\).
Thus, by Theorem 10.5, it cannot attain a minimum at \(\bx\).
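As a minimal numerical illustration (assuming SciPy's linprog; the data are an illustrative choice of our own), minimizing a nonconstant linear functional over the unit box returns a vertex, a point on the relative boundary:

```python
import numpy as np
from scipy.optimize import linprog

# minimize <c, x> over the unit box [0, 1]^2 (illustrative choice);
# the linear objective is not constant over the box
c = np.array([1.0, -2.0])
res = linprog(c, bounds=[(0, 1), (0, 1)])
print(res.x)  # a vertex of the box: [0., 1.]
```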