9.12. Separation Theorems#

Separation theorems provide us ways to find a separating hyperplane between two disjoint convex sets. Hyperplanes require the notion of an inner product.

Throughout this section, we assume that V is a finite dimensional real vector space endowed with norm :VR and a real inner product ,:V×VR. The norm in general is not necessarily induced by the inner product. The Euclidean norm is denoted by 2.

9.12.1. Types of Separating Hyperplanes#

Definition 9.59 (Separating hyperplane)

Let V be a real n-dimensional vector space. For any two subsets C and D of V, a hyperplane H is said to separate C and D if C is contained in one of the closed halfspaces corresponding to H and D is contained in the other closed halfspace.

Let H be given by

H={xV|x,a=b}

where aV and bR.

If C and D are separated by H, then either

x1,abx2,ax1C,x2D

or

x2,abx1,ax1C,x2D

holds true.

We can also say that the two sets C and D can be separated by a hyperplane or there exists a hyperplane separating C and D if there exists a nonzero vector aV such that

supxCx,ainfxDx,a.

Any choice of b[l,u] where l=supxCx,a and u=infxDx,a describes a separating hyperplane.

This definition allows for some degenerate possibilities where both CH and DH since the closed halfspaces contain H (as their boundary).

Definition 9.60 (Proper separation)

Let V be a real n-dimensional vector space. For any two subsets C and D of V, a hyperplane H is said to properly separate C and D if H separates them and both are not entirely contained in H; i.e., either CH or DH or both.

Proper separation still allows for the possibility that parts of C or D lies inside H. But it ensures that CD. CDH is the common part.

If C and D are properly separated by H, then they are also separated by H. Hence

x1,abx2,ax1C,x2D

or

x2,abx1,ax1C,x2D

holds true. Also, there exists at least one x1C and x2D such that at least one of these inequalities is a strict inequality thus ensuring that either C or D or both are not contained entirely in H.

We can also say that the two sets C and D can be properly separated by a hyperplane or there exists a hyperplane properly separating C and D if there exists a nonzero vector aV such that

supxCx,ainfxDx,a.

and

infxCx,a<supxDx,a.

Any choice of b[l,u] where l=supxCx,a and u=infxDx,a describes a properly separating hyperplane. This characterization is proved in Theorem 9.161 below.

If a convex set has a nonempty interior, then it cannot be contained in a hyperplane. Two disjoint sets one of which has a nonempty interior can indeed be properly separated.

Definition 9.61 (Strong separation)

Let V be a real n-dimensional vector space. For any two subsets C and D of V, a hyperplane H is said to strongly separate C and D if there exists an r>0 such that C+rB is contained in one of the open halfspaces corresponding to H and D+rB is contained in the other open halfspace.

Under these assumptions, one set lies in the interior of H++ and the other set lies in the interior of H. Under strong separation, CD= is guaranteed since H++H=.

We can also say that the two sets C and D can be strongly separated by a hyperplane or there exists a hyperplane strongly separating C and D if there exists a nonzero vector aV such that

supxCx,a<infxDx,a.

Any choice of b(l,u) where l=supxCx,a and u=infxDx,a describes a strongly separating hyperplane.

Strong separation is called as strict separation in [9].

Remark 9.9

Separation is translation invariant. For any aV and convex sets C,DV,

C and D are separated by a hyperplane H if and only if Ca, Da are separated by the hyperplane Ha.

9.12.2. Disjoint Affine and Convex Sets#

Separation between sets can be discussed at several levels. We start with a simple case where an affine set and a relatively open convex set are disjoint.

Theorem 9.160 (Disjoint affine and relatively open convex sets)

Let V be a real n-dimensional vector space. Let C be a nonempty relatively open convex set of V. Let MV be an affine subspace such that MC=. Then, there exists a hyperplane H such that MH and one of the two open halfspaces associated with H contains C.

Proof. Consider the case where M is a hyperplane. Then, H=M and the result is immediate.

  1. If for contradiction, C is not contained in one of the open halfspaces of M.

  2. Then, CH since H is the boundary of the halfspaces.

  3. It will contradict our hypothesis that CM=.

  4. Thus, C must be in one of the open halfspaces of H.

Now, assume that M is not a hyperplane.

Without loss of generality, assume that 0M. We can do this by picking any element m of M and replacing M by Mm and C by Cm. Thus, M is a subspace.

The proof is constructive. We iteratively construct subspaces of increasing dimension which contain M but don’t contain C till we reach a hyperplane.

  1. Let dimM=d where d<n.

  2. Let S=M be the initial subspace.

  3. Let r=n1d be the number of iterations (if M is a hyperplane, no iterations are needed).

  4. For i=1,,r:

    1. Replace S by a subspace S such that dimS=dimS+1 and SC=.

    2. Let S=S

  5. We have arrived with a hyperplane S such that SC=.

Towards this, let us look at the procedure to construct S from S such that dimS=dimS+1 and SC=. Here is the outline

  1. Let S be the orthogonal complement of S.

  2. Pick any 2D subspace P of S.

  3. Pick a line L in P which passes through origin but doesn’t intersect with C; i.e. LC=.

  4. Construct the subspace S=SL. The direct sum is valid since SL={0}.

  5. Note that dimS=dimS+1 and SC= since SC= as well as LC=.

We now elaborate the procedure for finding the line L:

  1. We have S as a linear subspace with dimS<n1 and SC=.

  2. dimS=ndimS>n(n1)=1.

  3. Since dimS>1, it contains a two dimensional subspace P.

  4. Consider the set C=P(CS).

  5. The set CS has some properties.

    1. CS is convex since it is the sum of two convex sets.

    2. Since 0S, hence C=C0CS.

    3. Since CS=, hence 0CS.

    4. Note that due to Theorem 9.157:

      ri(CS)=riCriS=CS

      since C is relatively open and S is affine, hence relatively open.

    5. Thus, CS is relatively open.

  6. Consequently, the set C also has some properties.

    1. C is convex since it is the intersection of two convex sets.

    2. Since 0CS, hence, 0C=P(CS).

    3. C is relatively open.

      1. If C is empty, then C is relatively open vacuously.

      2. Otherwise, by Theorem 9.155:

      riC=ri(P(CS))=Pri(CS)=P(CS).

      Thus, riC=C implies that C is relatively open.

  7. We now seek a line LP passing through the origin 0 that doesn’t intersect C.

  8. We note that LC=LC=.

    1. Suppose for contradiction that LC= but LC.

    2. Let xLC.

    3. Then, xLP.

    4. Also, xCCS.

    5. Thus, xC=P(CS).

    6. This means that xLC.

    7. In other words, LC which contradicts our assumption.

  9. How to choose L? There are three possibilities:

    1. C is empty.

    2. C is contained in a line; i.e., dimaffC=1.

    3. affC=P and dimaffC=2.

    affC cannot be larger than P since CP and P is affine.

  10. If C is empty, then any line in P passing through origin will do.

  11. If affC is a line.

    1. If affC passes through the origin, we can take a line perpendicular to affC passing through origin. In this case, LaffC={0}. Since 0C, hence LC=.

    2. If affC is a line that doesn’t pass through 0, we can simply take a line that is parallel to affC and passes through 0. In this case LaffC=. Consequently, LC=.

  12. If affC is the entire 2D subspace P.

    1. Consider the set K={tC|t>0}.

    2. CK (specifically, for t=0).

    3. K is a convex cone containing C but not containing 0.

    4. K is relatively open since C is relatively open.

    5. A boundary ray of K doesn’t intersect with C since K is relatively open.

    6. Take L to be any of the two boundary rays of K and extend it to a line passing through 0.

9.12.3. Proper Separation Characterization#

Theorem 9.161 (Characterization of proper separation)

Let V be an n-dimensional real vector space. Let S and T be nonempty subsets of V. There exists a hyperplane H that separates S and T properly if and only if there exists a vector aV such that

inf{x,a|xS}sup{x,a|xT}

and

sup{x,a|xS}>inf{x,a|xT}.

Proof. Assume that there is some vector aV satisfying the conditions above.

  1. Since a satisfies the second strict inequality, hence a0. Otherwise, the second inequality cannot hold.

  2. Choose any bR such that

    infxS{x,a}bsupxS{x,a}.
  3. Then, the set H={x|x,a=b} is a hyperplane.

  4. Let H+={x|x,ab} and H={x|x,ab}.

  5. Clearly, SH+ and TH.

  6. The second strict inequality ensures that both S and T cannot be contained in H.

    1. For contradiction, assume SH and TH

    2. Then sup{x,a|xS}=b and inf{x,a|xT}=b.

    3. Thus, the second inequality will not hold.

  7. Thus, S and T are properly separated.

Now, assume that S and T are properly separated by some hyperplane H.

  1. Let H={x|x,a=b} be the specification of the said hyperplane.

  2. Let H+={x|x,ab} and H={x|x,ab}.

  3. Without loss of generality, assume that SH+ and TH+.

  4. Then, x,ab for every xS.

  5. Thus, inf{x,a|xS}b.

  6. Similarly, x,ab for every xT.

  7. Thus, sup{x,a|xT}b.

  8. Thus, the first inequality is satisfied.

  9. Since H properly separates S and T, hence either SH or TH.

  10. If SH, then there exists xS such that x,a>b.

  11. Consequently, sup{x,a|xS}>b.

  12. If TH, then there exists xT such that x,a<b.

  13. Consequently, inf{x,a|xT}<b.

  14. Combining,

    sup{x,a|xS}>inf{x,a|xT}

    must be true.

9.12.3.1. Proper Separation : Set and Point#

Remark 9.10 (Proper separation between a set and a point)

Let V be an n-dimensional real vector space. We consider the case of proper separation between a set S and a point p. We can form a set T={p}. Then the proper separation between S and p can be described in terms of proper separation between S and T.

Then

sup{x,a|xT}=inf{x,a|xT}=p,a.

Hence, S and p are properly separated if and only if there exists a vector aV such that

inf{x,a|xS}p,a and sup{x,a|xS}>p,a.

We now consider a hyperplane

H={xV|x,a=p,a}.

And let H+ denote one of its closed half-spaces

H+={xV|x,ap,a}.

We can clearly see that

  1. aH.

  2. SH+. S is contained entirely in the closed half-space H+.

  3. SH. S is not contained entirely in H.

Thus, if a set and a point are properly separated, then there exists a hyperplane that passes through the point, contains the set in one of its half-spaces but does not fully contain the set itself.

9.12.4. Separating Hyperplane Theorems#

Theorem 9.162 (Separating hyperplane theorem I)

Let V be an n-dimensional real vector space. Let S and T be nonempty convex subsets of V. There exists a hyperplane H that separates S and T properly if and only if riSriT=.

In other words, two sets are properly separated if and only if their relative interiors are disjoint.

Proof. Let C=ST.

  1. By Theorem 9.142, riS and riT are nonempty since S and T are nonempty convex sets.

  2. By Theorem 9.157, riC=riSriT and it is nonempty.

  3. Note that, 0riC if and only if riSriT=.

    1. If 0riC then there exists xriSriT such that xx=0.

Now assume that riSriT=.

  1. Thus, 0riC.

  2. Consider the affine set M={0}.

  3. Clearly, MriC= and riC is relatively open.

  4. By Theorem 9.160, there exists a hyperplane containing M such that riC is a subset of one of its associated open halfspaces.

  5. Let H be this hyperplane given by H={x|x,a=0}. Since the hyperplane contains M, hence it is a subspace.

  6. By Theorem 9.149, CclCclriC.

  7. Thus, C is contained in the corresponding closed halfspace.

  8. Without loss of generality, assume that C is contained in the nonnegative halfspace H+.

  9. Thus, inf{x,a|xC}0.

  10. This means that

    inf{x,a|xC}=inf{x,a|xS}sup{x,a|xT}0.
  11. Thus,

    inf{x,a|xS}sup{x,a|xT}.
  12. Since riCH++, there exists xC, such that x,a>0.

  13. Thus, sup{x,a|xC}>0.

  14. But then,

    sup{x,a|xC}=sup{x,a|xS}inf{x,a|xT}>0.
  15. Thus,

    sup{x,a|xS}>inf{x,a|xT}.
  16. Then, by Theorem 9.161, S and T are properly separated.

Now, for the converse, assume that S and T are properly separated. Thus, there exists a vector aV such that

inf{x,a|xS}sup{x,a|xT}

and

sup{x,a|xS}>inf{x,a|xT}.
  1. From the first inequality, we get that inf{x,a|xC}0.

  2. From the second inequality, we get that sup{x,a|xC}>0.

  3. Thus, there exists a hyperplane H given by H={x|x,a=0} such that the corresponding nonnegative closed halfspace H+={x|x,a0} contains C.

  4. Note that

    intH+=riH+=H++={x|x,a>0}.
  5. Since sup{x,a|xC}>0, hence H++C.

  6. CclH++=H+ but CclH++riH++=H+H++=H.

  7. Thus, by Theorem 9.156, riCriH++=H++.

  8. Thus, 0riC.

  9. Thus, riSriT=.

We are done.

Corollary 9.10 (Separating hyperplane theorem II)

Let V be an n-dimensional real vector space. Let S and T be nonempty disjoint convex subsets of V; i.e., ST=.

Then, there exists a hyperplane that properly separates them.

Proof. Since ST=, hence riSriT=. Then, applying Theorem 9.162, there exists a hyperplane that properly separates S and T.

This result is stronger than proposition 2.4.2 in [9].

Disjointness of convex sets is not enough for strong separation as their closures might meet.

Theorem 9.163 (Separating hyperplane theorem III)

Let V be an n-dimensional real vector space. Let S be a nonempty convex subset of V. Let aV be a vector. There exists a hyperplane H that separates S and a properly if and only if ariS.

In other words, a point can be properly separated from a convex set if and only if it doesn’t belong to its relative interior.

Proof. .

  1. We form a set T={a}.

  2. Then riT={a} (Theorem 9.133).

  3. By Theorem 9.162, S and T are properly separated if and only if

    riSriT=.
  4. In other words,

    riS{a}=.
  5. In other words, ariS.

9.12.5. Strong Separation Characterization#

Theorem 9.164 (Characterization of strong separation)

Let V be an n-dimensional real vector space. Let S and T be nonempty convex subsets of V. There exists a hyperplane H that separates S and T strongly if and only if

inf{xy|xS,yT}>0.

In other words, H strongly separates S and T if and only if 0cl(ST).

Proof. Let C=ST. Then

inf{xy|xS,yT}=inf{v|vC}.

By Theorem 4.51, inf{v|vC}=0 if and only if 0clC.

Thus, inf{v|vC}>0 if and only if 0clC.

Assume that S and T are strongly separated. Let H be the hyperplane that separates S and T. Let H++ be the positive open halfspace. Let H be the negative open halfspace.

  1. Then, there exists an r>0 such that S+rBH++ and T+rBH.

  2. Since H++H=, hence (S+rB)(T+rB)=.

  3. Then, for any xS and yT and every u,vB,

    x+ru(y+rv)>0

    as x+ru=y+rv would mean that (S+rB)(T+rB)

  4. Note that, x+ru(y+rv)=(xy)r(vu).

  5. Then, xy2r must be true for every xS and yT.

    1. For contradiction, assume that xy<2r for some xS and yT.

    2. Let u=12r(yx).

    3. Let v=12r(xy).

    4. Note that u<1 and v<1.

    5. Thus, u,vB.

    6. Then, r(vu)=xy.

    7. Then, (xy)r(vu)=0=0.

    8. This contradicts the condition that x+ru(y+rv)>0 for every xS, yT and u,vB.

  6. Thus, inf{xy|xS,yT}>0.

Conversely, assume that inf{xy|xS,yT}>0.

  1. Let inf{xy|xS,yT}=2r where r>0.

  2. Then, xy2rxS,yT.

  3. Then, (S+rB)(T+rB)=.

    1. For contradiction, assume that (S+rB)(T+rB).

    2. Let a(S+rB)(T+rB).

    3. Then, there exists xS and uB such that a=x+ru.

    4. And, there exists yT and vB such that a=y+rv.

    5. Then, x+ru=y+rv.

    6. Thus, xy=r(vu).

    7. Thus,

      xy=rvur(v+u)<r(1+1)=2r.
    8. This contradictions our hypothesis that xy2rxS,yT.

  4. Note that S+rB and T+rB are convex and disjoint.

  5. Then, due to Corollary 9.10, there exists a hyperplane H which separates S+rB and T+rB properly.

  6. We can choose H so that S+rBH+ and T+rBH.

  7. But then, S+r2BH++ and T+r2BH lie in the opposite open halfspaces.

  8. Thus, S and T are strongly separated.

9.12.6. Strong Separation Conditions#

We describe several conditions which are sufficient to achieve strong separation between two sets. See also strict separation theorem (proposition 2.4.3) of [9].

Theorem 9.165 (Strong separation: closed subtraction)

Let V be an n-dimensional real vector space. Let S and T be two disjoint nonempty convex subsets of V. There exists a hyperplane H that separates S and T strongly if ST is closed.

Proof. We proceed as follows.

  1. Since S and T are disjoint hence 0ST.

  2. Since ST is closed, hence 0cl(ST)=ST.

  3. By Theorem 9.164, S and T are strongly separated.

Theorem 9.166 (Strong separation: closed and compact sets)

Let V be an n-dimensional real vector space. Let S and T be two disjoint nonempty convex subsets of V. There exists a hyperplane H that separates S and T strongly if S is closed and T is compact.

Proof. We proceed as follows.

  1. Since S and T are disjoint hence 0ST.

  2. Since ST is closed, hence 0cl(ST)=ST.

  3. By Theorem 9.164, S and T are strongly separated.

9.12.6.1. Recession Cones#

The conditions below are based on the theory of recession cones and lineality spaces of convex sets. See Recession Cones for a background.

Theorem 9.167 (Strong separation: recession cones)

Let V be an n-dimensional real vector space. Let S and T be two disjoint nonempty closed and convex subsets of V. There exists a hyperplane H that separates S and T strongly if

RSRT=LSLT

where RX denotes the recession cone and LX denotes the lineality space of a set X.

Proof. Due to Corollary 9.18, ST is closed. Then due to Theorem 9.165, S and T are strongly separated.

9.12.6.2. Strongly Separating Hyperplane#

Recall that the (orthogonal) projection of a vector v on a convex set C is the vector xC which is nearest to v under the (Euclidean) norm. In particular, if vC then v is its own projection on C. The projection of the vector 0 on a convex set C is this the vector of C with the minimum norm. See Projection on Convex Sets for details.

Theorem 9.168 (Strongly separating hyperplane)

Let V be an n-dimensional real vector space. Let S and T be two disjoint nonempty and convex subsets of V. Assume that ST is closed.

Consider the vector of minimum norm (projection of the origin) in ST given by v=st where sS and tT.

  1. Let a=12v=12(st).

  2. Let b=12(s+t).

  3. Let c=b,a.

Then the hyperplane H given by

H={xV|x,a=c}

strongly separates S and T. In other words,

x1,a>c>x2,ax1S,x2T.

Proof. .

  1. Since S and T are disjoint hence st0.

  2. Hence a0.

  3. s is the nearest point in clS from clT.

  4. t is the nearest point in clT from clS.

  5. The line segment [s,t] connects these nearest points.

  6. b lies on this line segment.

  7. Accordingly, s is projection of b in clS.

  8. Similarly t is projection of b in clT.

  9. Then, due to Theorem 10.7 (orthogonal projection characterization), for every xS

    xs,bs0.
  10. But

    bs=12(s+t)s=12(ts)=a.
  11. Hence xs,a0.

  12. Hence x,as,a.

  13. Further

    s,a=sb+b,a=a+b,a=a22+b,a=a22+c>c.

    since a0.

  14. Hence for every xS, we have x,a>c.

  15. A similar argument shows that for every xT, we have x,a<c.

9.12.7. Closed Convex Sets#

Theorem 9.169 (Strict separation theorem)

Let V be a real n-dimensional vector space. Let CV be a nonempty closed convex set. Let yC. Then, there exists pV and αR such that

y,p>α and x,pαxC.

In other words, there exists a separating hyperplane such that C is contained in one of its (closed) halfspaces and y is not in that halfspace (i.e., it is in the opposite open halfspace).

By choosing any β(α,y,p), we also have

y,p>β and x,p<βxC.

Proof. This is an application of strong separation.

  1. Define a set D={y}.

  2. Since C is closed and yC, hence y is not a closure point of C.

  3. Thus, there exists r>0 such that B(y,r)C=.

  4. Thus, d(C,D)r.

  5. Then, by Theorem 9.164, C and D are strongly separated.

  6. Let H be a hyperplane which strongly separates them.

  7. Then one of the closed halfspaces of H contains C but not y.

  8. Let H be described by

    H={xV|x,p=α}.
  9. We can negate p,α if necessary so that

    CH={xV|x,pα}.
  10. Accordingly, y,p>α.

Theorem 9.170 (Closed Convex = Intersection of Halfspaces)

Let V be a real n-dimensional vector space. A closed convex set C of V is the intersection of all the closed halfspaces that contain it.

Proof. The set V is closed and convex but no halfspace contains it. So the statement is vacuously true.

The empty set is closed and convex and every halfspace contains it. The intersection of all halfspaces
is the empty set. Thus, the statement is vacuously true.

Let us assume that C is nonempty and not equal to V.

  1. Let aV such that aC.

  2. Since C is closed, hence a is not a closure point of C.

  3. Thus, there exists r>0 such that B(a,r)C=.

  4. Thus, ax>0xC. Specifically, inf{ax|xC}r.

  5. Thus, by Theorem 9.164, {a} and C are strongly separated by some hyperplane H.

  6. Thus, one of the corresponding closed halfspaces contains C but not a.

  7. In other words, for every aC, there exists a closed halfspace that doesn’t contain a but contains C.

  8. Thus, the intersection of all halfspaces containing C cannot contain any element from VC.

    1. If the intersection contained an element aC, then there would be a closed halfspace containing C but not a which would mean that the intersection of all halfspaces containing C cannot contain a.

  9. Hence, the intersection of all closed halfspaces containing C is exactly equal to C.

We now have two different characterizations of closed convex sets.

  1. From Theorem 9.122, a closed and convex set is the closure of the union of all the line segments connecting the points of the set.

  2. From Theorem 9.170, a closed and convex set is the intersection of all the closed half-spaces containing it.

9.12.8. Supporting Hyperplanes#

Definition 9.62 (Supporting hyperplane and halfspaces)

Let V be a real n-dimensional vector space. Let SV. Let x0bdS be a point on its boundary.

If there exists a nonzero vector aV such that x,ax0,a for every xS, then the hyperplane H given by

H={x|x,a=x0,a}

is called a supporting hyperplane of S at x0.

H separates {x0} and S and H contains x0. The halfspace {x|(xx0),a0} corresponding to H is called a supporting halfspace of S at x0.

Convex sets have this beautiful property that there exists a supporting hyperplane at every point in the boundary of the set.

Theorem 9.171 (Supporting hyperplane theorem)

Let V be a real n-dimensional vector space. Let C be a nonempty convex subset of V. Let xbdC be any point in the boundary of C. Then, there exists a supporting hyperplane to C at x.

Proof. Consider the case where intC=.

  1. Then, due to Theorem 9.125, dimaffC<n.

  2. Thus, there exists a hyperplane H such that affCH.

  3. Since V is finite dimensional, hence H is closed (Theorem 4.200).

  4. Thus, clCH.

  5. Thus, xH.

  6. This H trivially separates {x} and C as both are contained in H.

Now, assume that intC.

  1. dimaffC=n.

  2. riC=intC.

  3. Since xbdC, hence the sets {x} and intC are disjoint.

  4. ri{x}={x} (Theorem 9.133).

  5. We have ri{x}riC=.

  6. By Theorem 9.162, there exists a hyperplane H that separates {x} and C properly.

  7. Consequently C lies entirely in one of the closed halfspaces of H.

Corollary 9.11

Let V be a real n-dimensional vector space. A closed convex set C of V is the intersection of all the supporting halfspaces that contain it.