9.1. Real Vector Spaces

We recall that

  • \(\RR\) denotes the real line

  • \(\ERL\) denotes the extended real line

  • \(\RR_+\) denotes the set of nonnegative reals.

  • \(\RR_{++}\) denotes the set of positive reals.

We shall concern ourselves with subsets of real vector spaces. The scalar field associated with real vector spaces is \(\RR\). The vector spaces are denoted by \(\VV\) or \(\EE\).

  • We shall exclusively work with finite dimensional vector spaces.

  • The vector space is endowed with a real inner product whenever required.

  • The vector space is endowed with a norm induced by the inner product whenever required.

Examples of inner product spaces:

  • Euclidean space \(\RR^n\)

  • Space of matrices \(\RR^{m \times n}\)

  • Space of symmetric matrices \(\SS^n\).

\(\VV^*\) shall denote the dual space of a vector space \(\VV\). As discussed in Theorem 4.106, the vector spaces \(\VV\) and \(\VV^*\) are isomorphic. Therefore, we follow the convention that both \(\VV\) and \(\VV^*\) have exactly the same elements. The primary difference between \(\VV\) and \(\VV^*\) lies in the norm. If \(\VV\) is endowed with a norm \(\| \cdot \|\) then \(\VV^*\) is endowed with the dual norm \(\| \cdot \|_*\).

9.1.1. Affine Sets

Affine sets for a general vector space \(\VV\) over field \(\FF\) have been discussed in Affine Sets and Transformations. We recall the definitions and adapt them for real vector spaces.

For any \(\bx\) and \(\by\) in \(\VV\), points of the form \(t \bx + (1 - t) \by\) where \(t \in \RR\) form a line.

A subset \(C \subseteq \VV\) is called affine if \(C = t C + (1-t)C\) for all \(t \in \RR\). An affine set contains the line through any two of its points. Other terms used for affine sets are affine manifolds, affine varieties, linear varieties or flats. The empty set is affine. The whole vector space \(\VV\) is affine. Singletons (sets with a single point) are affine. Any line is affine.

A point of the form \(\bx = t_1 \bx_1 + \dots + t_k \bx_k\) where \(t_1 + \dots + t_k = 1\) with \(t_i \in \RR\) and \(\bx_i \in \VV\), is called an affine combination of the points \(\bx_1,\dots,\bx_k\). An affine set contains all its affine combinations. An affine combination of affine combinations is an affine combination.

Let \(C\) be a nonempty affine set and \(\bx_0\) be any element in \(C\). Then the set

\[ V = C - \bx_0 = \{ \bx - \bx_0 | \bx \in C\} \]

is a linear subspace of \(\VV\). \(C\) can be written as \(C = V + \bx_0\). A nonempty affine set is a translated linear subspace. The linear subspace associated with \(C\) is independent of the choice of \(\bx_0\) in \(C\). A nonempty affine set is called an affine subspace. The affine dimension of an affine subspace is the dimension of the associated linear subspace.

The set of all affine combinations of points in some arbitrary nonempty set \(S \subseteq \VV\) is called the affine hull of \(S\) and is denoted by \(\affine S\). An affine hull is an affine subspace. The affine hull of a nonempty set \(S\) is the smallest affine subspace containing \(S\).

A set of vectors \(\bv_0, \bv_1, \dots, \bv_k \in \VV\) is called affinely independent if the vectors \(\bv_1 - \bv_0, \dots, \bv_k - \bv_0\) are linearly independent.
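The definition above can be checked mechanically. The following is a minimal sketch, assuming \(\VV = \RR^n\) with points given as tuples of numbers; the helper names `rank` and `is_affinely_independent` are hypothetical and not part of the text. It forms the differences \(\bv_i - \bv_0\) and tests whether their rank equals their count, using exact rational arithmetic.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        # Look for a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_affinely_independent(points):
    """True if v_1 - v_0, ..., v_k - v_0 are linearly independent."""
    v0, rest = points[0], points[1:]
    diffs = [[a - b for a, b in zip(v, v0)] for v in rest]
    return rank(diffs) == len(diffs)

# Three non-collinear points in the plane are affinely independent,
# three collinear points are not.
print(is_affinely_independent([(0, 0), (1, 0), (0, 1)]))
print(is_affinely_independent([(0, 0), (1, 1), (2, 2)]))
```

The choice of \(\bv_0\) as the base point is arbitrary here; any other point of the set could serve as the base without changing the answer.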

9.1.2. Linear Functionals

Recall that a linear functional \(f: \VV \to \RR\) on a real inner product space, corresponding to a vector \(\ba \in \VV^*\), is given by

\[ f(\bx) = \langle \bx, \ba \rangle. \]

Theorem 9.1 (Linear functionals are continuous)

Let \(f: \VV \to \RR\) be a linear functional corresponding to a nonzero vector \(\ba \in \VV^*\) given by

\[ f(\bx) = \langle \bx, \ba \rangle. \]

The linear functional is uniformly continuous.

Proof. Let \(\bx, \by \in \VV\).

\[\begin{split} |f(\bx) - f(\by) | &= |\langle \bx, \ba \rangle - \langle \by, \ba \rangle| \\ &= |\langle \bx - \by, \ba \rangle|\\ &\leq \| \bx - \by \| \| \ba \|_* \end{split}\]

due to the generalized Cauchy-Schwarz inequality.

  1. Let \(\epsilon > 0\).

  2. Let \(\delta = \frac{\epsilon}{\| \ba \|_*}\). Clearly, \(\delta > 0\).

  3. Assume \(d(\bx, \by) = \| \bx - \by \| < \delta\).

  4. Then

    \[ |f(\bx) - f(\by) | \leq \| \bx - \by \| \| \ba \|_* < \delta \| \ba \|_* = \epsilon. \]
  5. Thus, for any \(\bx, \by \in \VV\), \(\| \bx - \by \| < \delta \implies |f(\bx) - f(\by) | < \epsilon\).

  6. Thus, \(f\) is uniformly continuous.
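The bound used in the proof can be illustrated numerically. This is a minimal sketch, assuming \(\VV = \RR^2\) with the standard inner product, so that the dual norm \(\| \cdot \|_*\) coincides with the Euclidean norm; the vector `a`, the points `x`, `y` and the helper `delta_for` are hypothetical choices made only for illustration.

```python
import math

def inner(x, y):
    """Standard inner product on R^n (assumed model for V)."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm2(x):
    """Euclidean norm; self-dual, so it also plays the role of the dual norm."""
    return math.sqrt(inner(x, x))

def delta_for(eps, a):
    """The delta chosen in step 2 of the proof: eps / ||a||_*."""
    return eps / norm2(a)

a = [3.0, -4.0]                      # hypothetical nonzero vector with ||a|| = 5
f = lambda x: inner(x, a)            # the linear functional f(x) = <x, a>

x = [1.0, 2.0]
y = [1.5, 1.0]
lhs = abs(f(x) - f(y))
rhs = norm2([xi - yi for xi, yi in zip(x, y)]) * norm2(a)
print(lhs <= rhs)                    # the Cauchy-Schwarz bound from the proof
print(delta_for(0.5, a))             # a delta that works for eps = 0.5
```

Note that `delta_for` depends only on `eps` and `a`, not on the points `x` and `y`; this is exactly what makes the continuity uniform.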

9.1.3. Hyper Planes

Hyperplanes for general vector spaces are described in Definition 4.87 in terms of linear functionals. Here, we focus specifically on hyperplanes in a real inner product space.

Definition 9.1 (Hyperplane)

A hyperplane is a set of the form

\[ H_{\ba, b} \triangleq \{ \bx \ST \langle \bx, \ba \rangle = b \} \]

where \(\ba \in \VV^*, \ba \neq \bzero\) and \(b \in \RR\). The vector \(\ba\) is called the normal vector to the hyperplane.

  • Algebraically, it is a solution set of a nontrivial linear equation. Thus, it is an affine set.

  • Geometrically, it is a set of points with a constant inner product to a given vector \(\ba\).

  • The representation of \(H_{\ba, b}\) is unique up to a common nonzero multiple. In other words,

    \[ H_{\ba, b} = H_{\alpha \ba, \alpha b} \Forall \alpha \neq 0. \]
  • Every other normal of \(H_{\ba, b}\) is either a positive or negative multiple of \(\ba\).

  • Thus, we can think of \(H_{\ba, b}\) having two sides, one along the normal \(\ba\) and one opposite to the normal.

Theorem 9.2

A hyperplane is affine.

Proof. Let \(H\) be a hyperplane given by

\[ H = \{ \bx \ST \langle \bx, \ba \rangle = b \} \]

where \(\ba \in \VV^*, \ba \neq \bzero\) and \(b \in \RR\).

  1. Let \(\bx, \by \in H\) and \(t \in \RR\).

  2. Let \(\bz = t \bx + (1-t) \by\).

  3. Then,

    \[ \langle t \bx + (1-t) \by, \ba \rangle = t \langle \bx, \ba \rangle + (1-t)\langle \by, \ba \rangle = t b + (1-t) b = b. \]
  4. Thus, \(\bz \in H\).

  5. Thus, \(H\) is affine.
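The computation in step 3 can be replayed on concrete numbers. The following is a small sketch, assuming \(\VV = \RR^2\) with the standard inner product; the normal vector, the offset and the two points are hypothetical values chosen so that all arithmetic is exact in integers.

```python
def inner(x, y):
    """Standard inner product on R^n (assumed model for V)."""
    return sum(xi * yi for xi, yi in zip(x, y))

def in_hyperplane(x, a, b):
    """True if <x, a> == b. Exact for integer inputs."""
    return inner(x, a) == b

def affine_point(x, y, t):
    """The point t x + (1 - t) y on the line through x and y."""
    return [t * xi + (1 - t) * yi for xi, yi in zip(x, y)]

a, b = [1, 2], 4          # hypothetical normal vector and offset
x, y = [4, 0], [0, 2]     # two points of H_{a,b}

for t in (-3, 0, 2, 5):   # t ranges over all of R, not just [0, 1]
    z = affine_point(x, y, t)
    print(t, z, in_hyperplane(z, a, b))

# Scaling a and b by the same nonzero factor describes the same hyperplane.
print(in_hyperplane(x, [3, 6], 12))
```

Every point printed by the loop should lie in the hyperplane, since the whole line through `x` and `y` is contained in it.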

Theorem 9.3 (Hyperplane second form)

Let \(\bx_0\) be an arbitrary element in \(H_{\ba, b}\). Then

\[ H_{\ba, b} = \{ \bx \ST \langle \bx-\bx_0, \ba \rangle = 0\}. \]

Proof. Since \(\bx_0 \in H_{\ba, b}\), we have \(\langle \bx_0, \ba \rangle = b\). Hence, for any \(\bx \in \VV\),

\[ \langle \bx, \ba \rangle = b \iff \langle \bx, \ba \rangle = \langle \bx_0, \ba \rangle \iff \langle \bx - \bx_0, \ba \rangle = 0. \]

Thus, \(\bx \in H_{\ba, b}\) if and only if \(\langle \bx - \bx_0, \ba \rangle = 0\), which gives \(H_{\ba, b} = \{ \bx \ST \langle \bx-\bx_0, \ba \rangle = 0\}\).

Recall that the orthogonal complement of \(\ba\) is defined as

\[ \ba^{\perp} = \{ \bv \in \VV \ST \ba \perp \bv \}; \]

i.e., the set of all vectors that are orthogonal to \(\ba\).

Theorem 9.4 (Hyperplane third form)

Let \(\bx_0\) be an arbitrary element in \(H_{\ba, b}\). Then

\[ H_{\ba, b} = \bx_0 + \ba^{\perp}. \]

Proof. Consider the set

\[ S = \bx_0 + \ba^{\perp}. \]

Every element \(\bx \in S\) can be written as \(\bx = \bx_0 + \bv\) such that \(\langle \bv, \ba \rangle = 0\). Thus,

\[ \langle \bx , \ba \rangle = \langle \bx_0 , \ba \rangle = b. \]

Thus, \(S \subseteq H\).

For any \(\bx \in H\):

\[ \langle \bx - \bx_0, \ba \rangle = b - b = 0. \]

Thus, \(\bx - \bx_0 \in \ba^{\perp}\). Thus, \(\bx \in \bx_0 + \ba^{\perp} = S\). Thus, \(H \subseteq S\).

Combining:

\[ H = S = \bx_0 + \ba^{\perp}. \]

In other words, the hyperplane consists of an offset \(\bx_0\) plus all vectors orthogonal to the (normal) vector \(\ba\).
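The decomposition \(\bx = \bx_0 + \bv\) with \(\bv \in \ba^{\perp}\) can be sketched in code. This example assumes \(\VV = \RR^2\) with the standard inner product and picks the particular offset \(\bx_0 = \frac{b}{\langle \ba, \ba \rangle} \ba\), which is one convenient point of the hyperplane and not a choice made in the text; the function names are hypothetical. Exact fractions are used so that orthogonality can be tested with equality.

```python
from fractions import Fraction

def inner(x, y):
    """Standard inner product on R^n (assumed model for V)."""
    return sum(xi * yi for xi, yi in zip(x, y))

def hyperplane_point(a, b):
    """One particular point of H_{a,b}: x0 = (b / <a, a>) a."""
    scale = Fraction(b) / inner(a, a)
    return [scale * ai for ai in a]

def offset_part(x, x0):
    """The vector v = x - x0."""
    return [xi - x0i for xi, x0i in zip(x, x0)]

a, b = [1, 2], 4                 # hypothetical normal vector and offset
x0 = hyperplane_point(a, b)      # lies in the hyperplane since <x0, a> = b
x = [4, 0]                       # another point of the hyperplane
v = offset_part(x, x0)

print(inner(x0, a) == b)         # x0 is in H_{a,b}
print(inner(v, a) == 0)          # v is orthogonal to a
```

Any other point of the hyperplane could serve as the offset; only the orthogonal part `v` changes with that choice.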

Observation 9.1

A hyperplane is an affine subspace since \(\ba^{\perp}\) is a linear subspace and \(H\) is the linear subspace plus an offset \(\bx_0\).

9.1.4. Half Spaces

Definition 9.2 (halfspace)

A hyperplane divides \(\VV\) into two halfspaces. The two (closed) halfspaces are given by

\[ H_+ = \{ \bx : \langle \bx, \ba \rangle \geq b \} \]

and

\[ H_- = \{ \bx : \langle \bx, \ba \rangle \leq b \} \]

The halfspace \(H_+\) extends in the direction of \(\ba\) while \(H_-\) extends in the direction of \(-\ba\).

  • A halfspace is the solution set of one (nontrivial) linear inequality.

  • The halfspace can be written alternatively as

\[\begin{split} H_+ = \{ \bx \ST \langle \bx - \bx_0, \ba \rangle \geq 0\}\\ H_- = \{ \bx \ST \langle \bx - \bx_0, \ba \rangle \leq 0\} \end{split}\]

where \(\bx_0\) is any point in the associated hyperplane \(H\).

  • Geometrically, for \(\bx \in H_+\) the vector \(\bx - \bx_0\) makes an acute or right angle with \(\ba\), while for \(\bx \in H_-\) the vector \(\bx - \bx_0\) makes an obtuse or right angle with \(\ba\).
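The two descriptions of a halfspace can be compared directly. The following is a minimal sketch, assuming \(\VV = \RR^2\) with the standard inner product; the function names and the sample data are hypothetical. It classifies points by the sign of \(\langle \bx, \ba \rangle - b\) and confirms that the test \(\langle \bx - \bx_0, \ba \rangle \geq 0\) agrees with \(\langle \bx, \ba \rangle \geq b\).

```python
def inner(x, y):
    """Standard inner product on R^n (assumed model for V)."""
    return sum(xi * yi for xi, yi in zip(x, y))

def side(x, a, b):
    """+1 if <x, a> > b, -1 if <x, a> < b, 0 if x lies on the hyperplane."""
    d = inner(x, a) - b
    return (d > 0) - (d < 0)

def in_H_plus(x, a, b):
    """Membership in the closed halfspace H_+."""
    return inner(x, a) >= b

def in_H_plus_second_form(x, x0, a):
    """Membership in H_+ written as <x - x0, a> >= 0 for a point x0 on the hyperplane."""
    return inner([xi - x0i for xi, x0i in zip(x, x0)], a) >= 0

a, b = [1, 2], 4        # hypothetical normal vector and offset
x0 = [4, 0]             # a point on the hyperplane, since <x0, a> = 4

for p in ([3, 3], [0, 0], [4, 0], [0, 2], [-1, 5]):
    print(p, side(p, a, b), in_H_plus(p, a, b), in_H_plus_second_form(p, x0, a))
```

Points with `side` equal to 0 belong to both closed halfspaces, which reflects the fact that \(H_+ \cap H_-\) is the hyperplane itself.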

Definition 9.3 (Open halfspace)

The sets given by

\[\begin{split} \Interior{H_+} = \{ \bx | \langle \bx, \ba \rangle > b\}\\ \Interior{H_-} = \{ \bx | \langle \bx, \ba \rangle < b\} \end{split}\]

are called open halfspaces. They are the interiors of the corresponding closed halfspaces.

Theorem 9.5

A closed halfspace is a closed set.

Proof. Consider the halfspace

\[ H_+ = \{ \bx : \langle \bx, \ba \rangle \geq b \}. \]

Consider the linear functional \(f : \bx \mapsto \langle \bx, \ba \rangle\). We can see that

\[ H_+ = f^{-1}([b, \infty)). \]
  1. The interval \([b, \infty)\) is a closed subset of \(\RR\).

  2. Recall from Theorem 9.1 that \(f\) is uniformly continuous, hence continuous.

  3. Since \(f\) is continuous, \(f^{-1}([b, \infty))\) is closed by Theorem 3.42.

Similarly, for the halfspace

\[ H_- = \{ \bx : \langle \bx, \ba \rangle \leq b \}, \]

we can see that

\[ H_- = f^{-1}((-\infty, b]). \]
  1. The interval \((-\infty, b]\) is a closed subset of \(\RR\).

  2. Since \(f\) is continuous, \(f^{-1}((-\infty, b])\) is closed by Theorem 3.42.

Theorem 9.6

An open halfspace is an open set.

Proof. Consider the halfspace

\[ H_{++} = \{ \bx : \langle \bx, \ba \rangle > b \}. \]

Consider the linear functional \(f : \bx \mapsto \langle \bx, \ba \rangle\). We can see that

\[ H_{++} = f^{-1}((b, \infty)). \]
  1. The interval \((b, \infty)\) is an open interval in \(\RR\).

  2. Since \(f\) is continuous, \(f^{-1}((b, \infty))\) is open by Theorem 3.42.

Similarly, for the halfspace

\[ H_{--} = \{ \bx : \langle \bx, \ba \rangle < b \}, \]

we can see that

\[ H_{--} = f^{-1}((-\infty, b)). \]
  1. The interval \((-\infty, b)\) is an open interval in \(\RR\).

  2. Since \(f\) is continuous, \(f^{-1}((-\infty, b))\) is open by Theorem 3.42.

9.1.5. The \(\VV \oplus \RR\) Vector Space

While studying convex cones, we often find ourselves dealing with the set \(\VV \times \RR\). Since \(\RR\) is a vector space over \(\RR\) by itself (see Example 4.3), we can form the direct sum vector space \(\VV \oplus \RR\).
We provide an extended vector space structure below (with an inner product and a norm) which aligns with the vector space structure of \(\RR^{n+1}\) if \(\VV = \RR^n\).

Definition 9.4 (Direct sum \(\VV \oplus \RR\))

Let \(\VV\) be a real vector space. A vector space structure can be introduced to the set \(\VV \times \RR\) as per the following definitions.

  1. [Additive identity] Let \(\bzero\) be the additive identity for \(\VV\). Then, the additive identity for \(\VV \times \RR\) is given by \((\bzero, 0)\).

  2. [Vector addition] Let \((\bx, s)\) and \((\by, t)\) be in \(\VV \times \RR\). Then, their sum is defined as:

    \[ (\bx, s) + (\by, t) \triangleq (\bx + \by, s + t). \]
  3. [Scalar multiplication] Let \((\bx, s) \in \VV \times \RR\) and \(\alpha \in \RR\). Then, the scalar multiplication is defined as:

    \[ \alpha (\bx, s) \triangleq (\alpha \bx, \alpha s). \]
  4. [Inner product] If \(\VV\) is an inner product space, then for any \((\bx, s)\) and \((\by, t)\) in \(\VV \times \RR\), the inner product is defined as:

    \[ \langle (\bx, s), (\by, t) \rangle \triangleq \langle \bx, \by \rangle + st. \]
  5. [Norm] If \(\VV\) is a normed linear space, then for any \((\bx, s) \in \VV \times \RR\), the norm is defined as:

    \[ \| (\bx, s) \| \triangleq \sqrt{\| \bx \|^2 + s^2}. \]

\(\VV \times \RR\) equipped with these definitions is a vector space over \(\RR\) in its own right and is called the direct sum of \(\VV\) and \(\RR\) denoted by \(\VV \oplus \RR\).

Readers can verify that these definitions satisfy all the properties of real vector spaces, normed linear spaces and inner product spaces.
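The operations of Definition 9.4 translate almost verbatim into code. The following is a minimal sketch, assuming \(\VV = \RR^n\) with the standard inner product and representing an element \((\bx, s)\) as a Python pair of a list and a number; all function names are hypothetical.

```python
import math

def inner_V(x, y):
    """Inner product on V, here assumed to be the standard one on R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm_V(x):
    """Norm on V induced by inner_V."""
    return math.sqrt(inner_V(x, x))

def add(u, v):
    """(x, s) + (y, t) = (x + y, s + t)."""
    (x, s), (y, t) = u, v
    return ([xi + yi for xi, yi in zip(x, y)], s + t)

def scale(alpha, u):
    """alpha (x, s) = (alpha x, alpha s)."""
    x, s = u
    return ([alpha * xi for xi in x], alpha * s)

def inner(u, v):
    """<(x, s), (y, t)> = <x, y> + s t."""
    (x, s), (y, t) = u, v
    return inner_V(x, y) + s * t

def norm(u):
    """||(x, s)|| = sqrt(||x||^2 + s^2)."""
    x, s = u
    return math.sqrt(norm_V(x) ** 2 + s ** 2)

u = ([1.0, 2.0], 3.0)
v = ([4.0, 5.0], 6.0)
print(add(u, v))
print(scale(2.0, u))
print(inner(u, v))
print(norm(([3.0, 0.0], 4.0)))
```

With \(\VV = \RR^2\), these results agree with the usual operations on \(\RR^3\) applied to the flattened vectors \((1, 2, 3)\) and \((4, 5, 6)\), which is the alignment with \(\RR^{n+1}\) mentioned in the text.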

9.1.6. Norms

Remark 9.1 (Norms in \(n\)-dim real space)

Let \(\VV\) be an \(n\)-dimensional real inner product space endowed with an inner product \(\langle \cdot, \cdot \rangle : \VV \times \VV \to \RR\).

We have \(n = \dim \VV\).

\(\VV\) is isomorphic to the Euclidean space \(\RR^n\).

We can develop popular norms following the treatment in The Euclidean Space.

Let us introduce an orthonormal basis for \(\VV\) as \(\{\be_1, \dots, \be_n \}\).

For any \(\bx \in \VV\), we shall write its decomposition as

\[ \bx = \sum_{i=1}^n x_i \be_i. \]

The inner product expands to

\[ \langle \bx, \by \rangle = \sum_{i=1}^n x_i y_i \Forall \bx, \by \in \VV. \]

Introduce the norm induced by the inner product as

\[ \| \bx \| = \sqrt{\langle \bx, \bx \rangle} = \sqrt{\sum_{i=1}^n x_i^2}. \]

We shall also call it the \(\ell_2\) norm on \(\VV\).

Introduce the \(\ell_1\) norm as

\[ \| \bx \|_1 = \sum_{i=1}^n | x_i|. \]

Introduce the \(\ell_{\infty}\) norm as

\[ \| \bx \|_{\infty} = \max_{i=1,\dots,n} |x_i|. \]

We can generalize to \(\ell_p\) norms as

\[\begin{split} \| \bx \|_p = \begin{cases} \left ( \sum_{i=1}^{n} | x_i |^p \right ) ^ {\frac{1}{p}} & \text{ if } & p \in [1, \infty)\\ \underset{1 \leq i \leq n}{\max} |x_i| & \text{ if } & p = \infty \end{cases}\, . \end{split}\]

Hölder's inequality follows. Let \(\bu, \bv \in \VV\). Let \(p \in [1, \infty]\) and let \(q\) be its conjugate exponent. Then

\[ \| \bu \bv \|_1 \leq \| \bu \|_p \| \bv \|_q \]

where \(\bu \bv\) denotes the element-wise multiplication given by:

\[ \bu \bv = (u_1 v_1, \dots, u_n v_n). \]
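Hölder's inequality can be spot-checked numerically. This is a minimal sketch, assuming vectors are given by their coordinates in the orthonormal basis; `p_norm`, `conjugate` and `holder_gap` are hypothetical helper names, and the conjugate exponent is taken to satisfy \(\frac{1}{p} + \frac{1}{q} = 1\) with the usual convention that \(1\) and \(\infty\) are conjugate to each other.

```python
import math

def p_norm(x, p):
    """The l_p norm of a coordinate vector for p in [1, inf]."""
    if p == math.inf:
        return max(abs(xi) for xi in x)
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def conjugate(p):
    """Conjugate exponent q with 1/p + 1/q = 1."""
    if p == 1:
        return math.inf
    if p == math.inf:
        return 1.0
    return p / (p - 1.0)

def holder_gap(u, v, p):
    """||u||_p ||v||_q - ||u v||_1, which Hölder's inequality says is nonnegative."""
    q = conjugate(p)
    uv = [ui * vi for ui, vi in zip(u, v)]   # element-wise product
    return p_norm(u, p) * p_norm(v, q) - p_norm(uv, 1)

u = [1.0, -2.0, 3.0]
v = [4.0, 0.0, -1.0]
for p in (1, 1.5, 2, 3, math.inf):
    print(p, conjugate(p), holder_gap(u, v, p))
```

A finite number of samples proves nothing, of course; the sketch only makes the roles of \(p\), \(q\) and the element-wise product concrete.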

All norms on a finite dimensional vector space are equivalent. Some bounds between the norms introduced above are given below.

\[ \frac{1}{\sqrt{n}} \| \bx \|_1 \leq \| \bx \|_2 \leq \| \bx \|_1. \]
\[ \| \bx \|_2 \leq \| \bx \|_1 \leq \sqrt{n} \| \bx \|_2. \]
\[ \frac{1}{\sqrt{n}} \| \bx \|_2 \leq \| \bx \|_{\infty} \leq \| \bx \|_2. \]
\[ \| \bx \|_{\infty} \leq \| \bx \|_2 \leq \sqrt{n} \| \bx \|_{\infty}. \]
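These bounds can be tested on sample vectors. The following is a small sketch, assuming coordinate vectors in \(\RR^n\); the helper `check_bounds` and the tolerance `tol` (added to absorb floating point rounding) are hypothetical.

```python
import math
import random

def norm1(x):
    return sum(abs(xi) for xi in x)

def norm2(x):
    return math.sqrt(sum(xi * xi for xi in x))

def norm_inf(x):
    return max(abs(xi) for xi in x)

def check_bounds(x, tol=1e-9):
    """Check the four chains of inequalities between l_1, l_2 and l_inf norms."""
    root_n = math.sqrt(len(x))
    n1, n2, ni = norm1(x), norm2(x), norm_inf(x)
    return (
        n1 / root_n <= n2 + tol      # (1/sqrt(n)) ||x||_1 <= ||x||_2
        and n2 <= n1 + tol           # ||x||_2 <= ||x||_1
        and n2 / root_n <= ni + tol  # (1/sqrt(n)) ||x||_2 <= ||x||_inf
        and ni <= n2 + tol           # ||x||_inf <= ||x||_2
    )

random.seed(0)  # fixed seed so that the run is reproducible
samples = [[random.uniform(-10, 10) for _ in range(n)]
           for n in range(1, 9) for _ in range(50)]
print(all(check_bounds(x) for x in samples))
```

The extreme cases are instructive: a vector with a single nonzero coordinate makes all three norms equal, while a vector with all coordinates equal in magnitude attains the \(\sqrt{n}\) factors.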

The Euclidean distance between two vectors is defined as:

\[ d(\bx,\by) = \| \bx - \by \| = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}. \]

Open and closed balls

  • Let \(B(\bx, r)\) and \(B[\bx, r]\) represent the open and closed balls for the inner product induced norm.

  • Let \(B_p(\bx, r)\) and \(B_p[\bx, r]\) represent the open and closed balls for the \(\ell_p\) norm.

  • Let \(B_1(\bx, r)\) and \(B_1[\bx, r]\) represent the open and closed balls for the \(\ell_1\) norm.

  • Let \(B_2(\bx, r)\) and \(B_2[\bx, r]\) represent the open and closed balls for the \(\ell_2\) norm, which is the same as the inner product induced norm.

  • Let \(B_{\infty}(\bx, r)\) and \(B_{\infty}[\bx, r]\) represent the open and closed balls for the \(\ell_{\infty}\) norm.

We have the following containment relationships for the closed balls for different norms.

\[ B_1[\bx, r] \subseteq B_2[\bx, r] \subseteq B_{\infty}[\bx, r]. \]
\[ B_{\infty}[\bx, r] \subseteq B_2[\bx, r \sqrt{n}] \subseteq B_1[\bx, r n]. \]

These relationships are derived from the norm inequalities above. Similar relationships are applicable for open balls too.
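The containments can be illustrated with a membership test for closed balls. This is a minimal sketch, assuming coordinate vectors in \(\RR^2\); the function `in_closed_ball` and its tolerance parameter `tol` are hypothetical and only guard against rounding at the boundary.

```python
import math

def p_norm(x, p):
    """The l_p norm of a coordinate vector for p in [1, inf]."""
    if p == math.inf:
        return max(abs(xi) for xi in x)
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def in_closed_ball(y, x, r, p, tol=1e-12):
    """True if y lies in the closed l_p ball B_p[x, r]."""
    diff = [yi - xi for yi, xi in zip(y, x)]
    return p_norm(diff, p) <= r + tol

center, r, n = [0.0, 0.0], 1.0, 2

# A point of B_1[x, r] also lies in B_2[x, r] and in B_inf[x, r].
y = [0.5, 0.5]
print([in_closed_ball(y, center, r, p) for p in (1, 2, math.inf)])

# A corner of B_inf[x, r] escapes B_2[x, r] but lies in the enlarged balls
# B_2[x, r sqrt(n)] and B_1[x, r n].
z = [1.0, 1.0]
print(in_closed_ball(z, center, r, math.inf),
      in_closed_ball(z, center, r, 2),
      in_closed_ball(z, center, r * math.sqrt(n), 2),
      in_closed_ball(z, center, r * n, 1))
```

The corner point `z` shows that the enlargement factors \(\sqrt{n}\) and \(n\) in the second chain cannot be reduced.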