7.9. Subgaussian Distributions
In this section we review subgaussian distributions as well as random vectors and matrices whose entries are drawn from subgaussian distributions.
Examples of subgaussian distributions include:

- the Gaussian distribution,
- the Rademacher distribution taking values $\pm 1$ with equal probability,
- any zero-mean distribution with bounded support.
A random variable $X$ is called *$b$-subgaussian* if there exists a constant $b \geq 0$ such that

$$
\mathbb{E}\left[\exp(t X)\right] \leq \exp\left(\frac{b^2 t^2}{2}\right)
$$

holds for all $t \in \mathbb{R}$.

The definition means that for a subgaussian variable $X$, the moment generating function of $X$ is dominated by the moment generating function of a zero-mean Gaussian random variable with variance $b^2$.
(Gaussian r.v. as subgaussian r.v.)

Consider a zero-mean Gaussian random variable $X \sim \mathcal{N}(0, \sigma^2)$. Its moment generating function is

$$
\mathbb{E}[\exp(t X)] = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma} e^{t x} e^{-\frac{x^2}{2\sigma^2}}\, dx = \exp\left(\frac{\sigma^2 t^2}{2}\right).
$$

Putting $b = \sigma$, we see that the defining inequality is satisfied (with equality); hence $X$ is $\sigma$-subgaussian.
(Rademacher distribution)

Consider $X$ with the Rademacher distribution, i.e. $X$ takes the values $1$ and $-1$ with probability $\frac{1}{2}$ each. Then

$$
\mathbb{E}[\exp(t X)] = \frac{e^{t} + e^{-t}}{2} = \cosh t = \sum_{n=0}^{\infty} \frac{t^{2n}}{(2n)!} \leq \sum_{n=0}^{\infty} \frac{t^{2n}}{2^n\, n!} = \exp\left(\frac{t^2}{2}\right),
$$

where we used $(2n)! \geq 2^n\, n!$. Thus $X$ is $1$-subgaussian.
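As a quick numerical illustration (not part of the original text), the following minimal Python sketch, assuming NumPy is available, checks the inequality $\cosh t \leq \exp(t^2/2)$ on a grid of values of $t$.

```python
# Minimal numerical check that the Rademacher MGF cosh(t) stays below the
# Gaussian bound exp(t^2 / 2) for b = 1.
import numpy as np

t = np.linspace(-5, 5, 1001)
mgf_rademacher = np.cosh(t)          # E[exp(tX)] for X = +/-1 with prob 1/2
gaussian_bound = np.exp(t**2 / 2)    # exp(b^2 t^2 / 2) with b = 1

# The bound should hold for every t on the grid.
assert np.all(mgf_rademacher <= gaussian_bound + 1e-12)
print("max ratio cosh(t) / exp(t^2/2):", np.max(mgf_rademacher / gaussian_bound))
```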
(Uniform distribution)

Consider $X$ uniformly distributed over the interval $[-a, a]$ for some $a > 0$. Then

$$
\mathbb{E}[\exp(t X)] = \frac{1}{2a}\int_{-a}^{a} e^{t x}\, dx = \frac{e^{a t} - e^{-a t}}{2 a t} = \sum_{n=0}^{\infty} \frac{(a t)^{2n}}{(2n + 1)!}.
$$

But

$$
(2n + 1)! \geq 2^n\, n!.
$$

Thus

$$
\mathbb{E}[\exp(t X)] \leq \sum_{n=0}^{\infty} \frac{(a t)^{2n}}{2^n\, n!} = \sum_{n=0}^{\infty} \frac{1}{n!}\left(\frac{a^2 t^2}{2}\right)^{n} = \exp\left(\frac{a^2 t^2}{2}\right).
$$

Hence $X$ is $a$-subgaussian.
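The bound can also be checked numerically. The sketch below (illustration only, assuming NumPy) compares $\sinh(at)/(at)$ against $\exp(a^2 t^2 / 2)$ for a particular choice of $a$.

```python
# Numerical check of E[exp(tX)] = sinh(a t) / (a t) <= exp(a^2 t^2 / 2)
# for X ~ Uniform[-a, a].
import numpy as np

a = 2.0
t = np.linspace(1e-6, 5, 1000)                 # avoid t = 0 (the limit there is 1)
mgf_uniform = np.sinh(a * t) / (a * t)
gaussian_bound = np.exp(a**2 * t**2 / 2)

assert np.all(mgf_uniform <= gaussian_bound)
print("max ratio:", np.max(mgf_uniform / gaussian_bound))
```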
(Random variable with bounded support)

Consider a zero-mean random variable $X$ with bounded support, i.e. $|X| \leq B$ almost surely and $\mathbb{E}[X] = 0$, for some $B > 0$. Then, the following upper bound holds:

$$
\mathbb{E}[\exp(t X)] \leq \exp\left(\frac{B^2 t^2}{2}\right) \quad \forall t \in \mathbb{R},
$$

so $X$ is $B$-subgaussian. This result can be proven with some advanced calculus; it is a consequence of Hoeffding's lemma.
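The following Monte Carlo sketch (illustration only, not from the original text, assuming NumPy) probes this bound for one concrete bounded, zero-mean distribution; the particular choice of a shifted Beta distribution is just an example.

```python
# Monte Carlo check of E[exp(tX)] <= exp(B^2 t^2 / 2) for a zero-mean
# variable with |X| <= B.
import numpy as np

rng = np.random.default_rng(0)
# A bounded, zero-mean example: X = U - 1/2 with U ~ Beta(2, 2), so |X| <= 1/2.
samples = rng.beta(2, 2, size=1_000_000) - 0.5
B = 0.5

for t in [-4.0, -1.0, 0.5, 2.0, 4.0]:
    empirical_mgf = np.exp(t * samples).mean()
    bound = np.exp(B**2 * t**2 / 2)
    print(f"t = {t:+.1f}: E[exp(tX)] ~ {empirical_mgf:.4f} <= bound {bound:.4f}")
```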
There are some useful properties of subgaussian random variables.
(Mean and variance of subgaussian random variables)

If $X$ is $b$-subgaussian, then

$$
\mathbb{E}[X] = 0
$$

and

$$
\mathbb{E}[X^2] \leq b^2.
$$

Thus subgaussian random variables are always zero-mean. Their variance is always bounded by the variance of the bounding Gaussian distribution.
Proof. We proceed as follows. Note that, as $t \to 0$,

$$
\mathbb{E}[\exp(t X)] = 1 + t\,\mathbb{E}[X] + \frac{t^2}{2}\mathbb{E}[X^2] + o(t^2)
\quad \text{and} \quad
\exp\left(\frac{b^2 t^2}{2}\right) = 1 + \frac{b^2 t^2}{2} + o(t^2)
$$

(the expansion on the left is justified since a subgaussian variable has finite moments of all orders). But since $X$ is $b$-subgaussian, $\mathbb{E}[\exp(t X)] \leq \exp\left(\frac{b^2 t^2}{2}\right)$; hence, restating,

$$
1 + t\,\mathbb{E}[X] + \frac{t^2}{2}\mathbb{E}[X^2] + o(t^2) \leq 1 + \frac{b^2 t^2}{2} + o(t^2).
$$

Dividing throughout by $t > 0$ and letting $t \to 0^+$ we get $\mathbb{E}[X] \leq 0$. Dividing throughout by $t < 0$ (which reverses the inequality) and letting $t \to 0^-$ we get $\mathbb{E}[X] \geq 0$. Thus $\mathbb{E}[X] = 0$, so the first-order terms drop out. Now we are left with

$$
\frac{t^2}{2}\mathbb{E}[X^2] + o(t^2) \leq \frac{b^2 t^2}{2} + o(t^2).
$$

Dividing throughout by $t^2$ and letting $t \to 0$ we get $\mathbb{E}[X^2] \leq b^2$.
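The lemma is easy to illustrate numerically. The sketch below (not from the original text, assuming NumPy) uses the uniform distribution on $[-a, a]$, which is $a$-subgaussian, and shows its empirical mean near $0$ and its variance $a^2/3$ below $b^2 = a^2$.

```python
# Numerical illustration of the mean and variance bounds for a subgaussian
# variable: X ~ Uniform[-a, a] is a-subgaussian.
import numpy as np

rng = np.random.default_rng(1)
a = 3.0
x = rng.uniform(-a, a, size=1_000_000)
print("empirical mean    :", x.mean())    # close to 0
print("empirical variance:", x.var())     # close to a^2 / 3 = 3.0
print("bound b^2 = a^2   :", a**2)        # 9.0, an upper bound on the variance
```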
Subgaussian variables have a linear structure.
(Linearity of subgaussian variables)

If $X$ is $b$-subgaussian, then for any $\alpha \in \mathbb{R}$, the random variable $\alpha X$ is $|\alpha| b$-subgaussian.

If $X_1$ is $b_1$-subgaussian and $X_2$ is $b_2$-subgaussian, then $X_1 + X_2$ is $(b_1 + b_2)$-subgaussian (no independence is required).
Proof. Scalar multiplication:

Let $X$ be $b$-subgaussian. Then

$$
\mathbb{E}[\exp(t X)] \leq \exp\left(\frac{b^2 t^2}{2}\right) \quad \forall t \in \mathbb{R}.
$$

Now for $\alpha \neq 0$ (the case $\alpha = 0$ is trivial), we have

$$
\mathbb{E}[\exp(t\, \alpha X)] = \mathbb{E}[\exp((t\alpha) X)] \leq \exp\left(\frac{b^2 \alpha^2 t^2}{2}\right) = \exp\left(\frac{(|\alpha| b)^2 t^2}{2}\right).
$$

Hence $\alpha X$ is $|\alpha| b$-subgaussian.
Addition:

Consider $X_1$ as $b_1$-subgaussian and $X_2$ as $b_2$-subgaussian. Thus

$$
\mathbb{E}[\exp(t X_i)] \leq \exp\left(\frac{b_i^2 t^2}{2}\right), \quad i = 1, 2.
$$

Let $p, q > 1$ be two numbers s.t. $\frac{1}{p} + \frac{1}{q} = 1$. Using Hölder's inequality, we have

$$
\mathbb{E}[\exp(t(X_1 + X_2))] = \mathbb{E}[\exp(t X_1)\exp(t X_2)]
\leq \left(\mathbb{E}[\exp(p t X_1)]\right)^{1/p} \left(\mathbb{E}[\exp(q t X_2)]\right)^{1/q}
\leq \exp\left(\frac{\left(p\, b_1^2 + q\, b_2^2\right) t^2}{2}\right).
$$

Since this is valid for any $p > 1$ (with $q = \frac{p}{p-1}$), we can minimize the r.h.s. over $p$. It suffices to minimize the term

$$
g(p) = p\, b_1^2 + \frac{p}{p - 1}\, b_2^2.
$$

We have

$$
g'(p) = b_1^2 - \frac{b_2^2}{(p - 1)^2}.
$$

Equating it to 0 gives us

$$
p = 1 + \frac{b_2}{b_1}, \qquad q = \frac{p}{p - 1} = 1 + \frac{b_1}{b_2}.
$$

Taking the second derivative, we can verify that this is indeed a minimum value. Thus

$$
g(p) = \left(1 + \frac{b_2}{b_1}\right) b_1^2 + \left(1 + \frac{b_1}{b_2}\right) b_2^2 = b_1^2 + 2 b_1 b_2 + b_2^2 = (b_1 + b_2)^2.
$$

Hence we have the result

$$
\mathbb{E}[\exp(t(X_1 + X_2))] \leq \exp\left(\frac{(b_1 + b_2)^2 t^2}{2}\right) \quad \forall t \in \mathbb{R}.
$$

Thus $X_1 + X_2$ is $(b_1 + b_2)$-subgaussian.
If, in addition, $X_1$ and $X_2$ are independent, then the expectation factorizes as $\mathbb{E}[\exp(t(X_1 + X_2))] = \mathbb{E}[\exp(t X_1)]\,\mathbb{E}[\exp(t X_2)]$, and the sharper conclusion holds that $X_1 + X_2$ is $\sqrt{b_1^2 + b_2^2}$-subgaussian.

If $X_1, X_2, \dots, X_n$ are independent and each $X_i$ is $b_i$-subgaussian, then $X_1 + X_2 + \dots + X_n$ is $\sqrt{b_1^2 + \dots + b_n^2}$-subgaussian.
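The independent case is easy to probe numerically. The sketch below (illustration only, assuming NumPy) checks the MGF of a sum of two independent Rademacher variables against the bound with parameter $\sqrt{2}$.

```python
# For two independent Rademacher variables, each 1-subgaussian, the sum
# should be sqrt(2)-subgaussian.
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
x1 = rng.choice([-1.0, 1.0], size=n)
x2 = rng.choice([-1.0, 1.0], size=n)
s = x1 + x2
b = np.sqrt(2.0)   # sqrt(b1^2 + b2^2) for the independent case

for t in [0.5, 1.0, 2.0, 3.0]:
    empirical_mgf = np.exp(t * s).mean()      # the exact value is cosh(t)^2
    bound = np.exp(b**2 * t**2 / 2)
    print(f"t = {t}: E[exp(tS)] ~ {empirical_mgf:.3f} <= {bound:.3f}")
```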
(Subgaussian moment)

For a centered random variable $X$, the subgaussian moment is defined as

$$
\sigma(X) = \inf\left\{ b \geq 0 \;:\; \mathbb{E}[\exp(t X)] \leq \exp\left(\frac{b^2 t^2}{2}\right) \;\; \forall t \in \mathbb{R} \right\},
$$

with $\sigma(X) = \infty$ if no such $b$ exists. Thus $X$ is subgaussian if and only if $\sigma(X)$ is finite.

We can also show that $\sigma(\cdot)$ behaves like a norm on the space of subgaussian random variables: it is homogeneous, it satisfies the triangle inequality by the linearity result above, and it vanishes only for $X = 0$ almost surely.

For a centered Gaussian r.v. $X \sim \mathcal{N}(0, \sigma^2)$, the subgaussian moment coincides with the standard deviation: $\sigma(X) = \sigma$.
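By the definition, $\sigma(X)^2 = \sup_{t \neq 0} \frac{2 \ln \mathbb{E}[\exp(t X)]}{t^2}$. The sketch below (illustration only, assuming NumPy) estimates this supremum numerically for the Rademacher distribution, whose subgaussian moment equals $1$.

```python
# Numerically estimate the subgaussian moment of the Rademacher distribution:
#   sigma(X)^2 = sup_t 2 ln E[exp(tX)] / t^2, with E[exp(tX)] = cosh(t).
import numpy as np

t = np.linspace(1e-4, 10, 100_000)
sigma_sq = np.max(2 * np.log(np.cosh(t)) / t**2)
print("estimated sigma(X)^2:", sigma_sq)   # approaches 1 (the supremum is at t -> 0)
```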
Sometimes it is useful to consider a more restrictive class of subgaussian random variables.
(Strictly subgaussian distribution)

A random variable $X$ is called strictly subgaussian if it is $\sigma$-subgaussian with $\sigma^2 = \mathbb{E}[X^2]$, i.e.

$$
\mathbb{E}[\exp(t X)] \leq \exp\left(\frac{\sigma^2 t^2}{2}\right)
$$

holds true for all $t \in \mathbb{R}$, where $\sigma^2 = \mathbb{E}[X^2]$ is the variance of $X$.

We will denote strictly subgaussian variables by $X \sim \mathrm{SSub}(\sigma^2)$.
(Gaussian distribution)

If $X \sim \mathcal{N}(0, \sigma^2)$, then $X \sim \mathrm{SSub}(\sigma^2)$, since $\mathbb{E}[\exp(t X)] = \exp\left(\frac{\sigma^2 t^2}{2}\right)$ exactly and $\mathbb{E}[X^2] = \sigma^2$.
7.9.1. Characterization
We quickly review Markov’s inequality which will help us establish the results in this subsection.
Let $X$ be a non-negative random variable. Then, for any $a > 0$,

$$
\mathbb{P}(X \geq a) \leq \frac{\mathbb{E}[X]}{a}.
$$
For a centered random variable $X$, the following statements are equivalent:

1. Moment generating function condition: there exists $b > 0$ such that
   $$
   \mathbb{E}[\exp(t X)] \leq \exp\left(\frac{b^2 t^2}{2}\right) \quad \forall t \in \mathbb{R}.
   $$
2. Subgaussian tail estimate: there exists $c > 0$ such that
   $$
   \mathbb{P}(|X| \geq \lambda) \leq 2 \exp\left(-c \lambda^2\right) \quad \forall \lambda > 0.
   $$
3. $\psi_2$-condition: there exists some $a > 0$ such that
   $$
   \mathbb{E}\left[\exp\left(a X^2\right)\right] \leq 2.
   $$
Proof. $(1) \Rightarrow (2)$:

Using Markov's inequality, for any $t > 0$ we have

$$
\mathbb{P}(X \geq \lambda) = \mathbb{P}\left(e^{t X} \geq e^{t \lambda}\right) \leq \frac{\mathbb{E}[\exp(t X)]}{\exp(t \lambda)} \leq \exp\left(\frac{b^2 t^2}{2} - t \lambda\right).
$$

Since this is valid for all $t > 0$, it should be valid for the minimum value of the r.h.s. The minimum value is obtained for $t = \frac{\lambda}{b^2}$. Thus we get

$$
\mathbb{P}(X \geq \lambda) \leq \exp\left(-\frac{\lambda^2}{2 b^2}\right).
$$

Since $X$ is $b$-subgaussian, $-X$ is also $b$-subgaussian. Hence

$$
\mathbb{P}(X \leq -\lambda) \leq \exp\left(-\frac{\lambda^2}{2 b^2}\right).
$$

Thus

$$
\mathbb{P}(|X| \geq \lambda) \leq 2 \exp\left(-\frac{\lambda^2}{2 b^2}\right).
$$

Thus we can choose $c = \frac{1}{2 b^2}$ to complete the proof.
$(2) \Rightarrow (3)$: TODO PROVE THIS

$(3) \Rightarrow (1)$: TODO PROVE THIS
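The tail estimate derived above is easy to check by simulation. The sketch below (illustration only, assuming NumPy) compares empirical tail frequencies of a standard normal variable (which is $1$-subgaussian) with the bound $2\exp\left(-\frac{\lambda^2}{2 b^2}\right)$.

```python
# Monte Carlo check of P(|X| >= lam) <= 2 exp(-lam^2 / (2 b^2)) for a
# 1-subgaussian variable (a standard normal here, so b = 1).
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(2_000_000)
b = 1.0

for lam in [0.5, 1.0, 2.0, 3.0]:
    empirical_tail = np.mean(np.abs(x) >= lam)
    bound = 2 * np.exp(-lam**2 / (2 * b**2))
    print(f"lam = {lam}: P(|X| >= lam) ~ {empirical_tail:.4f} <= {bound:.4f}")
```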
7.9.2. More Properties
We also have the following result on the exponential moment of a subgaussian random variable.
Suppose $X$ is $b$-subgaussian. Then, for any $a$ satisfying $0 \leq a < \frac{1}{2 b^2}$,

$$
\mathbb{E}\left[\exp\left(a X^2\right)\right] \leq \frac{1}{\sqrt{1 - 2 a b^2}}.
$$
Proof. We are given that

$$
\mathbb{E}[\exp(t X)] \leq \exp\left(\frac{b^2 t^2}{2}\right) \quad \forall t \in \mathbb{R}.
$$

Multiplying on both sides with $\exp\left(-\frac{t^2}{4a}\right)$:

$$
\mathbb{E}\left[\exp\left(t X - \frac{t^2}{4a}\right)\right] \leq \exp\left(\left(\frac{b^2}{2} - \frac{1}{4a}\right) t^2\right).
$$

Integrating on both sides w.r.t. $t$ over $(-\infty, \infty)$, using Fubini's theorem on the l.h.s. and the Gaussian integral $\int_{-\infty}^{\infty} e^{-u t^2 + v t}\, dt = \sqrt{\frac{\pi}{u}}\, e^{\frac{v^2}{4u}}$ (valid for $u > 0$) on both sides, we get

$$
\sqrt{4 \pi a}\;\mathbb{E}\left[\exp\left(a X^2\right)\right] \leq \sqrt{\frac{\pi}{\frac{1}{4a} - \frac{b^2}{2}}},
$$

which reduces to:

$$
\mathbb{E}\left[\exp\left(a X^2\right)\right] \leq \frac{1}{\sqrt{1 - 2 a b^2}},
$$

which completes the proof.
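A quick Monte Carlo sanity check (illustration only, assuming NumPy): for a standard normal variable ($b = 1$) the bound is tight, so the empirical value of $\mathbb{E}[\exp(a X^2)]$ should sit essentially at the bound for $a < \frac{1}{2}$.

```python
# Monte Carlo check of E[exp(a X^2)] <= 1 / sqrt(1 - 2 a b^2) for a
# 1-subgaussian variable (standard normal, b = 1) and a < 1 / (2 b^2).
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal(2_000_000)
b = 1.0

for a in [0.05, 0.1, 0.2]:
    empirical = np.exp(a * x**2).mean()
    bound = 1.0 / np.sqrt(1 - 2 * a * b**2)
    print(f"a = {a}: E[exp(a X^2)] ~ {empirical:.4f} <= {bound:.4f}")
```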
7.9.3. Subgaussian Random Vectors
The linearity property of subgaussian r.v.s can be extended to random vectors as well. This is stated more formally in the following result.
Suppose that $X = [X_1, X_2, \dots, X_N]$, where the $X_i$ are independent and each $X_i$ is $c$-subgaussian. Then, for any $\alpha \in \mathbb{R}^N$, the random variable $\langle X, \alpha \rangle = \sum_{i=1}^{N} \alpha_i X_i$ is $c \|\alpha\|_2$-subgaussian.
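The sketch below (illustration only, assuming NumPy) probes this claim for a vector of independent Rademacher entries (each $1$-subgaussian) and a fixed weight vector $\alpha$.

```python
# For a vector of independent Rademacher entries (each 1-subgaussian), the
# linear combination <X, alpha> should be ||alpha||_2-subgaussian.
import numpy as np

rng = np.random.default_rng(5)
N, n_samples = 8, 500_000
alpha = rng.standard_normal(N)
X = rng.choice([-1.0, 1.0], size=(n_samples, N))
y = X @ alpha                                  # samples of <X, alpha>
b = np.linalg.norm(alpha)                      # claimed subgaussian parameter

for t in [0.25, 0.5, 1.0]:
    empirical_mgf = np.exp(t * y).mean()
    bound = np.exp(b**2 * t**2 / 2)
    print(f"t = {t}: E[exp(t<X,alpha>)] ~ {empirical_mgf:.3f} <= {bound:.3f}")
```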
Norm of a subgaussian random vector
Let $X = [X_1, X_2, \dots, X_N]$ be a random vector where each $X_i$ is i.i.d., $c$-subgaussian, with variance $\mathbb{E}[X_i^2] = \sigma^2$. Consider the $\ell_2$ norm $\|X\|_2$. It is a random variable in its own right. It would be useful to understand the average behavior of the norm.

Suppose $N = 1$. Then $\|X\|_2 = |X_1|$. Also $\mathbb{E}\left[\|X\|_2^2\right] = \mathbb{E}\left[X_1^2\right] = \sigma^2$. Thus, by linearity of expectation, it looks like $\mathbb{E}\left[\|X\|_2^2\right]$ should be connected with $N \sigma^2$ in general.

The norm can increase or decrease compared to the average value. A ratio-based measure between the actual value and the average value would be useful.

- What is the probability that the norm increases beyond a given factor?
- What is the probability that the norm reduces beyond a given factor?
These bounds are stated formally in the following theorem.
Suppose that $X = [X_1, X_2, \dots, X_N]$, where the $X_i$ are i.i.d. $c$-subgaussian random variables with $\mathbb{E}[X_i^2] = \sigma^2$. Then

$$
\mathbb{E}\left[\|X\|_2^2\right] = N \sigma^2.
$$

Moreover, for any $\alpha \in (0, 1)$ and for any $\beta \in \left[\frac{c^2}{\sigma^2}, \beta_{\max}\right]$, there exists a constant $\kappa^* \geq 4$ (depending only on $\beta_{\max}$ and the ratio $\frac{\sigma^2}{c^2}$) such that

$$
\mathbb{P}\left(\|X\|_2^2 \leq \alpha N \sigma^2\right) \leq \exp\left(-\frac{N (1 - \alpha)^2}{\kappa^*}\right) \tag{7.4}
$$

and

$$
\mathbb{P}\left(\|X\|_2^2 \geq \beta N \sigma^2\right) \leq \exp\left(-\frac{N (\beta - 1)^2}{\kappa^*}\right). \tag{7.5}
$$

The first equation gives the average value of the square of the norm. The second inequality states the upper bound on the probability that the norm could reduce beyond the factor given by $\alpha$. The third inequality states the upper bound on the probability that the norm could increase beyond the factor given by $\beta$.

Note that if the $X_i$ are strictly subgaussian, then $c^2 = \sigma^2$. Hence $\frac{c^2}{\sigma^2} = 1$, so $\beta$ may take any value in $[1, \beta_{\max}]$ and the concentration can be made arbitrarily tight on both sides.
Proof. Since the $X_i$ are i.i.d. with $\mathbb{E}[X_i^2] = \sigma^2$, we have

$$
\mathbb{E}\left[\|X\|_2^2\right] = \sum_{i=1}^{N} \mathbb{E}\left[X_i^2\right] = N \sigma^2.
$$

This proves the first part.

Now let us look at (7.5). By applying Markov's inequality for any $t > 0$:

$$
\mathbb{P}\left(\|X\|_2^2 \geq \beta N \sigma^2\right)
= \mathbb{P}\left(\exp\left(t \|X\|_2^2\right) \geq \exp\left(t \beta N \sigma^2\right)\right)
\leq \frac{\mathbb{E}\left[\exp\left(t \|X\|_2^2\right)\right]}{\exp\left(t \beta N \sigma^2\right)}.
$$

Since the $X_i$ are independent and $c$-subgaussian, the exponential moment bound above gives, for any $0 < t < \frac{1}{2 c^2}$,

$$
\mathbb{E}\left[\exp\left(t X_i^2\right)\right] \leq \frac{1}{\sqrt{1 - 2 c^2 t}}.
$$

Thus:

$$
\mathbb{E}\left[\exp\left(t \|X\|_2^2\right)\right]
= \prod_{i=1}^{N} \mathbb{E}\left[\exp\left(t X_i^2\right)\right]
\leq \left(1 - 2 c^2 t\right)^{-N/2}.
$$

Putting it back we get:

$$
\mathbb{P}\left(\|X\|_2^2 \geq \beta N \sigma^2\right)
\leq \left(1 - 2 c^2 t\right)^{-N/2} \exp\left(-t \beta N \sigma^2\right).
$$

Since the above is valid for all $t \in \left(0, \frac{1}{2 c^2}\right)$, we can minimize the r.h.s. over $t$. Thus we get the optimum

$$
t = \frac{\beta \sigma^2 - c^2}{2 c^2 \beta \sigma^2}.
$$

Plugging this back we get:

$$
\mathbb{P}\left(\|X\|_2^2 \geq \beta N \sigma^2\right)
\leq \left(\frac{\beta \sigma^2}{c^2} \exp\left(1 - \frac{\beta \sigma^2}{c^2}\right)\right)^{N/2}.
$$
Similarly proceeding for (7.4), i.e. applying Markov's inequality to $\exp\left(-t \|X\|_2^2\right)$ and optimizing over $t$, we get an analogous bound for the lower tail.

We need to simplify these equations. We will do some jugglery now. Write $\gamma = \frac{\beta \sigma^2}{c^2}$, so that the bound above reads $\left(\gamma e^{1 - \gamma}\right)^{N/2}$, and consider the function

$$
f(\gamma) = \gamma - 1 - \ln \gamma - \frac{2 (\gamma - 1)^2}{\kappa^*}.
$$

By differentiating, we can study its monotonicity:

$$
f'(\gamma) = 1 - \frac{1}{\gamma} - \frac{4 (\gamma - 1)}{\kappa^*}
= (\gamma - 1)\left(\frac{1}{\gamma} - \frac{4}{\kappa^*}\right).
$$

Let us have $\kappa^* \geq 4 \gamma_{\max}$, where $\gamma_{\max} = \frac{\beta_{\max} \sigma^2}{c^2} \geq 1$. Clearly $f'(\gamma) \geq 0$ for $\gamma \in [1, \gamma_{\max}]$, so $f$ is increasing on this range and $f(\gamma) \geq f(1) = 0$. Which gives us:

$$
\ln \gamma - (\gamma - 1) \leq -\frac{2 (\gamma - 1)^2}{\kappa^*}.
$$

Hence by exponentiating on both sides we get:

$$
\gamma e^{1 - \gamma} \leq \exp\left(-\frac{2 (\gamma - 1)^2}{\kappa^*}\right).
$$

By slight manipulation we get:

$$
\left(\gamma e^{1 - \gamma}\right)^{N/2} \leq \exp\left(-\frac{N (\gamma - 1)^2}{\kappa^*}\right).
$$

We now choose $\kappa^* = \max\left(4, 4 \gamma_{\max}\right) \geq 4$, which depends only on $\beta_{\max}$ and the ratio $\frac{\sigma^2}{c^2}$. Substituting $\gamma = \frac{\beta \sigma^2}{c^2}$ we get:

$$
\mathbb{P}\left(\|X\|_2^2 \geq \beta N \sigma^2\right)
\leq \exp\left(-\frac{N \left(\frac{\beta \sigma^2}{c^2} - 1\right)^2}{\kappa^*}\right).
$$

Finally, for strictly subgaussian entries we have $c^2 = \sigma^2$, so $\gamma = \beta$ and the r.h.s. is exactly the bound in (7.5); in the general case, the restriction $\beta \geq \frac{c^2}{\sigma^2}$ allows the remaining ratio to be absorbed into $\kappa^*$ so that the bound takes the form (7.5). Thus we get (7.5).

Similarly, by choosing $\gamma = \frac{\alpha \sigma^2}{c^2} \leq 1$, for which the same inequality $\gamma e^{1 - \gamma} \leq \exp\left(-\frac{2 (1 - \gamma)^2}{\kappa^*}\right)$ holds whenever $\kappa^* \geq 4$, and noting that $1 - \frac{\alpha \sigma^2}{c^2} \geq 1 - \alpha$ since $\sigma^2 \leq c^2$, we obtain (7.4).

We can now map $\alpha = 1 - \epsilon$ and $\beta = 1 + \epsilon$ (for strictly subgaussian entries) to obtain the two-sided concentration bound

$$
\mathbb{P}\left(\left| \|X\|_2^2 - N \sigma^2 \right| \geq \epsilon N \sigma^2\right) \leq 2 \exp\left(-\frac{N \epsilon^2}{\kappa^*}\right).
$$
This result tells us that given a vector with entries drawn from a subgaussian
distribution, we can expect the norm of the vector to concentrate around its
expected value: $\|X\|_2^2$ stays close to $N \sigma^2$ with exponentially high probability as $N$ grows.
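The concentration phenomenon is easy to see in simulation. The sketch below (illustration only, assuming NumPy) draws vectors with i.i.d. standard normal entries ($\sigma^2 = 1$) and estimates how often the squared norm deviates from $N \sigma^2$ by a fixed factor as $N$ grows.

```python
# Monte Carlo illustration: the squared norm of a vector with i.i.d. standard
# normal entries (sigma^2 = 1) concentrates around N * sigma^2, and the
# probability of a fixed relative deviation shrinks rapidly as N grows.
import numpy as np

rng = np.random.default_rng(6)
n_trials = 20_000
eps = 0.2                                      # deviation factor (1 +/- eps)

for N in [10, 100, 1000]:
    X = rng.standard_normal((n_trials, N))
    sq_norms = (X**2).sum(axis=1)
    p_high = np.mean(sq_norms >= (1 + eps) * N)
    p_low = np.mean(sq_norms <= (1 - eps) * N)
    print(f"N = {N:4d}: mean ||X||^2 ~ {sq_norms.mean():8.1f}, "
          f"P(increase by 1+eps) ~ {p_high:.4f}, P(decrease by 1-eps) ~ {p_low:.4f}")
```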