21.4. Compressive Sampling Matching Pursuit#
In this section, we look at an algorithm named CoSaMP (Compressive Sampling Matching Pursuit) developed in [61]. This algorithm follows in the tradition of orthogonal matching pursuit while bringing in ideas from a number of recent developments in the field, leading to an algorithm that is much more robust and fast and provides much stronger guarantees.
This algorithm has many impressive features:
It can work with a variety of sensing matrices.
It works with a minimal number of measurements (cf. basis pursuit).
It is robust against the presence of measurement noise (OMP makes no such claim).
It provides optimal error guarantees for every signal (sparse signals, compressible signals, and completely arbitrary signals).
The algorithm is quite efficient in terms of resource (memory, CPU) requirements.
As we move along in this section, we will understand the algorithm and validate all of these claims. Hopefully, through this process we will also gain a good understanding of how to evaluate the quality of a signal recovery algorithm.
21.4.1. Algorithm#
The algorithm itself is presented in Algorithm 21.4.
(CoSaMP for iterative sparse signal recovery)

Inputs:

- Sensing matrix $\Phi \in \mathbb{R}^{M \times N}$
- Measurement vector: $y \in \mathbb{R}^M$ where $y = \Phi x + e$
- Sparsity level: $K$

Outputs:

- $\hat{x}$ : a $K$-sparse approximation of the signal $x \in \mathbb{R}^N$

Initialization:

1. $x^0 \leftarrow 0$. # Initial approximation
2. $r \leftarrow y$. # Residual
3. $k \leftarrow 0$. # Iteration counter

Algorithm:

1. If halting criteria is satisfied: break.
2. $k \leftarrow k + 1$.
3. $p \leftarrow \Phi^T r$. # Form signal proxy
4. $\Omega \leftarrow \operatorname{supp}(p|_{2K})$. # Identify large components
5. $T \leftarrow \Omega \cup \operatorname{supp}(x^{k-1})$. # Merge supports
6. $b_T \leftarrow \Phi_T^{\dagger} y$; $b_{T^c} \leftarrow 0$. # Estimation by least-squares
7. $x^k \leftarrow b|_K$. # Prune to obtain next approximation
8. $r \leftarrow y - \Phi x^k$. # Update residual
9. Go to step 1.
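To make the steps above concrete, here is a minimal NumPy sketch of the iteration under the notation of Algorithm 21.4. The function name `cosamp`, its parameters (`max_iter`, `tol`) and the random test case at the end are illustrative assumptions rather than part of the algorithm statement.

```python
# A minimal, illustrative CoSaMP sketch (not an optimized implementation).
import numpy as np


def cosamp(Phi, y, K, max_iter=50, tol=1e-6):
    """Return a K-sparse approximation of x from y = Phi @ x + e."""
    M, N = Phi.shape
    x_hat = np.zeros(N)                    # initial approximation x^0 = 0
    r = y.copy()                           # initial residual r = y
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:       # a simple halting criterion
            break
        p = Phi.T @ r                      # form signal proxy
        omega = np.argsort(np.abs(p))[-2 * K:]           # largest 2K components
        T = np.union1d(omega, np.flatnonzero(x_hat))     # merge supports
        sol, *_ = np.linalg.lstsq(Phi[:, T], y, rcond=None)  # least squares on T
        b = np.zeros(N)
        b[T] = sol
        keep = np.argsort(np.abs(b))[-K:]  # prune to the K largest entries
        x_hat = np.zeros(N)
        x_hat[keep] = b[keep]
        r = y - Phi @ x_hat                # update residual
    return x_hat


# Tiny usage example with a random Gaussian sensing matrix (illustrative sizes).
rng = np.random.default_rng(0)
N, M, K = 256, 128, 8
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
y = Phi @ x
print(np.linalg.norm(x - cosamp(Phi, y, K)))   # typically near zero for these sizes
```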
Let us set out the notation before proceeding further. Most of the things are as usual, with a few minor updates.

- $x \in \mathbb{R}^N$ represents the signal which is to be estimated through the algorithm. $x$ is unknown to us within the algorithm.
- $N$ is the dimension of the ambient signal space, $K$ is the sparsity level of the approximation of $x$ that we are estimating, and $M$ is the dimension of the measurement space (number of measurements).
- We note that $x$ itself may not be $K$-sparse. Our algorithm is designed to estimate a $K$-sparse approximation of $x$.
- $\Phi \in \mathbb{R}^{M \times N}$ represents the sensing matrix. It is known to the algorithm.
- $y = \Phi x + e$ represents the measurement vector belonging to $\mathbb{R}^M$. This is known to the algorithm.
- $e \in \mathbb{R}^M$ represents the measurement noise which is unknown to us within the algorithm.
- $k$ represents the iteration (or step) counter within the algorithm.
- $\hat{x}$ represents our estimate of $x$. It is updated iteratively. $x^k$ represents the estimate of $x$ at the end of the $k$-th iteration. We start with $x^0 = 0$ and update it in each iteration.
- $p$ represents a proxy for $x$. We will explain it shortly.
- $\Omega$, $T$, etc. represent index sets (subsets of $\{1, 2, \dots, N\}$).
- For any $v \in \mathbb{R}^N$ and any index set $\Lambda$, $v_{\Lambda}$ can mean either of the two things:
  - A vector in $\mathbb{R}^{|\Lambda|}$ consisting of only those entries in $v$ which are indexed by $\Lambda$.
  - A vector in $\mathbb{R}^N$ whose entries indexed by $\Lambda$ are the same as those of $v$ while the entries indexed by $\Lambda^c$ are all set to zero:

    $$
    v_{\Lambda}(i) = \begin{cases} v(i) & \text{if } i \in \Lambda, \\ 0 & \text{otherwise.} \end{cases}
    \tag{21.19}
    $$

- For any $v \in \mathbb{R}^N$ and any integer $1 \leq K \leq N$, $v|_K$ means a vector in $\mathbb{R}^N$ which consists of the $K$ largest (in magnitude) entries of $v$ (at the corresponding indices) while the rest of the entries are $0$.
- With an index set $\Lambda$, $\Phi_{\Lambda}$ means an $M \times |\Lambda|$ matrix consisting of the columns of $\Phi$ indexed by $\Lambda$.
- $r = y - \Phi x^k$ represents the difference between the actual measurement vector $y$ and the estimated measurement vector $\Phi x^k$.
- The ideal estimate of $x$ would be $x$ itself. But that won't be possible to achieve in general. $x|_K$ is the best $K$-sparse approximation of $x$ which can be estimated by our algorithm (Definition 18.13).
(Clarifying the notation in CoSaMP)
Let us consider $x = (0, 1, 0, 0, -2, 0, 0, 0) \in \mathbb{R}^8$.

Then $\operatorname{supp}(x) = \{2, 5\}$.

Also $x|_2 = x$ since $x$ happens to be $2$-sparse.

For $\Lambda = \{2, 5\}$, we have

$$
x_{\Lambda} = (1, -2) \in \mathbb{R}^2
$$

or

$$
x_{\Lambda} = (0, 1, 0, 0, -2, 0, 0, 0) \in \mathbb{R}^8
$$

in different contexts.
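The following small NumPy snippet mirrors this example; the specific vector and index set are arbitrary illustrative choices (note that NumPy indices are 0-based while the text uses 1-based indices).

```python
import numpy as np

x = np.array([0.0, 1.0, 0.0, 0.0, -2.0, 0.0, 0.0, 0.0])   # the example signal
Lam = np.array([1, 4])             # index set {2, 5} of the text, 0-based here

x_sub = x[Lam]                     # reading 1: a vector in R^{|Lam|}
x_pad = np.zeros_like(x)           # reading 2: a vector in R^N, zero outside Lam
x_pad[Lam] = x[Lam]

K = 2
idx = np.argsort(np.abs(x))[-K:]   # indices of the K largest-magnitude entries
x_K = np.zeros_like(x)             # best K-term approximation x|_K
x_K[idx] = x[idx]
print(x_sub, x_pad, x_K, sep="\n")
```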
21.4.1.1. The Signal Proxy#
As we have learned by now, the most challenging part in a sparse signal recovery algorithm is to identify the support of the signal being recovered.
- If $x$ is $K$-sparse and $\Phi$ satisfies the RIP (see Restricted Isometry Property) of order $2K$ with a small restricted isometry constant, then it can be argued that $\Phi^T \Phi x$ can serve as a proxy for the signal $x$.
- In particular, the largest $K$ entries of $\Phi^T \Phi x$ point towards the largest $K$ entries of $x$.
- Although we don't know $x$ inside the algorithm, we do have access to $y = \Phi x$ (assuming the error $e$ to be $0$); the proxy can therefore be easily estimated by computing $p = \Phi^T y$.
This is the key idea in CoSaMP.
- Rather than identifying just one new index in the support of $x$ per iteration, CoSaMP tries to identify the whole support at once by picking up the indices of the $2K$ largest entries of the proxy.
- It then solves a least squares problem over the columns of $\Phi$ indexed by this support set.
- It keeps only the $K$ largest entries from the least squares solution as an estimate of $x$.
- Of course there is no guarantee that in a single attempt the support of $x$ will be recovered completely.
- Hence the residual between the actual measurement vector $y$ and the estimated measurement vector $\Phi x^k$ is computed and used to identify the remaining indices of the support iteratively; a small numerical illustration of the proxy idea follows below.
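A quick numerical illustration of the proxy idea; the sizes and the Gaussian sensing matrix are arbitrary assumptions, and the snippet only shows typical behavior for one random instance, not a proof.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, K = 400, 150, 6
Phi = rng.standard_normal((M, N)) / np.sqrt(M)     # random Gaussian sensing matrix
x = np.zeros(N)
support = rng.choice(N, K, replace=False)
x[support] = rng.standard_normal(K) + 2 * np.sign(rng.standard_normal(K))

p = Phi.T @ (Phi @ x)                    # proxy computed from y = Phi x (no noise)
top = np.argsort(np.abs(p))[-2 * K:]     # indices of the 2K largest proxy entries
# Typically all (or almost all) of the true support appears among these indices.
print(sorted(support.tolist()))
print(sorted(set(top.tolist()) & set(support.tolist())))
```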
21.4.1.2. Core of Algorithm#
There are two things which are estimated in each iteration of the algorithm:

- the $K$-sparse estimate of $x$ at the $k$-th step: $x^k$;
- the residual at the $k$-th step: $r = y - \Phi x^k$.

Residual and proxy:

- We start with the trivial estimate $x^0 = 0$ and improve it in each iteration.
- The initial residual $r = y - \Phi x^0$ is nothing but the measurement vector $y$.
- As explained before, $r$ is the difference between the actual measurement vector $y$ and the estimated measurement vector $\Phi x^{k-1}$.
- This $r$ is used for computing the signal proxy at the $k$-th step.
- Concretely, assuming $y = \Phi x + e$, we have

  $$
  r = y - \Phi x^{k-1} = \Phi (x - x^{k-1}) + e
  $$

  and hence

  $$
  p = \Phi^T r = \Phi^T \big( \Phi (x - x^{k-1}) + e \big).
  $$

- Thus $p$ is the proxy of the difference between the original signal $x$ and the estimated signal $x^{k-1}$.
During each iteration, the algorithm performs the following tasks:

- [Identification] The algorithm forms a proxy $p = \Phi^T r$ of the residual and locates the $2K$ largest (in magnitude) entries of the proxy.
- [Support merger] The set $\Omega$ of newly identified indices is merged with the set of indices that appear in the current approximation $x^{k-1}$ (i.e. $T = \Omega \cup \operatorname{supp}(x^{k-1})$).
- [Estimation] The algorithm solves a least squares problem to approximate the signal on the merged set of indices $T$.
- [Pruning] The algorithm produces a new approximation $x^k$ by retaining only the $K$ largest entries in the least squares solution.
- [Residual update] Finally, the new residual $r = y - \Phi x^k$ between the original measurement vector and the estimated measurement vector is computed.
The steps are repeated until the halting criterion is reached.
We will discuss the halting criteria in detail later.
A possible halting criterion is to stop when the norm of the residual $\| r \|_2$ falls below a given threshold.
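As a sketch, such a halting test could be coded as follows; the function and parameter names (`should_halt`, `max_iter`, `rel_tol`) are assumed, illustrative choices.

```python
import numpy as np

def should_halt(r, y, k, max_iter=100, rel_tol=1e-6):
    # Stop when the residual is tiny relative to the measurements,
    # or when the iteration budget is exhausted.
    return k >= max_iter or np.linalg.norm(r) <= rel_tol * np.linalg.norm(y)
```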
21.4.2. CoSaMP Analysis Sparse Case#
In this subsection, we will carry out a detailed theoretical analysis of the CoSaMP algorithm for sparse signal recovery.

We will make the following assumptions in the analysis:

- The sparsity level $K$ is fixed and known in advance.
- The signal $x$ is $K$-sparse (i.e. $| \operatorname{supp}(x) | \leq K$).
- The sensing matrix $\Phi$ satisfies the RIP of order $4K$ with $\delta_{4K} \leq 0.1$.
- The measurement error $e$ is arbitrary.
For each iteration, we need to define one more quantity: the recovery error

$$
d^k = x - x^k.
$$

- The quantity $d^k$ measures how far the estimate at the end of the $k$-th iteration is from the original signal.
- Under ideal recovery, $d^k$ should become $0$ as $k$ increases.
- Since we don't know $x$ within the algorithm, we cannot see $d^k$ directly.
- We do have the following equation relating the residual to the recovery error:

  $$
  r = y - \Phi x^k = \Phi (x - x^k) + e = \Phi d^k + e.
  $$

We will show that in each iteration, CoSaMP reduces the recovery error by a constant factor while adding a small multiple of the measurement noise $e$.
As a result, when the recovery error is large compared to the measurement noise, the algorithm makes substantial progress in each step.
The algorithm stops making progress when the recovery error is of the order of the measurement noise.
We will consider some iteration $k \geq 1$ in progress.

At the beginning of the iteration we have the estimate $x^{k-1}$ and the residual $r = y - \Phi x^{k-1}$.

The quantities of interest are $x^{k-1}$, $r$ and $d^{k-1} = x - x^{k-1}$, out of which $d^{k-1}$ is unknowable.

In order to simplify the equations below, we will drop the iteration superscripts and write $x'$ for the current approximation $x^{k-1}$ and $d = x - x'$ for the corresponding recovery error; the quantities used in the analysis are summarized in the following table.
| Term | Description | Details |
|---|---|---|
| $\Phi$ | Sensing matrix | Satisfies RIP of order $4K$ with $\delta_{4K} \leq 0.1$ |
| $e$ | Measurement noise | Arbitrary; unknown to the algorithm |
| $y$ | Measurement vector | $y = \Phi x + e$ |
| $x'$ | Approximation at the beginning of the iteration | $x' = x^{k-1}$; $K$-sparse |
| $r$ | Residual at the beginning of the iteration | $r = y - \Phi x' = \Phi d + e$ |
| $d$ | Recovery error at the beginning of the iteration | $d = x - x'$; unknowable within the algorithm |
| $p$ | The proxy for the recovery error | $p = \Phi^T r$ |
| $\Omega$ | The index set of the largest $2K$ entries (in magnitude) of the proxy | $\Omega = \operatorname{supp}(p\vert_{2K})$; $\lvert \Omega \rvert = 2K$ |
21.4.2.1. Identification#
We start our analysis with the identification step in the main loop of CoSaMP (Algorithm 21.4).
- We compute $p = \Phi^T r$ and consider $\Omega$ as the index set of the $2K$ largest (in magnitude) entries in $p$.
- We will show that most of the energy in $d$ (the recovery error) is concentrated in the entries indexed by $\Omega$.
- In other words, we will be looking for a bound of the form

  $$
  \| d_{\Omega^c} \|_2 \leq C_1 \| d \|_2 + C_2 \| e \|_2
  $$

  with $C_1$ well below $1$.
- Since $x$ is $K$-sparse and $x'$ is $K$-sparse, hence $d = x - x'$ is $2K$-sparse.
- In the first iteration, where $x' = x^0 = 0$, the support of $d$ is the same as the support of $x$.
- In later iterations it may include more indices.
Let $\Lambda = \operatorname{supp}(d)$.

- We remind that $\Omega$ is known to us while $\Lambda$ is unknown to us.
- The ideal case would be when $\Lambda \subseteq \Omega$.
- Then we would have recovered the whole of the support of $d$; recovering $d$ (and hence $x$) would therefore be easier.
- Let us examine what the differences between the two sets are.
- We have $|\Lambda| \leq 2K$ and $|\Omega| = 2K$.
- Moreover, $\Omega$ indexes the $2K$ largest (in magnitude) entries in $p$.
- Thus we have (due to Theorem 18.18)

  $$
  \| p_{\Lambda} \|_2 \leq \| p_{\Omega} \|_2
  \tag{21.20}
  $$

  where $p_{\Lambda}$ and $p_{\Omega}$ are obtained from $p$ as per Definition 18.11.
- Squaring (21.20) on both sides and expanding we get

  $$
  \sum_{i \in \Lambda} | p_i |^2 \leq \sum_{i \in \Omega} | p_i |^2.
  $$

- Now we do hope that some of the entries are common to both sides. Those entries are indexed by $\Lambda \cap \Omega$.
- The remaining entries on the L.H.S. are indexed by $\Lambda \setminus \Omega$ and those on the R.H.S. by $\Omega \setminus \Lambda$.
- Hence we have

  $$
  \| p_{\Lambda \setminus \Omega} \|_2 \leq \| p_{\Omega \setminus \Lambda} \|_2.
  \tag{21.21}
  $$

- For both $\Lambda \setminus \Omega$ and $\Omega \setminus \Lambda$ we have $| \Lambda \setminus \Omega | \leq 2K$ and $| \Omega \setminus \Lambda | \leq 2K$.
- The worst case happens when both sets are totally disjoint.
Let us first examine the R.H.S. and find an upper bound for it.

At this juncture, it is worthwhile to scan the various results in Restricted Isometry Property since we are going to need many of them in the following steps.

- Let us look at the term $p_{\Omega \setminus \Lambda}$.
- We have $p_{\Omega \setminus \Lambda} = \Phi_{\Omega \setminus \Lambda}^T r = \Phi_{\Omega \setminus \Lambda}^T \Phi d + \Phi_{\Omega \setminus \Lambda}^T e$.
- Further, $d$ is $2K$-sparse and sits over $\Lambda$ which is disjoint with $\Omega \setminus \Lambda$.
- Together $| \Lambda \cup (\Omega \setminus \Lambda) | \leq 4K$.
- A straightforward application of Corollary 18.5 gives us

  $$
  \| \Phi_{\Omega \setminus \Lambda}^T \Phi d \|_2 \leq \delta_{4K} \| d \|_2.
  $$

- Similarly, using Theorem 18.57 we get

  $$
  \| \Phi_{\Omega \setminus \Lambda}^T e \|_2 \leq \sqrt{1 + \delta_{2K}} \, \| e \|_2.
  $$

- Combining the two we get

  $$
  \| p_{\Omega \setminus \Lambda} \|_2 \leq \delta_{4K} \| d \|_2 + \sqrt{1 + \delta_{2K}} \, \| e \|_2.
  \tag{21.22}
  $$

We now look at the L.H.S. and find a lower bound for it.

- We will split $d$ as

  $$
  d = d_{\Lambda \setminus \Omega} + d_{\Lambda \cap \Omega}.
  $$

- Further, we will use a form of the triangle inequality as

  $$
  \| a + b + c \|_2 \geq \| a \|_2 - \| b \|_2 - \| c \|_2.
  $$

- We expand the L.H.S.:

  $$
  p_{\Lambda \setminus \Omega}
  = \Phi_{\Lambda \setminus \Omega}^T (\Phi d + e)
  = \Phi_{\Lambda \setminus \Omega}^T \Phi d_{\Lambda \setminus \Omega}
  + \Phi_{\Lambda \setminus \Omega}^T \Phi d_{\Lambda \cap \Omega}
  + \Phi_{\Lambda \setminus \Omega}^T e.
  $$

- As before,

  $$
  \| p_{\Lambda \setminus \Omega} \|_2
  \geq \| \Phi_{\Lambda \setminus \Omega}^T \Phi d_{\Lambda \setminus \Omega} \|_2
  - \| \Phi_{\Lambda \setminus \Omega}^T \Phi d_{\Lambda \cap \Omega} \|_2
  - \| \Phi_{\Lambda \setminus \Omega}^T e \|_2.
  $$

- We can use the lower bound from Theorem 18.59 to give us

  $$
  \| \Phi_{\Lambda \setminus \Omega}^T \Phi d_{\Lambda \setminus \Omega} \|_2 \geq (1 - \delta_{2K}) \| d_{\Lambda \setminus \Omega} \|_2
  $$

  since $| \Lambda \setminus \Omega | \leq 2K$.
- For the other two terms we need to find their upper bounds since they appear with a negative sign.
- Applying Corollary 18.5 we get

  $$
  \| \Phi_{\Lambda \setminus \Omega}^T \Phi d_{\Lambda \cap \Omega} \|_2 \leq \delta_{4K} \| d_{\Lambda \cap \Omega} \|_2 \leq \delta_{4K} \| d \|_2.
  $$

- Again using Theorem 18.57 we get

  $$
  \| \Phi_{\Lambda \setminus \Omega}^T e \|_2 \leq \sqrt{1 + \delta_{2K}} \, \| e \|_2.
  $$

- Putting the three bounds together, we get a lower bound for the L.H.S.:

  $$
  \| p_{\Lambda \setminus \Omega} \|_2 \geq (1 - \delta_{2K}) \| d_{\Lambda \setminus \Omega} \|_2 - \delta_{4K} \| d \|_2 - \sqrt{1 + \delta_{2K}} \, \| e \|_2.
  \tag{21.23}
  $$

Finally, plugging the upper bound from (21.22) and the lower bound from (21.23) into (21.21), we get

$$
(1 - \delta_{2K}) \| d_{\Lambda \setminus \Omega} \|_2 - \delta_{4K} \| d \|_2 - \sqrt{1 + \delta_{2K}} \, \| e \|_2
\leq \delta_{4K} \| d \|_2 + \sqrt{1 + \delta_{2K}} \, \| e \|_2.
$$
- This result is nice but there is a small problem. We would like to get rid of $d_{\Lambda \setminus \Omega}$ from the L.H.S. and get $d_{\Omega^c}$ instead.
- Recall that

  $$
  d = d_{\Lambda \setminus \Omega} + d_{\Lambda \cap \Omega}
  $$

  where $\Lambda \setminus \Omega$ and $\Lambda \cap \Omega$ are disjoint.
- Thus $d_{\Omega^c} = d_{\Lambda \setminus \Omega} + d_{\Lambda^c \cap \Omega^c}$.
- Since $d$ is supported on $\Lambda$ (recall that $\Lambda = \operatorname{supp}(d)$), hence $d_{\Lambda^c \cap \Omega^c} = 0$.
- As a result, $\| d_{\Omega^c} \|_2 = \| d_{\Lambda \setminus \Omega} \|_2$.
- This gives us

  $$
  \| d_{\Omega^c} \|_2 \leq \frac{2 \delta_{4K} \| d \|_2 + 2 \sqrt{1 + \delta_{2K}} \, \| e \|_2}{1 - \delta_{2K}}.
  \tag{21.24}
  $$

- Let us simplify this equation by using $\delta_{2K} \leq \delta_{4K} \leq 0.1$ as assumed at the beginning of the analysis.
- This gives us

  $$
  \| d_{\Omega^c} \|_2 \leq \frac{2 \times 0.1 \, \| d \|_2 + 2 \sqrt{1.1} \, \| e \|_2}{1 - 0.1}.
  $$

- Thus we get

  $$
  \| d_{\Omega^c} \|_2 \leq \frac{0.2 \, \| d \|_2 + 2.098 \, \| e \|_2}{0.9}.
  $$

- Simplifying,

  $$
  \| d_{\Omega^c} \|_2 \leq 0.2223 \| d \|_2 + 2.34 \| e \|_2.
  \tag{21.25}
  $$

- This result tells us that in the absence of measurement noise, at least 78% of the energy in the recovery error is indeed concentrated in the entries indexed by $\Omega$ (since then $\| d_{\Omega} \|_2 \geq \| d \|_2 - \| d_{\Omega^c} \|_2 \geq 0.7777 \| d \|_2$).
- Moreover, if the measurement noise is small compared to the size of the recovery error, then this concentration continues to hold.
- Thus, even if we don't know $\Lambda = \operatorname{supp}(d)$ directly, $\Omega$ serves as a close approximation for it.
We summarize the analysis for the identification step in the following lemma.

Let $d = x - x'$ be the recovery error and let $\Omega$ be the index set of the $2K$ largest (in magnitude) entries of the proxy $p = \Phi^T r$. Then

$$
\| d_{\Omega^c} \|_2 \leq 0.2223 \| d \|_2 + 2.34 \| e \|_2.
$$

In other words, most of the energy in the recovery error $d$ is concentrated in the entries indexed by $\Omega$.
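A quick empirical illustration of this concentration is given below, for a single random instance with a Gaussian sensing matrix; the sizes and noise level are assumptions, and the snippet does not verify the constants of the lemma.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, K = 400, 160, 5
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
x_prev = np.zeros(N)                   # current K-sparse estimate (here x^0 = 0)
e = 0.01 * rng.standard_normal(M)      # small measurement noise

d = x - x_prev                         # recovery error
p = Phi.T @ (Phi @ d + e)              # proxy of the recovery error
Omega = np.argsort(np.abs(p))[-2 * K:]
outside = np.ones(N, dtype=bool)
outside[Omega] = False
# Fraction of the recovery error norm missed by Omega; typically well below 1.
print(np.linalg.norm(d[outside]) / np.linalg.norm(d))
```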
21.4.2.2. Support Merger#
The next step in a CoSaMP iteration is support merger. Recalling from Algorithm 21.4, the step involves computing

$$
T = \Omega \cup \operatorname{supp}(x'),
$$

i.e. we merge the support of the current signal estimate $x'$ with the newly identified index set $\Omega$.

In the previous lemma we were able to show how well the recovery error $d$ is concentrated over the index set $\Omega$.

- We recall that $d = x - x'$ and, since $x'$ is a $K$-sparse estimate of $x$, hence $\operatorname{supp}(d) \subseteq \operatorname{supp}(x) \cup \operatorname{supp}(x')$.
- Thus $| \operatorname{supp}(d) | \leq 2K$.
- Clearly $| T | \leq | \Omega | + | \operatorname{supp}(x') | \leq 2K + K = 3K$.
- Further, since $\operatorname{supp}(x') \subseteq T$, hence $x_{T^c} = (x - x')_{T^c} = d_{T^c}$.
- Thus $\| x_{T^c} \|_2 = \| d_{T^c} \|_2$.
- Since $\Omega \subseteq T$, hence $T^c \subseteq \Omega^c$.
- Combining these facts we can write

  $$
  \| x_{T^c} \|_2 = \| d_{T^c} \|_2 \leq \| d_{\Omega^c} \|_2.
  $$
We summarize this analysis in the following lemma.

Let $T = \Omega \cup \operatorname{supp}(x')$. Then

$$
\| x_{T^c} \|_2 \leq \| d_{\Omega^c} \|_2.
$$

Note that this inequality says nothing about how large $\| d_{\Omega^c} \|_2$ itself is; it simply carries the concentration established in the identification step over to the merged support $T$.
21.4.2.3. Estimation#
The next step is a least squares estimate of $x$ over the columns of $\Phi$ indexed by $T$:

$$
b_T = \Phi_T^{\dagger} y.
$$

- We also set $b_{T^c} = 0$.
- We have $b_T = \Phi_T^{\dagger} y = (\Phi_T^T \Phi_T)^{-1} \Phi_T^T y$.
- Since $| T | \leq 3K$, $b$ is a $3K$-sparse approximation of $x$.
- What we need here is a bound over the approximation error $\| x - b \|_2$.
- We have already obtained a bound on $\| x_{T^c} \|_2$.
- If an upper bound on $\| x - b \|_2$ depends on $\| x_{T^c} \|_2$ and the measurement noise $\| e \|_2$, that would be quite reasonable.
- We start with splitting $x - b$ over $T$ and $T^c$.
- Since $b_{T^c} = 0$, hence $(x - b)_{T^c} = x_{T^c}$.
- This gives us

  $$
  \| x - b \|_2 \leq \| x_T - b_T \|_2 + \| x_{T^c} \|_2.
  $$

- We will now expand the term $b_T = \Phi_T^{\dagger} y$.
- Now, since $\Phi_T$ is full column rank (since $\Phi$ satisfies RIP of order $4K$ and $| T | \leq 3K$), hence $\Phi_T^{\dagger} \Phi_T = I$.
- Also

  $$
  y = \Phi x + e = \Phi_T x_T + \Phi_{T^c} x_{T^c} + e
  $$

  (since the columns of $\Phi$ indexed by both $T$ and $T^c$ are involved).
- This helps us cancel $x_T$:

  $$
  b_T = \Phi_T^{\dagger} y = x_T + \Phi_T^{\dagger} \Phi_{T^c} x_{T^c} + \Phi_T^{\dagger} e.
  $$

- Thus

  $$
  \| x_T - b_T \|_2 \leq \| \Phi_T^{\dagger} \Phi_{T^c} x_{T^c} \|_2 + \| \Phi_T^{\dagger} e \|_2.
  $$

Let us look at the terms on the R.H.S. one by one.

- Let $z = \Phi_T^{\dagger} \Phi_{T^c} x_{T^c} = (\Phi_T^T \Phi_T)^{-1} \Phi_T^T \Phi_{T^c} x_{T^c}$.
- Then $\Phi_T^T \Phi_T z = \Phi_T^T \Phi_{T^c} x_{T^c}$.
- This perfectly fits Theorem 18.59 with $| T | \leq 3K$, giving us

  $$
  \| \Phi_T^T \Phi_T z \|_2 \geq (1 - \delta_{3K}) \| z \|_2.
  $$

- Further, applying Corollary 18.5 we get

  $$
  \| \Phi_T^T \Phi_{T^c} x_{T^c} \|_2 \leq \delta_{4K} \| x_{T^c} \|_2
  $$

  since $| T \cup \operatorname{supp}(x_{T^c}) | \leq 3K + K = 4K$.
- Together, these two bounds give

  $$
  \| \Phi_T^{\dagger} \Phi_{T^c} x_{T^c} \|_2 = \| z \|_2 \leq \frac{\delta_{4K}}{1 - \delta_{3K}} \| x_{T^c} \|_2.
  $$

- For the measurement noise term, applying Theorem 18.58 we get

  $$
  \| \Phi_T^{\dagger} e \|_2 \leq \frac{\| e \|_2}{\sqrt{1 - \delta_{3K}}}.
  $$

- Combining the above inequalities we get

  $$
  \| x - b \|_2 \leq \left( 1 + \frac{\delta_{4K}}{1 - \delta_{3K}} \right) \| x_{T^c} \|_2 + \frac{\| e \|_2}{\sqrt{1 - \delta_{3K}}}.
  $$

- Recalling our assumption that $\delta_{3K} \leq \delta_{4K} \leq 0.1$, we can simplify the constants to get

  $$
  \| x - b \|_2 \leq 1.112 \, \| x_{T^c} \|_2 + 1.06 \, \| e \|_2.
  $$
We summarize the analysis in this step in the following lemma.

Let $T$ be an index set with $| T | \leq 3K$, and let $b$ be the least squares estimate given by $b_T = \Phi_T^{\dagger} y$ and $b_{T^c} = 0$.

If $\Phi$ satisfies RIP of order $4K$ with $\delta_{4K} \leq 0.1$, then

$$
\| x - b \|_2 \leq 1.112 \, \| x_{T^c} \|_2 + 1.06 \, \| e \|_2.
$$

Note that the lemma doesn't make any assumptions about
how the index set $T$ was obtained; the bound holds for any index set containing at most $3K$ indices.
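A sketch of this estimation step in NumPy is given below; the helper name `estimate_on_support` is an illustrative assumption.

```python
import numpy as np

def estimate_on_support(Phi, y, T):
    """Least squares estimate b with b_T = pinv(Phi_T) y and b_{T^c} = 0."""
    b = np.zeros(Phi.shape[1])
    sol, *_ = np.linalg.lstsq(Phi[:, T], y, rcond=None)   # solve over columns in T
    b[T] = sol
    return b
```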
21.4.2.4. Pruning#
The last step in the main loop of CoSaMP (Algorithm 21.4) is pruning.
We compute

$$
x^k = b|_K
$$

as the next estimate of $x$.

- We note that both $x$ and $b|_K$ can be regarded as $K$-term approximations of $b$.
- As established in Theorem 18.18, $b|_K$ is the best $K$-term approximation of $b$. Thus

  $$
  \| b - b|_K \|_2 \leq \| b - x \|_2.
  $$

- Now

  $$
  \| x - b|_K \|_2 \leq \| x - b \|_2 + \| b - b|_K \|_2 \leq 2 \| x - b \|_2.
  $$

- This helps us establish that although the $3K$-sparse approximation $b$ is closer to $x$ compared to $b|_K$, $b|_K$ is also not too bad an approximation of $x$ while using only $K$ entries at most.
We summarize it in the following lemma.

The pruned estimate $x^k = b|_K$ satisfies

$$
\| x - x^k \|_2 \leq 2 \| x - b \|_2.
$$
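The pruning step itself is a simple hard-thresholding operation; a small sketch (with an assumed helper name) follows.

```python
import numpy as np

def prune(b, K):
    """Keep only the K largest-magnitude entries of b (the operator b -> b|_K)."""
    x_next = np.zeros_like(b)
    keep = np.argpartition(np.abs(b), -K)[-K:]   # indices of the K largest entries
    x_next[keep] = b[keep]
    return x_next
```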
21.4.2.5. The CoSaMP Iteration Invariant#
Having analyzed each of the steps in the main loop of the CoSaMP algorithm, it is time for us to combine the analysis. Concretely, we wish to establish how much progress CoSaMP makes in each iteration. The following theorem provides an upper bound on how the recovery error changes from one iteration to the next. This theorem is known as the iteration invariant for the sparse case.
(CoSaMP iteration invariant for exact sparse recovery)
Assume that $x$ is a $K$-sparse signal and that $\Phi$ satisfies RIP of order $4K$ with $\delta_{4K} \leq 0.1$. Then for every iteration $k \geq 1$ of CoSaMP,

$$
\| x - x^k \|_2 \leq 0.5 \, \| x - x^{k-1} \|_2 + 7.5 \, \| e \|_2.
$$

In particular,

$$
\| x - x^k \|_2 \leq 2^{-k} \| x \|_2 + 15 \, \| e \|_2.
$$
This theorem helps us establish that if measurement noise is small, then the algorithm makes substantial progress in each iteration. The proof makes use of the lemmas developed above.
Proof. We run the proof in backtracking mode, i.e. from the last step of an iteration towards its first step.

We start from the pruning step. From Lemma 21.5 we have

$$
\| x - x^k \|_2 = \| x - b|_K \|_2 \leq 2 \| x - b \|_2.
$$

Applying Lemma 21.4 for the least squares estimation step gives us

$$
\| x - x^k \|_2 \leq 2 \left( 1.112 \, \| x_{T^c} \|_2 + 1.06 \, \| e \|_2 \right) = 2.224 \, \| x_{T^c} \|_2 + 2.12 \, \| e \|_2.
$$

Lemma 21.3 (support merger) tells us that

$$
\| x_{T^c} \|_2 \leq \| d_{\Omega^c} \|_2.
$$

This gives us

$$
\| x - x^k \|_2 \leq 2.224 \, \| d_{\Omega^c} \|_2 + 2.12 \, \| e \|_2.
$$

From the identification step we have Lemma 21.2:

$$
\| d_{\Omega^c} \|_2 \leq 0.2223 \| d \|_2 + 2.34 \| e \|_2.
$$

This gives us (recalling that $d = x - x^{k-1}$)

$$
\| x - x^k \|_2 \leq 2.224 \left( 0.2223 \, \| x - x^{k-1} \|_2 + 2.34 \, \| e \|_2 \right) + 2.12 \, \| e \|_2
\leq 0.5 \, \| x - x^{k-1} \|_2 + 7.5 \, \| e \|_2.
$$

The constants have been rounded up slightly to make them look better.

For the 2nd result in this theorem, we add up the error at each stage as

$$
\| x - x^k \|_2 \leq 0.5^k \| x - x^0 \|_2 + 7.5 \left( 1 + 0.5 + 0.5^2 + \dots \right) \| e \|_2
\leq 2^{-k} \| x - x^0 \|_2 + 15 \, \| e \|_2.
$$

At $k = 0$ we have $x^0 = 0$, so $\| x - x^0 \|_2 = \| x \|_2$.

This gives us the result

$$
\| x - x^k \|_2 \leq 2^{-k} \| x \|_2 + 15 \, \| e \|_2.
$$
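A tiny numeric illustration of the second bound is given below: the sequence $2^{-k}\|x\|_2 + 15\|e\|_2$ halves at every step until it flattens at the noise floor. The norm values used are arbitrary example numbers.

```python
import numpy as np

x_norm, e_norm = 10.0, 0.01          # assumed example values of ||x||_2 and ||e||_2
for k in range(12):
    bound = 2.0 ** (-k) * x_norm + 15 * e_norm
    print(f"k = {k:2d}   error bound <= {bound:.4f}")
```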
21.4.3. CoSaMP Analysis General Case#
Having completed the analysis for the sparse case (where the signal $x$ is assumed to be $K$-sparse), we now analyze the behavior of CoSaMP for a general signal $x \in \mathbb{R}^N$ which may not be sparse.

- We decompose $x$ into its best $K$-sparse approximation and the corresponding approximation error.
- Further, we absorb the approximation error term into the measurement error term.
- Since the sparse case analysis is applicable for an arbitrary measurement error, this approach gives us an upper bound on the performance of CoSaMP over arbitrary signals.
We start with writing

$$
x = x|_K + (x - x|_K)
$$

where $x|_K$ is the best $K$-term approximation of $x$ (Theorem 18.18).

Thus we have

$$
y = \Phi x + e = \Phi x|_K + \Phi (x - x|_K) + e.
$$

We define

$$
\widehat{e} = \Phi (x - x|_K) + e.
$$

This lets us write

$$
y = \Phi x|_K + \widehat{e}.
$$

- In this formulation, the problem is equivalent to recovering the $K$-sparse signal $x|_K$ from the measurement vector $y$ in the presence of the modified measurement error $\widehat{e}$.
- The results of CoSaMP Analysis Sparse Case, and in particular the iteration invariant Theorem 21.4, apply directly.

The remaining problem is to estimate the norm of the modified error $\widehat{e}$.

We have

$$
\| \widehat{e} \|_2 \leq \| \Phi (x - x|_K) \|_2 + \| e \|_2.
$$

Another result for RIP, on the energy bound of the embedding of arbitrary signals, from Theorem 18.68 gives us

$$
\| \Phi (x - x|_K) \|_2 \leq \sqrt{1 + \delta_K} \left( \| x - x|_K \|_2 + \frac{1}{\sqrt{K}} \| x - x|_K \|_1 \right).
$$

Thus we have an upper bound on $\| \widehat{e} \|_2$ given by

$$
\| \widehat{e} \|_2 \leq \sqrt{1 + \delta_K} \left( \| x - x|_K \|_2 + \frac{1}{\sqrt{K}} \| x - x|_K \|_1 \right) + \| e \|_2.
$$

Since we have assumed throughout that $\delta_K \leq \delta_{4K} \leq 0.1$, it gives us

$$
\| \widehat{e} \|_2 \leq 1.05 \left( \| x - x|_K \|_2 + \frac{1}{\sqrt{K}} \| x - x|_K \|_1 \right) + \| e \|_2.
$$

This inequality is able to combine the measurement error and the approximation error into a single expression.
To fix ideas further, we define the notion of unrecoverable energy in CoSaMP.
(Unrecoverable energy)
The unrecoverable energy in the CoSaMP algorithm is defined as

$$
\nu = \| x - x|_K \|_2 + \frac{1}{\sqrt{K}} \| x - x|_K \|_1 + \| e \|_2.
$$

This quantity measures the baseline error in the CoSaMP recovery, consisting of the measurement error and the approximation error.
It is obvious that whenever $x$ is $K$-sparse, the approximation error terms vanish and the unrecoverable energy reduces to the measurement noise: $\nu = \| e \|_2$.
We summarize this analysis in the following lemma.

Let $x \in \mathbb{R}^N$ be an arbitrary signal and let $\widehat{e} = \Phi (x - x|_K) + e$ be the modified measurement error. Then

$$
\| \widehat{e} \|_2 \leq 1.05 \left( \| x - x|_K \|_2 + \frac{1}{\sqrt{K}} \| x - x|_K \|_1 \right) + \| e \|_2 \leq 1.05 \, \nu.
$$
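Computing the unrecoverable energy for a given signal, sparsity level and noise vector is straightforward; a sketch (with an assumed helper name) follows.

```python
import numpy as np

def unrecoverable_energy(x, K, e):
    """nu = ||x - x|_K||_2 + ||x - x|_K||_1 / sqrt(K) + ||e||_2."""
    tail = x.copy()
    keep = np.argpartition(np.abs(x), -K)[-K:]
    tail[keep] = 0.0                       # approximation error x - x|_K
    return (np.linalg.norm(tail)
            + np.linalg.norm(tail, 1) / np.sqrt(K)
            + np.linalg.norm(e))
```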
We now move ahead with the development of the iteration invariant for the general case.
Invoking Theorem 21.4 for the iteration invariant for recovery of a sparse signal (applied to $x|_K$ with the modified error $\widehat{e}$) gives us

$$
\| x|_K - x^k \|_2 \leq 0.5 \, \| x|_K - x^{k-1} \|_2 + 7.5 \, \| \widehat{e} \|_2.
$$
What remains is to generalize this inequality from $x|_K$ to the arbitrary signal $x$ itself.

We can write the invariant as

$$
\| x|_K - x^k \|_2 \leq 0.5 \, \| x|_K - x^{k-1} \|_2 + 7.5 \, \| \widehat{e} \|_2.
$$

We simplify the L.H.S. using the triangle inequality as

$$
\| x|_K - x^k \|_2 \geq \| x - x^k \|_2 - \| x - x|_K \|_2.
$$

On the R.H.S. we expand as

$$
\| x|_K - x^{k-1} \|_2 \leq \| x - x^{k-1} \|_2 + \| x - x|_K \|_2.
$$

Combining the two we get

$$
\| x - x^k \|_2 \leq 0.5 \, \| x - x^{k-1} \|_2 + 1.5 \, \| x - x|_K \|_2 + 7.5 \, \| \widehat{e} \|_2.
$$

Putting the estimate of $\| \widehat{e} \|_2$ from Lemma 21.6 we get

$$
\| x - x^k \|_2 \leq 0.5 \, \| x - x^{k-1} \|_2 + 1.5 \, \| x - x|_K \|_2 + 7.875 \, \nu.
$$

Now

$$
\| x - x|_K \|_2 \leq \nu.
$$

Thus we write a simplified expression

$$
\| x - x^k \|_2 \leq 0.5 \, \| x - x^{k-1} \|_2 + 10 \, \nu
$$

where $\nu$ is the unrecoverable energy (Definition 21.2).
We can summarize the analysis for the general case in the following theorem.
(CoSaMP iteration invariant for general signal recovery)
Assume that $x \in \mathbb{R}^N$ is an arbitrary signal and that $\Phi$ satisfies RIP of order $4K$ with $\delta_{4K} \leq 0.1$. Then for every iteration $k \geq 1$ of CoSaMP,

$$
\| x - x^k \|_2 \leq 0.5 \, \| x - x^{k-1} \|_2 + 10 \, \nu.
$$

In particular,

$$
\| x - x^k \|_2 \leq 2^{-k} \| x \|_2 + 20 \, \nu.
$$
Proof. The 1st result was developed in the discussion before the theorem. For the 2nd result in this theorem, we add up the error at each stage as

$$
\| x - x^k \|_2 \leq 0.5^k \| x - x^0 \|_2 + 10 \left( 1 + 0.5 + 0.5^2 + \dots \right) \nu \leq 2^{-k} \| x - x^0 \|_2 + 20 \, \nu.
$$

At $k = 0$ we have $x^0 = 0$, hence $\| x - x^0 \|_2 = \| x \|_2$ and the result follows.
21.4.3.1. SNR Analysis#
A sensible though unusual definition of the signal-to-noise ratio (as proposed in [61]) is as follows:

$$
\text{SNR} = 10 \log_{10} \left( \frac{\| x \|_2}{\nu} \right) \text{ dB}.
\tag{21.30}
$$

- The whole of the unrecoverable energy is treated as noise.
- The signal's $\ell_2$ norm, rather than its square, is being treated as the measure of its energy. This is the unusual part.
- Yet, given the way the iteration invariant (in terms of $\ell_2$ norms) has been developed, this definition is quite sensible.
- Further, we define the reconstruction SNR or recovery SNR as the ratio between the signal energy and the recovery error energy:

  $$
  \text{R-SNR} = 10 \log_{10} \left( \frac{\| x \|_2}{\| x - x^k \|_2} \right) \text{ dB}.
  \tag{21.31}
  $$

- Both SNR and R-SNR are expressed in dB.
- Certainly, the larger the reconstruction SNR, the better the quality of recovery.
Let us look closely at the iteration invariant

$$
\| x - x^k \|_2 \leq 2^{-k} \| x \|_2 + 20 \, \nu.
$$

- In the initial iterations, the $2^{-k} \| x \|_2$ term dominates the R.H.S.
- Assuming $2^{-k} \| x \|_2 \geq 20 \nu$, we can write

  $$
  \| x - x^k \|_2 \leq 2 \cdot 2^{-k} \| x \|_2.
  $$

- This gives us

  $$
  \text{R-SNR} \geq 10 \log_{10} \left( \frac{\| x \|_2}{2^{-(k-1)} \| x \|_2} \right) = (k - 1) \, 10 \log_{10} 2 \approx 3 (k - 1) \text{ dB}.
  $$

- In the later iterations, the $20 \nu$ term dominates the R.H.S.
- Assuming $2^{-k} \| x \|_2 \leq 20 \nu$, we can write

  $$
  \| x - x^k \|_2 \leq 40 \, \nu.
  $$

- This gives us

  $$
  \text{R-SNR} \geq 10 \log_{10} \left( \frac{\| x \|_2}{40 \, \nu} \right) = \text{SNR} - 10 \log_{10} 40 \approx \text{SNR} - 16 \text{ dB}.
  $$

We combine these two results into the following:

$$
\text{R-SNR} \geq \min \{ 3 (k - 1), \ \text{SNR} - 16 \} \text{ dB}.
\tag{21.32}
$$

- This result tells us that in the initial iterations the reconstruction SNR keeps improving by approximately 3 dB per iteration till it hits the noise floor given by $\text{SNR} - 16$ dB.
- Thus, roughly, the number of iterations required for converging to the noise floor is given by

  $$
  k \approx \frac{\text{SNR}}{3};
  $$

  a small numeric check of these quantities is sketched below.
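The following short snippet evaluates these quantities for assumed example values of $\| x \|_2$, $\nu$ and the recovery error, following (21.30)-(21.32).

```python
import numpy as np

x_norm, nu, rec_err = 10.0, 0.05, 0.2            # assumed example values
snr = 10 * np.log10(x_norm / nu)                 # SNR as in (21.30), in dB
r_snr = 10 * np.log10(x_norm / rec_err)          # reconstruction SNR (21.31), in dB
noise_floor = snr - 16                           # approximate noise floor, in dB
iters_to_floor = int(np.ceil(snr / 3))           # rough iteration count estimate
print(snr, r_snr, noise_floor, iters_to_floor)
```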