17.8. Evaluation

There are two possible approaches for evaluating a clustering algorithm.

  1. Full reference

  2. No reference

In full reference evaluation, a ground truth describing an ideal clustering of the data is given. For example, suppose our goal is to cluster a set of facial images of different persons. The ground truth in this case is the name/id of the person associated with each image. In no reference evaluation, no ground truth is available and the quality of a clustering must be judged from the data itself.

17.8.1. Full Reference Evaluation

Let $Y$ be a given set of data points. Assume that the ideal/reference clustering of $Y$ is given beforehand.

  1. Assume that the dataset $Y$ can be divided into $K$ clusters.

  2. Let these clusters be named $Y_1, \dots, Y_K$.

  3. Assume that it is known which point belongs to which cluster.

In general, a clustering $\mathcal{C}$ of a set $Y$ constructed by a clustering algorithm is a set $\{C_1, \dots, C_C\}$ of non-empty disjoint subsets of $Y$ whose union equals $Y$. Clearly, $|C_c| > 0$.

The clustering process may make a number of mistakes.

  1. It may identify an incorrect number of clusters, so that $C$ may not be equal to $K$.

  2. Moreover, even if $C = K$, the data points may be placed in the wrong clusters.

Ideally, we want $C = K$ and $C_c = Y_k$ with a bijective mapping between $1 \leq c \leq C$ and $1 \leq k \leq K$. In practice, a clustering algorithm estimates the number of clusters $C$ and assigns a label $l_s$, $1 \leq s \leq S$, to each vector $y_s$, where $1 \leq l_s \leq C$.
All the labels can be put in a label vector $L$ where $L \in \{1, \dots, C\}^S$. The permutation matrix $\Gamma$ can be easily obtained from $L$.

Following [85], we will quickly establish the commonly used measures for clustering performance.

  1. We have a reference clustering of the vectors in $Y$ given by $\mathcal{B} = \{Y_1, \dots, Y_K\}$, which is known to us in advance (either by construction in synthetic experiments or as ground truth with real-life datasets).

  2. The clustering obtained by the algorithm is given by $\mathcal{C} = \{C_1, \dots, C_C\}$.

For two arbitrary points $y_i, y_j \in Y$, there are four possibilities (see the counting sketch after this list):

  1. they belong to the same cluster in both $\mathcal{B}$ and $\mathcal{C}$ (true positive),

  2. they are in the same cluster in $\mathcal{B}$ but in different clusters in $\mathcal{C}$ (false negative),

  3. they are in different clusters in $\mathcal{B}$ but in the same cluster in $\mathcal{C}$ (false positive),

  4. they are in different clusters in both $\mathcal{B}$ and $\mathcal{C}$ (true negative).
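As a minimal sketch of these pairwise counts, the snippet below tallies the four cases directly. The arrays `ref_labels` and `pred_labels` are hypothetical names for the per-point cluster indices in $\mathcal{B}$ and $\mathcal{C}$; they are not part of any particular library.

```python
import numpy as np
from itertools import combinations

def pairwise_counts(ref_labels, pred_labels):
    """Count pairwise TP/FN/FP/TN between a reference and a computed clustering.

    ref_labels, pred_labels: 1-D integer sequences of length S giving the
    cluster index of each point in B (reference) and C (computed).
    """
    tp = fn = fp = tn = 0
    for i, j in combinations(range(len(ref_labels)), 2):
        same_ref = ref_labels[i] == ref_labels[j]     # same cluster in B?
        same_pred = pred_labels[i] == pred_labels[j]  # same cluster in C?
        if same_ref and same_pred:
            tp += 1   # together in both clusterings
        elif same_ref and not same_pred:
            fn += 1   # together in B, separated in C
        elif not same_ref and same_pred:
            fp += 1   # separated in B, together in C
        else:
            tn += 1   # separated in both clusterings
    return tp, fn, fp, tn

# Toy example: 6 points, reference has 2 clusters, the algorithm found 3.
ref = np.array([0, 0, 0, 1, 1, 1])
pred = np.array([0, 0, 1, 1, 2, 2])
print(pairwise_counts(ref, pred))  # (2, 4, 1, 8)
```

The explicit loop is quadratic in the number of points; it is kept that way here only to mirror the four cases listed above.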

Consider some clusters $Y_i \in \mathcal{B}$ and $C_j \in \mathcal{C}$.

  1. The elements common to $Y_i$ and $C_j$ are given by $Y_i \cap C_j$.

  2. We define

     $$
     \text{precision}_{ij} \triangleq \frac{|Y_i \cap C_j|}{|C_j|}.
     $$

  3. We define the overall precision for $C_j$ as

     $$
     \text{precision}(C_j) \triangleq \max_i \left( \text{precision}_{ij} \right).
     $$

  4. We define

     $$
     \text{recall}_{ij} \triangleq \frac{|Y_i \cap C_j|}{|Y_i|}.
     $$

  5. We define the overall recall for $Y_i$ as

     $$
     \text{recall}(Y_i) \triangleq \max_j \left( \text{recall}_{ij} \right).
     $$

  6. We define the F-score as

     $$
     F_{ij} \triangleq \frac{2 \, \text{precision}_{ij} \, \text{recall}_{ij}}{\text{precision}_{ij} + \text{recall}_{ij}}.
     $$

  7. We define the overall F-score for $Y_i$ as

     $$
     F(Y_i) \triangleq \max_j \left( F_{ij} \right).
     $$
  8. We note that the cluster $C_j$ for which the maximum is achieved is the best matching cluster for $Y_i$.

  9. Finally, we define the overall F-score for the clustering as

     $$
     F(\mathcal{B}, \mathcal{C}) \triangleq \frac{1}{S} \sum_{i=1}^{K} |Y_i| \, F(Y_i),
     $$

     where $S$ is the total number of vectors in $Y$. A computational sketch of this measure follows the list.
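The overall F-score can be computed directly from these definitions. The following is a minimal sketch under the same assumptions as before (hypothetical `ref_labels`/`pred_labels` arrays and plain NumPy); it is not tied to any particular library's API.

```python
import numpy as np

def clustering_f_score(ref_labels, pred_labels):
    """Overall F-score F(B, C): |Y_i|-weighted average of each cluster's best F."""
    ref_ids = np.unique(ref_labels)    # reference clusters Y_1 ... Y_K
    pred_ids = np.unique(pred_labels)  # computed clusters C_1 ... C_C
    S = len(ref_labels)
    total = 0.0
    for yi in ref_ids:
        in_yi = ref_labels == yi
        best_f = 0.0
        for cj in pred_ids:
            in_cj = pred_labels == cj
            common = np.sum(in_yi & in_cj)        # |Y_i ∩ C_j|
            if common == 0:
                continue
            precision = common / np.sum(in_cj)    # |Y_i ∩ C_j| / |C_j|
            recall = common / np.sum(in_yi)       # |Y_i ∩ C_j| / |Y_i|
            f_ij = 2 * precision * recall / (precision + recall)
            best_f = max(best_f, f_ij)            # F(Y_i) = max_j F_ij
        total += np.sum(in_yi) * best_f           # weight by |Y_i|
    return total / S

ref = np.array([0, 0, 0, 1, 1, 1])
pred = np.array([0, 0, 1, 1, 2, 2])
print(round(clustering_f_score(ref, pred), 3))  # 0.8
```

For these toy labels the score is 0.8: each reference cluster overlaps its best matching computed cluster in two of its three points.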

We also define a clustering ratio given by the factor

$$
\eta \triangleq \frac{C}{K}.
$$

There are different ways to define clustering error. For the special case where the number of clusters is known in advance, and we ensure that the dataset is divided into exactly that many clusters, it is possible to define the subspace clustering error as follows:

$$
\text{clustering error} = \frac{\text{\# of misclassified points}}{\text{total \# of points}}.
$$

The definition is adopted from [32] so that the results in this paper can be compared with theirs. This definition can be used only after a proper one-to-one mapping between the original labels and the cluster labels assigned by the clustering algorithm has been identified. We can compute this mapping by comparing F-scores.
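A sketch of this computation is given below. As an assumption on my part, the one-to-one label mapping is found here by maximizing the total overlap with the Hungarian algorithm (SciPy's `linear_sum_assignment`) rather than by scanning F-scores; the label arrays are again hypothetical and assume $C = K$.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_error(ref_labels, pred_labels):
    """Fraction of misclassified points after the best one-to-one label mapping.

    Assumes the number of clusters is known in advance, so both labelings use
    the same number of distinct labels (C = K).
    """
    ref_ids = np.unique(ref_labels)
    pred_ids = np.unique(pred_labels)
    # Contingency table: overlap[k, c] = |Y_k ∩ C_c|
    overlap = np.array([[np.sum((ref_labels == yk) & (pred_labels == cc))
                         for cc in pred_ids] for yk in ref_ids])
    # Hungarian algorithm: pick the label mapping maximizing the total overlap.
    row, col = linear_sum_assignment(-overlap)
    correct = overlap[row, col].sum()
    return 1.0 - correct / len(ref_labels)

ref = np.array([0, 0, 0, 1, 1, 1])
pred = np.array([1, 1, 0, 0, 0, 0])
print(round(clustering_error(ref, pred), 3))  # 0.167, i.e. 1 of 6 points misclassified
```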