SPICE: Semantic Pseudo-labeling for Image Clustering
The similarity among samples and the discrepancy between clusters are two
crucial aspects of image clustering. However, current deep clustering methods
suffer from the inaccurate estimation of either feature similarity or semantic
discrepancy. In this paper, we present a Semantic Pseudo-labeling-based Image
ClustEring (SPICE) framework, which divides the clustering network into a
feature model for measuring the instance-level similarity and a clustering head
for identifying the cluster-level discrepancy. We design two semantics-aware
pseudo-labeling algorithms, prototype pseudo-labeling and reliable
pseudo-labeling, which enable accurate and reliable self-supervision over
clustering. Without using any ground-truth label, we optimize the clustering
network in three stages: 1) train the feature model through contrastive
learning to measure the instance similarity, 2) train the clustering head with
the prototype pseudo-labeling algorithm to identify cluster semantics, and 3)
jointly train the feature model and clustering head with the reliable
pseudo-labeling algorithm to improve the clustering performance. Extensive
experimental results demonstrate that SPICE achieves significant improvements
(~10%) over existing methods and establishes new state-of-the-art
clustering results on six image benchmark datasets in terms of three popular
metrics. Importantly, SPICE significantly reduces the gap between unsupervised
and fully supervised classification; e.g., there is only a 2-percentage-point (91.8% vs. 93.8%)
accuracy difference on CIFAR-10. Our code is publicly available at
https://github.com/niuchuangnn/SPICE.
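
To make the second stage more concrete, the following is a minimal sketch of one plausible reading of prototype pseudo-labeling. The abstract does not specify the procedure, so the details are assumptions: each cluster prototype is taken as the mean feature of the samples the clustering head is most confident about, and the samples nearest to that prototype (by cosine similarity) receive the cluster's label. The function name `prototype_pseudo_labels` and the parameter `top_k` are hypothetical and are not taken from the SPICE code.

```python
import numpy as np

def prototype_pseudo_labels(features, probs, top_k):
    """Illustrative sketch (assumed procedure, not the authors' code).

    features: (n, d) array from the feature model, rows assumed nonzero.
    probs:    (n, c) array of cluster probabilities from the clustering head.
    top_k:    hypothetical number of samples used per cluster.
    Returns an (n,) array of pseudo-labels, with -1 for unlabeled samples.
    """
    # L2-normalize so that dot products are cosine similarities.
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    labels = np.full(len(feats), -1)
    for k in range(probs.shape[1]):
        # Samples the clustering head assigns to cluster k most confidently.
        confident = np.argsort(-probs[:, k])[:top_k]
        # Assumed prototype: mean feature of those confident samples.
        prototype = feats[confident].mean(axis=0)
        # The samples nearest to the prototype receive pseudo-label k.
        # If a sample is near several prototypes, the later cluster wins here.
        nearest = np.argsort(-(feats @ prototype))[:top_k]
        labels[nearest] = k
    return labels

# Toy usage: two well-separated groups in a 2-D feature space.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]])
labels = prototype_pseudo_labels(features, probs, top_k=2)
```

In this sketch the pseudo-labels depend on feature-space neighborhoods rather than on the clustering head's predictions alone, which is one way the instance-level similarity from stage 1 could supervise the cluster-level head in stage 2.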