# Rethinking the Uniformity Metric in Self-Supervised Learning
This repository contains four components. First, the “Distribution Approximation” folder visualizes the asymptotic equivalence between a uniform spherical distribution and an isotropic Gaussian distribution. Second, the “Empirical Study” folder presents an empirical analysis covering dimensional collapse degrees, dimensions, the Feature Baby Constraint, the Feature Cloning Constraint, and the Instance Cloning Constraint. Third, the “Large Means” folder illustrates how large means can lead to severe representation collapse. Lastly, the “code” folder integrates the Wasserstein distance as an additional uniformity loss into several self-supervised learning methods.
To illustrate the asymptotic equivalence between a uniform spherical distribution and an isotropic Gaussian distribution, we first estimate the binning densities by running:

```bash
python ./DistributionApproximation/Density1DPlot.py
# or
python ./DistributionApproximation/Density2DPlot.py
```

Then we draw figures by running the Jupyter notebooks `Density1DPlot.ipynb` or `Density2DPlot.ipynb`. Using the estimated distributions, we visualize the 1D and 2D binning densities.
We also analyze the joint binning densities, presenting 2D joint binning densities for both distributions.
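For intuition, here is a minimal NumPy sketch, separate from the repository's scripts, of the equivalence being visualized: coordinates of points drawn uniformly from the unit hypersphere in m dimensions are compared against an isotropic Gaussian with per-coordinate variance 1/m. The dimension, sample size, and bin grid are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 256, 100_000  # embedding dimension and sample size (illustrative choices)

# Uniform samples on the unit hypersphere: normalize isotropic Gaussian draws.
sphere = rng.normal(size=(n, m))
sphere /= np.linalg.norm(sphere, axis=1, keepdims=True)

# Isotropic Gaussian whose per-coordinate variance 1/m matches the sphere's scale.
gauss = rng.normal(scale=1.0 / np.sqrt(m), size=(n, m))

# 1D binning densities of the first coordinate; the gap shrinks as m grows.
bins = np.linspace(-0.25, 0.25, 51)
dens_sphere, _ = np.histogram(sphere[:, 0], bins=bins, density=True)
dens_gauss, _ = np.histogram(gauss[:, 0], bins=bins, density=True)
print("max absolute density gap:", float(np.abs(dens_sphere - dens_gauss).max()))
```

A 2D variant would histogram two coordinates jointly, mirroring the joint binning densities mentioned above.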
We empirically compare our proposed uniformity metric with the existing uniformity metric under several settings: varying degrees of dimensional collapse, varying dimensions, the Feature Baby Constraint, the Feature Cloning Constraint, and the Instance Cloning Constraint.
To generate data reflecting varying degrees of dimensional collapse, we sample data vectors from an isotropic Gaussian distribution and normalize them to unit ℓ2 norm. We then run:

```bash
python ./EmpiricalStudy/DimensionalCollapseDegrees/AnalysisOnCollapseLevel.py
```

and draw the figures with `AnalysisOnCollapseLevel.ipynb`.
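The following standalone sketch illustrates the idea (it is not the repository's exact procedure): we zero out a growing number of coordinates before ℓ2 normalization to mimic dimensional collapse, then report the Wang & Isola uniformity loss alongside a Wasserstein-style distance between a Gaussian fit to the embeddings and N(0, I/m). The zeroing scheme and both metric implementations are assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import sqrtm

def wang_isola_uniformity(z, t=2.0):
    """Log of the mean pairwise Gaussian potential over distinct pairs."""
    # For unit-norm rows, squared Euclidean distance = 2 - 2 * cosine similarity.
    sq_dists = 2.0 - 2.0 * np.clip(z @ z.T, -1.0, 1.0)
    iu = np.triu_indices(len(z), k=1)
    return float(np.log(np.mean(np.exp(-t * sq_dists[iu]))))

def wasserstein_uniformity(z):
    """Quadratic Wasserstein distance between a Gaussian fit to z and N(0, I/m)."""
    m = z.shape[1]
    mu, cov = z.mean(axis=0), np.cov(z, rowvar=False)
    cross = np.trace(sqrtm(cov / m).real)  # tr((Sigma/m)^{1/2}) for the target I/m
    return float(np.sqrt(max(mu @ mu + np.trace(cov) + 1.0 - 2.0 * cross, 0.0)))

rng = np.random.default_rng(0)
n, m = 2048, 64
base = rng.normal(size=(n, m))

for collapsed in (0, 16, 32, 48, 60):
    z = base.copy()
    if collapsed:
        z[:, -collapsed:] = 0.0  # collapse the last `collapsed` dimensions
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    print(f"{collapsed:2d} collapsed dims: "
          f"L_U = {wang_isola_uniformity(z):.3f}, W2 = {wasserstein_uniformity(z):.3f}")
```

Comparing how the two numbers move as the collapse level increases mirrors the sensitivity analysis performed by the script above.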
We also analyze the sensitivity to the number of dimensions. We generate the data points by running:

```bash
python ./EmpiricalStudy/Dimensions/AnalysisOnDimension.py
```

and draw the figures with `AnalysisOnDimension.ipynb`.
For the Feature Cloning Constraint, we generate the data points by running:

```bash
python ./EmpiricalStudy/FeatureCloningConstraint/AnalysisOnProperty4.py
```

and draw the figures with `AnalysisOnProperty4.ipynb`.
For the Feature Baby Constraint, we generate the data points by running:

```bash
python ./EmpiricalStudy/FeatureBabyConstraint/AnalysisOnProperty5.py
```

and draw the figures with `AnalysisOnProperty5.ipynb`.
For the Instance Cloning Constraint, we generate the data points by running:

```bash
python ./EmpiricalStudy/InstanceCloningConstraint/AnalysisOnProperty3.py
```

and draw the figures with `AnalysisOnProperty3.ipynb`.
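As a self-contained toy illustration of the kind of check these scripts perform, assuming the Instance Cloning Constraint refers to duplicating every embedding (under which an ideal uniformity metric should stay essentially unchanged), one can compare the metrics before and after cloning. Both metric implementations below are simplified sketches, not the repository's code.

```python
import numpy as np
from scipy.linalg import sqrtm

def wang_isola_uniformity(z, t=2.0):
    sq = 2.0 - 2.0 * np.clip(z @ z.T, -1.0, 1.0)  # pairwise squared distances on the sphere
    return float(np.log(np.mean(np.exp(-t * sq[np.triu_indices(len(z), k=1)]))))

def wasserstein_uniformity(z):
    m, mu, cov = z.shape[1], z.mean(axis=0), np.cov(z, rowvar=False)
    gap = mu @ mu + np.trace(cov) + 1.0 - 2.0 * np.trace(sqrtm(cov / m).real)
    return float(np.sqrt(max(gap, 0.0)))

rng = np.random.default_rng(0)
z = rng.normal(size=(1024, 32))
z /= np.linalg.norm(z, axis=1, keepdims=True)
cloned = np.concatenate([z, z], axis=0)  # clone every instance once

for name, batch in [("original", z), ("cloned", cloned)]:
    print(f"{name:8s}: L_U = {wang_isola_uniformity(batch):.4f}, "
          f"W2 = {wasserstein_uniformity(batch):.4f}")
```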
To investigate the influence of the mean on uniformity, we consider 2D distributions with different means and visualize them by running:

```bash
python ./LargeMeans/PlotMean2D.py
```

It is clear that an excessively large mean will cause representations to collapse to a single point, even if the covariance matrix is isotropic.
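A tiny sketch of the same phenomenon, separate from `PlotMean2D.py` and using an arbitrary choice of means: as the mean grows, ℓ2-normalized samples from an isotropic 2D Gaussian concentrate around a single direction, which shows up as a shrinking angular spread.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

for mean in (0.0, 1.0, 5.0, 20.0):
    x = rng.normal(loc=mean, scale=1.0, size=(n, 2))  # isotropic covariance, shifted mean
    x /= np.linalg.norm(x, axis=1, keepdims=True)      # project onto the unit circle
    angles = np.arctan2(x[:, 1], x[:, 0])
    # Circular spread: 1 - |mean resultant vector|; near 0 means collapse to one point.
    spread = 1.0 - np.hypot(np.cos(angles).mean(), np.sin(angles).mean())
    print(f"mean = {mean:5.1f}: circular spread = {spread:.3f}")
```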
In the “code” folder, we integrate our proposed uniformity loss, based on the Wasserstein distance, into several self-supervised learning methods.
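As a hedged sketch of what such a regularizer can look like in PyTorch (a simplified, diagonal-covariance approximation written for illustration, not the repository's implementation; the weight `lam`, the `ssl_loss`/`embeddings` names, and the target distribution N(0, I/m) are assumptions), one could add a Wasserstein-style uniformity term to an existing self-supervised loss:

```python
import torch

def wasserstein_uniformity_loss(z: torch.Tensor) -> torch.Tensor:
    """Squared quadratic Wasserstein distance between a diagonal Gaussian fit to the
    batch embeddings and the target N(0, I/m); smaller means more uniform."""
    z = torch.nn.functional.normalize(z, dim=1)  # work on the unit hypersphere
    m = z.shape[1]
    mu = z.mean(dim=0)
    std = z.std(dim=0, unbiased=False)
    # ||mu||^2 + sum_j (std_j - 1/sqrt(m))^2: the closed form with diagonal covariance.
    return mu.pow(2).sum() + (std - m ** -0.5).pow(2).sum()

# Example usage inside a training step (ssl_loss and embeddings come from the base method):
# lam = 1.0                                   # regularization weight (assumed)
# loss = ssl_loss + lam * wasserstein_uniformity_loss(embeddings)

z = torch.randn(512, 128, requires_grad=True)
print(float(wasserstein_uniformity_loss(z)))   # differentiable scalar
```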
Train and evaluate BYOL on either the CIFAR-10 or CIFAR-100 dataset without our proposed uniformity loss:

```bash
bash ./code/BYOL/run_vanilla_byol.sh
```

Train and evaluate BYOL with our proposed uniformity loss:

```bash
bash ./code/BYOL/run_byol+w2.sh
```

Train and evaluate Barlow Twins without our proposed uniformity loss:

```bash
bash ./code/BarlowTwins/run_vanilla_barlowtwins.sh
```

Train and evaluate Barlow Twins with our proposed uniformity loss:

```bash
bash ./code/BarlowTwins/run_barlowtwins+w2.sh
```

Train and evaluate MoCo v2 without our proposed uniformity loss:

```bash
bash ./code/MoCov2/run_vanilla_moco.sh
```

Train and evaluate MoCo v2 with our proposed uniformity loss:

```bash
bash ./code/MoCov2/run_moco+w2.sh
```

We report the experimental results in the table below:
| Methods | Proj. | Pred. | CIFAR-10 Acc@1↑ | CIFAR-10 Acc@5↑ | CIFAR-10 W2↓ | CIFAR-10 L_U↓ | CIFAR-10 Align.↓ | CIFAR-100 Acc@1↑ | CIFAR-100 Acc@5↑ | CIFAR-100 W2↓ | CIFAR-100 L_U↓ | CIFAR-100 Align.↓ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MoCo v2 | 256 | ✘ | 90.65 | 99.81 | 1.06 | -3.75 | 0.51 | 60.27 | 86.29 | 1.07 | -3.60 | 0.46 |
| MoCo v2 + L_U | 256 | ✘ | 90.98 ↑₀.₃₃ | 99.67 | 0.98 ↑₀.₀₈ | -3.82 | 0.53 ↓₀.₀₂ | 61.21 ↑₀.₉₄ | 87.32 | 0.98 ↑₀.₀₉ | -3.81 | 0.52 ↓₀.₀₆ |
| MoCo v2 + W2 | 256 | ✘ | 91.41 ↑₀.₇₆ | 99.68 | 0.33 ↑₀.₇₃ | -3.84 | 0.63 ↓₀.₁₂ | 63.68 ↑₃.₄₁ | 88.48 | 0.28 ↑₀.₇₉ | -3.86 | 0.66 ↓₀.₂₀ |
| BYOL | 256 | 256 | 89.53 | 99.71 | 1.21 | -2.99 | 0.31 | 63.66 | 88.81 | 1.20 | -2.87 | 0.33 |
| BYOL + L_U | 256 | ✘ | 90.09 ↑₀.₅₆ | 99.75 | 1.09 ↑₀.₁₂ | -3.66 | 0.40 ↓₀.₀₉ | 62.68 ↓₀.₉₈ | 88.44 | 1.08 ↑₀.₁₂ | -3.70 | 0.51 ↓₀.₁₈ |
| BYOL + W2 | 256 | 256 | 90.31 ↑₀.₇₈ | 99.77 | 0.38 ↑₀.₈₃ | -3.90 | 0.65 ↓₀.₃₄ | 65.16 ↑₁.₅₀ | 89.25 | 0.36 ↑₀.₈₄ | -3.91 | 0.69 ↓₀.₃₆ |
| BarlowTwins | 256 | ✘ | 91.16 | 99.80 | 0.22 | -3.91 | 0.75 | 68.19 | 90.64 | 0.23 | -3.91 | 0.75 |
| BarlowTwins + L_U | 256 | ✘ | 91.38 ↑₀.₂₂ | 99.77 | 0.21 ↑₀.₀₁ | -3.92 | 0.76 ↓₀.₀₁ | 68.41 ↑₀.₂₂ | 90.99 | 0.22 ↑₀.₀₁ | -3.91 | 0.76 ↓₀.₀₁ |
| BarlowTwins + W2 | 256 | ✘ | 91.43 ↑₀.₂₇ | 99.78 | 0.19 ↑₀.₀₃ | -3.92 | 0.76 ↓₀.₀₁ | 68.47 ↑₀.₂₈ | 90.64 | 0.19 ↑₀.₀₄ | -3.91 | 0.79 ↓₀.₀₄ |
| Zero-CL | 256 | ✘ | 91.35 | 99.74 | 0.15 | -3.94 | 0.70 | 68.50 | 90.97 | 0.15 | -3.93 | 0.75 |
| Zero-CL + L_U | 256 | ✘ | 91.28 ↓₀.₀₇ | 99.74 | 0.15 | -3.94 | 0.72 ↓₀.₀₂ | 68.44 ↓₀.₀₆ | 90.91 | 0.15 | -3.93 | 0.74 ↑₀.₀₁ |
| Zero-CL + W2 | 256 | ✘ | 91.42 ↑₀.₀₇ | 99.82 | 0.14 ↑₀.₀₁ | -3.94 | 0.71 ↓₀.₀₁ | 68.55 ↑₀.₀₅ | 91.02 | 0.14 ↑₀.₀₁ | -3.94 | 0.76 ↓₀.₀₁ |
If our paper assists your research, feel free to give us a star or cite us using:
```bibtex
@inproceedings{Fang2024RethinkingTU,
  title={Rethinking the Uniformity Metric in Self-Supervised Learning},
  author={Xianghong Fang and Jian Li and Qiang Sun and Benyou Wang},
  booktitle={ICLR},
  year={2024}
}
```







