This repo implements Contrastive Multiview Coding (CMC), initially implemented here, combined with Momentum Contrast (MoCo). The final model is an image encoder that outputs a fixed-size vector (128 dimensions by default).
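As a rough illustration of how the two ideas fit together, the NumPy sketch below computes an InfoNCE-style loss between embeddings of two views of the same images, using a MoCo-style queue of negatives and a momentum update of the key encoder's parameters. It is a minimal sketch, not this repo's training code: the function names, the temperature of 0.07, and the momentum of 0.999 are assumptions chosen for illustration.

```python
import numpy as np


def l2_normalize(x, axis=1, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)


def info_nce(q, k, queue, temperature=0.07):
    """InfoNCE loss with one positive per query and K queued negatives.

    q:     (N, D) embeddings of view 1 (queries)
    k:     (N, D) embeddings of view 2 of the same images (positive keys)
    queue: (K, D) embeddings of other images (negative keys)
    """
    q, k, queue = l2_normalize(q), l2_normalize(k), l2_normalize(queue)
    l_pos = np.sum(q * k, axis=1, keepdims=True)      # (N, 1)
    l_neg = q @ queue.T                               # (N, K)
    logits = np.concatenate([l_pos, l_neg], axis=1) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    # The positive sits in column 0, so this is cross-entropy with target 0.
    log_prob = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob.mean())


def momentum_update(key_params, query_params, m=0.999):
    """Move each key-encoder parameter slowly toward the query encoder's."""
    return [m * pk + (1.0 - m) * pq for pk, pq in zip(key_params, query_params)]


def enqueue(queue, keys):
    """FIFO queue update: drop the oldest keys and append the newest ones."""
    return np.concatenate([queue[len(keys):], keys], axis=0)
```

In a two-view setup the loss would typically be computed in both directions (view 1 as query against view 2 keys, and the reverse) and summed; whether this repo does so is not stated here.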
We use the STL-10 dataset to train the encoder. The supported backbone models are AlexNet, ResNet-50, and EfficientNet-B0.
We evaluate the encoder on STL-10 classification by adding an MLP with 10 outputs (one per class) on top of it. We also report the classification score obtained from features taken at different layers of the backbone model.
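The evaluation protocol can be sketched as follows: features from a chosen backbone layer are flattened, passed through a small MLP head with 10 outputs, and scored by top-1 accuracy. This is a shape-level sketch in NumPy under stated assumptions; the hidden size of 256, the initialization, and all function names are hypothetical, and the head's training loop is omitted.

```python
import numpy as np

NUM_CLASSES = 10  # STL-10 has 10 labeled classes


def flatten_features(feature_map):
    """Turn a (N, C, H, W) feature map from any layer into (N, C*H*W) vectors."""
    return feature_map.reshape(feature_map.shape[0], -1)


def init_head(in_dim, hidden_dim=256, num_classes=NUM_CLASSES, seed=0):
    """Create parameters for a one-hidden-layer MLP head (sizes are assumptions)."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(0.0, 0.01, (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, 0.01, (hidden_dim, num_classes)),
        "b2": np.zeros(num_classes),
    }


def head_forward(params, features):
    """Map frozen encoder features (N, in_dim) to class logits (N, num_classes)."""
    hidden = np.maximum(features @ params["w1"] + params["b1"], 0.0)  # ReLU
    return hidden @ params["w2"] + params["b2"]


def top1_accuracy(logits, labels):
    """Fraction of samples whose highest-scoring class matches the label."""
    return float(np.mean(np.argmax(logits, axis=1) == labels))
```

Repeating the same procedure with feature maps taken from different layers gives one accuracy per layer, which is how a layer-by-layer comparison would be produced.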
The encoder supports several color spaces that can be split into two or more views, such as Lab, YDbDr, and YPbPr.
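To make the notion of views concrete, the sketch below converts an RGB image to YPbPr with the standard BT.601 coefficients and splits the channels into a luma view (Y) and a chroma view (Pb, Pr). The conversion matrix is standard; the particular channel grouping and the function names are assumptions for illustration, and the repo's own split may differ.

```python
import numpy as np

# BT.601 RGB -> YPbPr conversion matrix (rows: Y, Pb, Pr).
RGB_TO_YPBPR = np.array([
    [0.299, 0.587, 0.114],
    [-0.168736, -0.331264, 0.5],
    [0.5, -0.418688, -0.081312],
])


def rgb_to_ypbpr(rgb):
    """Convert a float RGB image of shape (H, W, 3) in [0, 1] to YPbPr."""
    return rgb @ RGB_TO_YPBPR.T


def split_views(image, view_channels=((0,), (1, 2))):
    """Split an (H, W, C) image into views, one per group of channel indices."""
    return [image[..., list(channels)] for channels in view_channels]


# Usage: a random RGB image becomes a 1-channel and a 2-channel view.
rgb = np.random.default_rng(0).random((96, 96, 3))
luma_view, chroma_view = split_views(rgb_to_ypbpr(rgb))
```

Each view would then be fed to its own encoder, so the first layer of each encoder must accept the corresponding number of input channels (here 1 and 2).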
- Install conda environment
conda env create -f env.yml
- Download STL-10 dataset
python stl10_input.py
- Check configs in
cfgs/config.yaml
- Train the AlexNet encoder
python train
- View training metrics in TensorBoard
tensorboard --logdir exp_local
This code is inspired by