Self-supervised Learning for Images
No. | Model Name | Title | Links | Pub. | Organization | Release Time |
---|---|---|---|---|---|---|
1 | iGPT | Generative Pretraining from Pixels | paper code | ICML 2020 | OpenAI | 17 Jun 2020 |
2 | MST | MST: Masked Self-Supervised Transformer for Visual Representation | paper | NeurIPS 2021 | Chinese Academy of Sciences | 10 Jun 2021 |
3 | BEiT | BEiT: BERT Pre-Training of Image Transformers | paper code | ICLR 2022 | Microsoft Research | 15 Jun 2021 |
4 | MAE | Masked Autoencoders Are Scalable Vision Learners | paper code | CVPR 2022 | Meta | 11 Nov 2021 |
5 | iBOT | iBOT: Image BERT Pre-Training with Online Tokenizer | paper code | ICLR 2022 | ByteDance | 15 Nov 2021 |
6 | SimMIM | SimMIM: A Simple Framework for Masked Image Modeling | paper code | CVPR 2022 | MSRA | 18 Nov 2021 |
7 | PeCo | PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers | paper | arXiv | University of Science and Technology of China | 24 Nov 2021 |
8 | MaskFeat | Masked Feature Prediction for Self-Supervised Visual Pre-Training | paper | CVPR 2022 | Meta | 16 Dec 2021 |
9 | SplitMask | Are Large-scale Datasets Necessary for Self-Supervised Pre-training? | paper | arXiv | Meta | 20 Dec 2021 |
10 | ADIOS | Adversarial Masking for Self-Supervised Learning | paper | ICML 2022 | University of Oxford | 31 Jan 2022 |
11 | CAE | Context Autoencoder for Self-Supervised Representation Learning | paper | arXiv | Peking University | 7 Feb 2022 |
12 | CIM | Corrupted Image Modeling for Self-Supervised Visual Pre-Training | paper code | arXiv | Microsoft | 7 Feb 2022 |
13 | ConvMAE | ConvMAE: Masked Convolution Meets Masked Autoencoders | paper code | arXiv | Shanghai AI Laboratory | 19 May 2022 |
14 | Uniform Masking | Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality | paper code | arXiv | Nanjing University of Science and Technology | 20 May 2022 |
15 | LoMaR | Efficient Self-supervised Vision Pretraining with Local Masked Reconstruction | paper code | arXiv | KAUST | 1 Jun 2022 |
16 | M3AE | Multimodal Masked Autoencoders Learn Transferable Representations | paper | arXiv | UC Berkeley | 31 May 2022 |
17 | HiViT | HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling | paper | arXiv | University of Chinese Academy of Sciences | 30 May 2022 |
18 | GreenMiM | Green Hierarchical Vision Transformer for Masked Image Modeling | paper code | arXiv | The University of Tokyo | 26 May 2022 |
19 | A^2MIM | Architecture-Agnostic Masked Image Modeling – From ViT back to CNN | paper | arXiv | AI Lab, Westlake University | 1 Jun 2022 |
20 | MixMIM | MixMIM: Mixed and Masked Image Modeling for Efficient Visual Representation Learning | paper code | arXiv | SenseTime Research | 28 May 2022 |
21 | SemMAE | SemMAE: Semantic-Guided Masking for Learning Masked Autoencoders | paper | arXiv | Chinese Academy of Sciences | 21 Jun 2022 |
22 | Voxel-MAE | Voxel-MAE: Masked Autoencoders for Pre-training Large-scale Point Clouds | paper code | arXiv | Peking University | 20 Jun 2022 |
23 | BootMAE | Bootstrapped Masked Autoencoders for Vision BERT Pretraining | paper code | ECCV 2022 | University of Science and Technology of China | 14 Jul 2022 |
24 | OmniMAE | OmniMAE: Single Model Masked Pretraining on Images and Videos | paper code | arXiv | Meta AI | 16 Jun 2022 |
25 | SatMAE | SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery | paper | arXiv | Stanford University | 17 Jul 2022 |
26 | CMAE | Contrastive Masked Autoencoders are Stronger Vision Learners | paper | arXiv | University of Science and Technology | 27 Jul 2022 |
27 | BEiT v2 | BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers | paper | arXiv | University of Chinese Academy of Sciences | 12 Aug 2022 |
28 | BEiT v3 | Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks | paper | arXiv | Microsoft Corporation | 22 Aug 2022 |
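Nearly all of the image methods above share one masked-image-modeling recipe: split the image into patch tokens, hide a large random subset, run only the visible tokens through the encoder, and train a light decoder to reconstruct the hidden ones. Below is a minimal NumPy sketch of the per-sample random masking step; the 75% default follows MAE, while the function name and shapes are illustrative assumptions, not taken from any of the codebases above.

```python
import numpy as np

def random_masking(patches: np.ndarray, mask_ratio: float = 0.75, seed: int = 0):
    """Split a sequence of patch tokens into visible and masked subsets,
    MAE-style: the encoder only ever sees the visible 25%."""
    rng = np.random.default_rng(seed)
    n = patches.shape[0]
    n_keep = int(n * (1 - mask_ratio))
    perm = rng.permutation(n)            # random shuffle of patch indices
    keep_idx = np.sort(perm[:n_keep])    # visible patches -> encoder input
    mask_idx = np.sort(perm[n_keep:])    # hidden patches -> reconstruction targets
    return patches[keep_idx], keep_idx, mask_idx

# toy example: a 224x224 image as 14x14 = 196 patch tokens of dim 768
patches = np.random.randn(196, 768)
visible, keep_idx, mask_idx = random_masking(patches)
print(visible.shape)  # (49, 768): only 25% of tokens reach the encoder
```

The entries in the table mostly differ in what fills the blanks (raw pixels for MAE and SimMIM, discrete visual tokens for BEiT and PeCo, HOG features for MaskFeat) and in how the mask is chosen (attention-guided in MST, adversarial in ADIOS, semantic-guided in SemMAE).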
Self-supervised Learning for Videos
No. | Model Name | Title | Links | Pub. | Organization | Release Time |
---|---|---|---|---|---|---|
1 | VideoMAE | VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training | paper code | arXiv | Tencent AI Lab | 23 Mar 2022 |
2 | MAE in Video | Masked Autoencoders As Spatiotemporal Learners | paper | arXiv | Meta | 18 May 2022 |
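The two video entries keep the same reconstruction objective but raise the mask ratio to around 90%, since video is far more redundant than images; VideoMAE additionally samples one spatial mask and repeats it over time ("tube" masking), so a masked patch stays hidden in every frame and motion cannot leak the answer. A rough sketch of that idea (the helper name and shapes are our own assumptions):

```python
import numpy as np

def tube_masking(n_frames: int, n_patches: int, mask_ratio: float = 0.9, seed: int = 0):
    """VideoMAE-style tube masking: one spatial mask, repeated across
    all frames, so a masked patch is hidden for the whole clip."""
    rng = np.random.default_rng(seed)
    n_keep = int(n_patches * (1 - mask_ratio))
    masked = np.ones(n_patches, dtype=bool)               # True = hidden
    masked[rng.permutation(n_patches)[:n_keep]] = False   # False = visible
    return np.tile(masked, (n_frames, 1))                 # (n_frames, n_patches)

mask = tube_masking(n_frames=16, n_patches=196)
print(mask.shape, round(float(mask.mean()), 2))  # (16, 196), ~0.9 of tokens hidden
```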
Self-supervised Learning for Audio
No. | Model Name | Title | Links | Pub. | Organization | Release Time |
---|---|---|---|---|---|---|
1 | AudioMAE | Masked Autoencoders that Listen | paper code | arXiv | Meta AI | 13 Jul 2022 |
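AudioMAE carries the same recipe over to sound by treating the log-mel spectrogram as a one-channel image and patchifying it before masking. A toy patchify using the 1024-frame x 128-mel-bin input shape reported in the paper (the surrounding preprocessing here is an assumption, not the paper's exact pipeline):

```python
import numpy as np

# treat a log-mel spectrogram as a 1-channel image: split the (time, mel)
# plane into 16x16 patches, then mask exactly as in the image case above
spec = np.random.randn(1024, 128)        # (time frames, mel bins), ~10 s of audio
p = 16                                   # patch size
patches = (spec.reshape(1024 // p, p, 128 // p, p)
               .transpose(0, 2, 1, 3)    # group the two patch-grid axes
               .reshape(-1, p * p))      # (512, 256) patch tokens
print(patches.shape)
```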
Surveys in Self-supervised Learning
No. | Title | Links | Pub. | Organization | Release Time |
---|---|---|---|---|---|
1 | A Survey on Masked Autoencoder for Self-supervised Learning in Vision and Beyond | paper | arXiv | KAIST | 30 Jul 2022 |