
## Image Classification

| No. | Model Name | Title | Links | Pub. | Organization | Release Time |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | ViT | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | paper code | ICLR 2021 | Google Brain | 22 Oct 2020 |
| 2 | LeViT | LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference | paper code | arXiv | Facebook | 2 Apr 2021 |
| 3 | Swin Transformer | Swin Transformer: Hierarchical Vision Transformer using Shifted Windows | paper code | arXiv | MSRA | 25 Mar 2021 |
| 4 | DeiT | Training data-efficient image transformers & distillation through attention | paper code | arXiv | Facebook AI | 15 Jan 2021 |
| 5 | Pyramid Vision Transformer | Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions | paper code | arXiv | Nanjing University of Science and Technology | 24 Feb 2021 |
| 6 | TNT | Transformer in Transformer | paper code | arXiv | Noah's Ark Lab | 27 Feb 2021 |
| 7 | PiT | Rethinking Spatial Dimensions of Vision Transformers | paper code | arXiv | NAVER AI Lab | 30 Mar 2021 |
| 8 | T2T-ViT | Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet | paper code | arXiv | NUS | 22 Mar 2021 |
| 9 | CPVT | Conditional Positional Encodings for Vision Transformers | paper code | arXiv | Meituan Inc. | 18 Mar 2021 |
| 10 | ViL | Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding | paper | arXiv | Microsoft Corporation | 29 Mar 2021 |
| 11 | CoaT | Co-Scale Conv-Attentional Image Transformer | paper code | arXiv | University of California San Diego | 13 Apr 2021 |
| 12 | - | Visual Transformer Pruning | paper | arXiv | Zhejiang University | 17 Apr 2021 |
| 13 | M2TR | M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection | paper | arXiv | Fudan University | 21 Apr 2021 |
| 14 | Visformer | Visformer: The Vision-friendly Transformer | paper code | arXiv | Beihang University | 26 Apr 2021 |
| 15 | ConTNet | ConTNet: Why not use convolution and transformer at the same time? | paper code | arXiv | ByteDance AI Lab | 27 Apr 2021 |
| 16 | Twins-SVT | Twins: Revisiting the Design of Spatial Attention in Vision Transformers | paper code | arXiv | Meituan Inc. | 28 Apr 2021 |
| 17 | CoAtNet | CoAtNet: Marrying Convolution and Attention for All Data Sizes | paper | arXiv | Google Brain | 9 Jun 2021 |
| 18 | Focal Transformer | Focal Self-attention for Local-Global Interactions in Vision Transformers | paper | arXiv | Microsoft Research at Redmond | 1 Jul 2021 |
| 19 | BEiT | BEiT: BERT Pre-Training of Image Transformers | paper | arXiv | Microsoft | 15 Jun 2021 |
| 20 | ViT-G | Scaling Vision Transformers | paper | arXiv | Google Brain | 8 Jun 2021 |
| 21 | - | Efficient Training of Visual Transformers with Small-Size Datasets | paper | arXiv | FBK | 7 Jun 2021 |
| 22 | PS-ViT | Vision Transformer with Progressive Sampling | paper code | arXiv | Centre for Perceptual and Interactive Intelligence | 3 Aug 2021 |
| 23 | MAE | Masked Autoencoders Are Scalable Vision Learners | paper | arXiv | Facebook FAIR | 11 Nov 2021 |
| 24 | Evo-ViT | Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer | paper | AAAI 2022 | Chinese Academy of Sciences | 6 Dec 2021 |
| 25 | ATS | ATS: Adaptive Token Sampling For Efficient Vision Transformers | paper | arXiv | Microsoft | 30 Nov 2021 |
| 26 | AdaViT | AdaViT: Adaptive Vision Transformers for Efficient Image Recognition | paper | arXiv | Fudan University | 30 Nov 2021 |
| 27 | PeCo | PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers | paper code | arXiv | University of Science and Technology of China | 24 Nov 2021 |
| 28 | DAT | Vision Transformer with Deformable Attention | paper code | arXiv | Tsinghua University | 3 Jan 2022 |
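Most models in this table build on the tokenization idea introduced by ViT ("An Image is Worth 16x16 Words"): an image is split into fixed-size patches, and each flattened patch is linearly projected into a token embedding before entering a standard Transformer encoder. The following is a minimal sketch of that step only, not code from any listed repository; the `patchify` function name and the NumPy-based linear projection are illustrative assumptions.

```python
import numpy as np

def patchify(image, patch_size=16):
    """Split an (H, W, C) image into flattened patches of length patch_size*patch_size*C.

    Illustrative helper (not from any listed repo): this is the "16x16 words" step of ViT.
    """
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0, "image dims must divide evenly"
    return (
        image.reshape(h // patch_size, patch_size, w // patch_size, patch_size, c)
             .transpose(0, 2, 1, 3, 4)          # group the two spatial patch axes together
             .reshape(-1, patch_size * patch_size * c)
    )

rng = np.random.default_rng(0)
img = rng.standard_normal((224, 224, 3))        # ImageNet-sized input
proj = rng.standard_normal((16 * 16 * 3, 768))  # stand-in for a learned linear projection
tokens = patchify(img) @ proj                    # one 768-d token per 16x16 patch
print(tokens.shape)                              # (196, 768): a 14x14 grid of patch tokens
```

In the full model, a learnable class token and positional embeddings are added to this token sequence; many later entries in the table (Swin, PVT, T2T-ViT) revisit exactly this tokenization step with hierarchical or overlapping alternatives.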

## Visual Relationship Detection

| No. | Model Name | Title | Links | Pub. | Organization | Release Time |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | RelTransformer | RelTransformer: Balancing the Visual Relationship Detection from Local Context, Scene and Memory | paper code | arXiv | KAUST | 24 Apr 2021 |

## Object Tracking

| No. | Model Name | Title | Links | Pub. | Organization | Release Time |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | MOTR | MOTR: End-to-End Multiple-Object Tracking with Transformer | paper code | arXiv | MEGVII Technology | 7 May 2021 |