This is the official repository accompanying the CVPR Workshop paper:
R. Hesse, S. Schaub-Meyer, and S. Roth. Content-Adaptive Downsampling in Convolutional Neural Networks. CVPRW, The 6th Efficient Deep Learning for Computer Vision (ECV) Workshop, 2023.
Paper | Preprint (arXiv) | Video | Poster | Supplemental
Model | mIoU Cityscapes | Download |
---|---|---|
ResNet101+DeepLabv3 (OS=16) | 0.762 | best_deeplabv3_resnet101_cityscapes_os16_seed1.pth |
ResNet101+DeepLabv3 (OS=8) | 0.776 | best_deeplabv3_resnet101_cityscapes_os8_seed1.pth |
ResNet101+DeepLabv3 edge (OS=8->16) | 0.773 | best_deeplabv3_batch_ap_resnet101_cityscapes_os8_modeedges_os16till8_seed2_trimapwidth11_threshold0.15.pth |
ResNet101+DeepLabv3 learned (OS=8->16) | 0.775 | best_deeplabv3_ad_resnet101_cityscapes_modeend2end_seed0_default_tau1.0_lowresactive0.5_w_downsample_shared_andbatchnorm_shared.pth |
Specify the model architecture with '--model ARCH_NAME' and set the output stride using '--output_stride OUTPUT_STRIDE'. We here show example runs for ResNet101+DeepLabv3.
Current channels:
- https://conda.anaconda.org/conda-forge/linux-64
- https://conda.anaconda.org/conda-forge/noarch
- https://conda.anaconda.org/pypi/linux-64
- https://conda.anaconda.org/pypi/noarch
- https://conda.anaconda.org/anaconda/linux-64
- https://conda.anaconda.org/anaconda/noarch
- https://conda.anaconda.org/pytorch/linux-64
- https://conda.anaconda.org/pytorch/noarch
- https://repo.anaconda.com/pkgs/main/linux-64
- https://repo.anaconda.com/pkgs/main/noarch
- https://repo.anaconda.com/pkgs/r/linux-64
- https://repo.anaconda.com/pkgs/r/noarch
conda create --name adaptive_downsampling --file requirements.txt conda activate adaptive_downsampling
/datasets
/cityscapes
/gtFine
/leftImg8bit
For baseline models in Sec 4.2:
python main.py --model deeplabv3_resnet101 --dataset cityscapes --gpu_id 0 --lr 0.1 --crop_size 768 --batch_size 8 --output_stride 16 --data_root /datasets/cityscapes --random_seed 0
python main.py --model deeplabv3_resnet101 --dataset cityscapes --gpu_id 0 --lr 0.1 --crop_size 768 --batch_size 8 --output_stride 8 --data_root /datasets/cityscapes --random_seed 0
For content-adaptive downsampling models in Sec 4.2:
Adaptive downsampling with edge mask:
python main.py --model deeplabv3_batch_ap_resnet101 --dataset cityscapes --gpu_id 0 --lr 0.1 --crop_size 768 --batch_size 8 --output_stride 8 --data_root /datasets/cityscapes --trimap_width 11 --pooling_mask_mode edges_os16till8 --pooling_mask_edge_detection_treshold [0.15, 0.35, 0.95] --random_seed 0 --exp_name trimapwidth11_threshold[0.15, 0.35, 0.95]
Adaptive downsampling with learned mask:
python main_e2e_train.py --model deeplabv3_ad_resnet101 --dataset cityscapes --gpu_id 0 --lr 0.1 --crop_size 768 --batch_size 8 --data_root /datasets/cityscapes --random_seed 0 --exp_name default_tau1.0_lowresactive0.5_w_downsample_shared_andbatchnorm_shared --val_interval 100 --tau 1 --low_res_active 0.5
For evaluation:
python main_e2e_eval.py --model deeplabv3_ad_resnet101 --dataset cityscapes --gpu_id 0 --crop_size 768 --data_root /datasets/cityscapes --random_seed 0 --tau 1 --ckpt ./best_deeplabv3_ad_resnet101_cityscapes_modeend2end_seed0_default_tau1.0_lowresactive0.5_w_downsample_shared_andbatchnorm_shared.pth
To evaluate your models run the respective training call (main.py) with the parameters --test_only
and --ckpt
.
Regular downsampling
python main_flops.py --model deeplabv3_resnet101 --dataset cityscapes --gpu_id 0 --output_stride [8,16] --data_root /datasets/cityscapes
Adaptive downsampling edge mask
python main_flops.py --model deeplabv3_ap_resnet101 --dataset cityscapes --gpu_id 0 --output_stride 8 --output_stride_from_trained 8 --data_root /datasets/cityscapes --pooling_mask_mode edges_os16till8 --trimap_width 11 --pooling_mask_edge_detection_treshold [0.15, 0.35, 0.95]
Adaptive downsampling learned mask
python main_e2e_flops.py --model deeplabv3_ad_resnet101 --dataset cityscapes --gpu_id 0 --crop_size 768 --data_root /datasets/cityscapes --random_seed 0 --ckpt ./best_deeplabv3_ad_resnet101_cityscapes_modeend2end_seed0_default_tau1.0_lowresactive0.5_w_downsample_shared_andbatchnorm_shared.pth
This code is built on top of the official implementation of the following paper:
"D2-Net: A Trainable CNN for Joint Detection and Description of Local Features".
M. Dusmanu, I. Rocco, T. Pajdla, M. Pollefeys, J. Sivic, A. Torii, and T. Sattler. CVPR 2019.
For instruction on downloading the dataset please see the 'hpatches_sequences' folder
The model weights can be downloaded by running:
mkdir models
wget https://dusmanu.com/files/d2-net/d2_tf.pth -O models/d2_tf.pth
see ../segmentation
additionally install opencv: pip install opencv-python
extract_features.py
can be used to extract D2 features for a given list of images.
Regular downsampling:
python extract_features.py --gpu_id 0 --image_list_file image_list_hpatches_sequences.txt --model_file models/d2_tf.pth --output_extension .sift_d2net_os[1,2,4,8]_512kpts --output_stride [1,2,4,8] --nr_keypoints 512
Adaptive downsampling (example for dilations 25 51 51):
python extract_features.py --gpu_id 0 --image_list_file image_list_hpatches_sequences.txt --model_file models/d2_tf.pth --output_extension .sift_apd2net_os1_512kpts_dils_25_51_51 --output_stride 1 --nr_keypoints 512 --des APD2Net --dilations 25 51 51
Adaptive downsampling (example for dilations 0 0 31):
python extract_features.py --gpu_id 0 --image_list_file image_list_hpatches_sequences.txt --model_file models/d2_tf.pth --output_extension .sift_apd2net_os4_512kpts_dils_0_0_31 --output_stride 4 --nr_keypoints 512 --des APD2Net --dilations 31
After extracting features, they can be evaluated by running hpatches_sequences/HPatches-Sequences-Matching-Benchmark.ipynb (add the methods that you want to evaluate)
Regular downsampling:
python eval_flops.py --gpu_id 0 --image_list_file image_list_hpatches_sequences.txt --output_stride [1,2,4,8] --nr_keypoints 512
Adaptive downsampling (example for dilations 25 51 51):
python eval_flops.py --gpu_id 0 --image_list_file image_list_hpatches_sequences.txt --output_stride 1 --nr_keypoints 512 --des APD2Net --dilations 25 51 51
Adaptive downsampling (example for dilations 0 0 31):
python eval_flops.py --gpu_id 0 --image_list_file image_list_hpatches_sequences.txt --output_stride 4 --nr_keypoints 512 --des APD2Net --dilations 0 0 31
We would like to thank the contributors of the following repositories for using parts of their publicly available code:
If you find our work helpful please consider citing
@inproceedings{Hesse:2023:CAD,
title = {Content-Adaptive Downsampling in Convolutional Neural Networks},
author = {Hesse, Robin and Schaub-Meyer, Simone and Roth, Stefan},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), The 6$^\text{th}$ Efficient Deep Learning for Computer Vision (ECV) Workshop},
year = {2023}
}