HuBMAP - Hacking the Human Vasculature

2nd place solution for HubMap 2023 Challenge hosted on Kaggle

This documentation outlines how to reproduce the 2nd place solution for HubMap - Hacking the Human Vasculature

Conda environment setup

conda create --name hubmap python=3.8 -y
conda activate hubmap

# install pytorch
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116

# install mmcv-full
pip install mmcv-full==1.6.0

pip install einops==0.6.1 timm==0.5.4 pandas PyYaml natsort 

# install staintools
conda install -c conda-forge python-spams
pip install staintools

git clone https://github.com/phamvanlinh143/HubMap_2023_2nd_Place_Solution.git
cd HubMap_2023_2nd_Place_Solution
pip install -r requirements.txt
pip install -v -e .

# install mmcls and albumentations==1.3.0 scipy==1.8.1
pip install mmcls==0.25.0 albumentations==1.3.0 scipy==1.8.1

Backbone References:

| Architecture | Backbone | Reference | Pretrained weight |
| --- | --- | --- | --- |
| Cascade Mask R-CNN | CoaT Small | CoaT | coat_small_pretrained |
| Cascade Mask R-CNN | Swin-T | Swin-Transformer | swin_t_pretrained |
| Cascade Mask R-CNN | ConvNeXt-T | ConvNeXt | convnext_t_pretrained |
| Cascade Mask R-CNN | ConvNeXt-S | ConvNeXt | convnext_s_pretrained |

Download the pretrained weights and save them to the HubMap_2023_2nd_Place_Solution/pretrained_weights folder.

Data Preparation

Please download the following datasets from Kaggle:

kaggle competitions download -c hubmap-hacking-the-human-vasculature
kaggle datasets download -d phamvanlinh143/hubmap-train-9tiles-crop128
kaggle datasets download -d phamvanlinh143/hubmap-stain-augs
kaggle datasets download -d phamvanlinh143/hubmap-stain-9tiles-augs
kaggle datasets download -d phamvanlinh143/hubmap-coco-datasets

Download the datasets and extract them into the HubMap_2023_2nd_Place_Solution/datasets folder.
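
The Kaggle CLI normally delivers each download as a `.zip` archive. A minimal sketch of unpacking them, assuming all archives sit in one download folder; the function name `extract_archives` is hypothetical, and whether each archive unpacks directly to the folder names listed in the directory structure is not confirmed here, so some renaming may be needed afterwards:

```python
import zipfile
from pathlib import Path


def extract_archives(download_dir, datasets_dir):
    """Extract every .zip found in download_dir into datasets_dir.

    Returns the names of the archives that were extracted.
    """
    download_dir = Path(download_dir)
    datasets_dir = Path(datasets_dir)
    datasets_dir.mkdir(parents=True, exist_ok=True)

    extracted = []
    for archive in sorted(download_dir.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(datasets_dir)
        extracted.append(archive.name)
    return extracted
```

For example, `extract_archives(".", "HubMap_2023_2nd_Place_Solution/datasets")` would unpack the archives from the current directory.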

The directory structure should be as follows.

HubMap_2023_2nd_Place_Solution
├── pretrained_weights
│   ├── cascade_mask_rcnn_coat_small_mstrain_480-800_giou_4conv1f_adamw_3x_coco_4f7a069e.pth
│   ├── cascade_mask_rcnn_convnext-s_p4_w7_fpn_giou_4conv1f_fp16_ms-crop_3x_coco_20220510_201004-3d24f5a4.pth
│   ├── cascade_mask_rcnn_convnext-t_p4_w7_fpn_giou_4conv1f_fp16_ms-crop_3x_coco_20220509_204200-8f07c40b.pth
│   └── cascade_mask_rcnn_swin_tiny_patch4_window7.pth
├── datasets
│   ├── hm_1cls                     # extracted phamvanlinh143/hubmap-coco-datasets dataset
│   ├── hm_9tiles_crop128_1cls      # extracted phamvanlinh143/hubmap-coco-datasets dataset
│   ├── stain_9tiles_augs           # extracted phamvanlinh143/hubmap-stain-9tiles-augs dataset
│   ├── stain_augs                  # extracted phamvanlinh143/hubmap-stain-augs dataset
│   ├── test                        # extracted hubmap-hacking-the-human-vasculature dataset
│   ├── train                       # extracted hubmap-hacking-the-human-vasculature dataset
│   ├── train_9tiles_crop128        # extracted phamvanlinh143/hubmap-train-9tiles-crop128 dataset
│   ├── cleaned_polygons.jsonl      # ref: https://www.kaggle.com/code/fnands/de-duplicate-labels
│   ├── polygons.jsonl              # extracted hubmap-hacking-the-human-vasculature dataset
│   ├── tile_meta.csv               # extracted hubmap-hacking-the-human-vasculature dataset
│   └── wsi_meta.csv                # extracted hubmap-hacking-the-human-vasculature dataset 
└── other folders (forked from mmdetection)

Alternatively, you can generate the derived folders in datasets yourself; downloading the phamvanlinh143/* datasets is not necessary.

Requirement: the datasets directory must contain at least the following.

HubMap_2023_2nd_Place_Solution
└── datasets
    ├── test                        # extracted hubmap-hacking-the-human-vasculature dataset
    ├── train                       # extracted hubmap-hacking-the-human-vasculature dataset
    ├── cleaned_polygons.jsonl      # ref: https://www.kaggle.com/code/fnands/de-duplicate-labels
    ├── polygons.jsonl              # extracted hubmap-hacking-the-human-vasculature dataset
    ├── tile_meta.csv               # extracted hubmap-hacking-the-human-vasculature dataset
    └── wsi_meta.csv                # extracted hubmap-hacking-the-human-vasculature dataset
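
Before running the generation scripts, it can help to confirm that this minimal layout is in place. A small sketch, assuming it is run from the parent of `datasets`; the helper name `missing_dataset_entries` is hypothetical and not part of the repository:

```python
from pathlib import Path

# Entries taken from the minimal layout listed above.
REQUIRED_ENTRIES = [
    "test",
    "train",
    "cleaned_polygons.jsonl",
    "polygons.jsonl",
    "tile_meta.csv",
    "wsi_meta.csv",
]


def missing_dataset_entries(repo_root="."):
    """Return the required entries that are absent from <repo_root>/datasets."""
    datasets = Path(repo_root) / "datasets"
    return [name for name in REQUIRED_ENTRIES if not (datasets / name).exists()]
```

An empty list from `missing_dataset_entries("HubMap_2023_2nd_Place_Solution")` means every required entry was found.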

How to create `hm_1cls` (COCO Format) and `stain_augs`

    cd hubmap_dataprocessing/
    
    # (optional) create dataset_splits/
    # run notebook data_prepare.ipynb

    # create hm_1cls
    python coco_gen_only_tiles_1cls.py

    # Note: refs_stain.csv is filtered from tile_meta.csv (ignore dataset 1)
    # create stain_augs
    python gen_stain_only_tiles.py # this takes a long time

    cd ..
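
For readers unfamiliar with COCO format, the sketch below shows what a single-class conversion of one polygon record might look like. It is not the repository's `coco_gen_only_tiles_1cls.py`. The record layout (`annotations`, `type`, `coordinates`), the choice to keep only `blood_vessel` polygons, and the function names are all assumptions made for illustration:

```python
def polygon_area(points):
    """Shoelace formula for a polygon given as [[x, y], ...]."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def record_to_coco(record, image_id, first_ann_id=1, keep_type="blood_vessel"):
    """Turn one polygon record into a list of COCO annotation dicts.

    Assumes record["annotations"] is a list of dicts with a "type" string and
    "coordinates" holding one outer ring of [x, y] points.
    """
    annotations = []
    ann_id = first_ann_id
    for ann in record["annotations"]:
        if ann["type"] != keep_type:
            continue  # single-class dataset: skip every other type
        points = ann["coordinates"][0]
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        annotations.append({
            "id": ann_id,
            "image_id": image_id,
            "category_id": 1,  # the only category
            "segmentation": [[float(v) for p in points for v in p]],
            "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
            "area": polygon_area(points),
            "iscrowd": 0,
        })
        ann_id += 1
    return annotations
```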

How to create `train_9tiles_crop128`, `hm_9tiles_crop128_1cls` (COCO Format), and `stain_9tiles_augs`

    # create train_9tiles_crop128
    cd hubmap_dataprocessing/
    python merge_9tiles.py
    python crop128_9tiles.py

    # create hm_9tiles_crop128_1cls
    python coco_gen_9tiles_crop128_1cls.py

    # create stain_9tiles_augs
    python gen_stain_9tiles.py # this takes a long time

    # remove anno_9tiles and annos_9tiles_crop128
    rm -rf anno_9tiles
    rm -rf annos_9tiles_crop128
    # remove train_9tiles folder in datasets
    rm -rf ../datasets/train_9tiles
    
    cd ..

Models Training:

Once all the datasets are downloaded and unzipped, you can train each model with the following steps.

Notes on model configs:
    - *_pt.py : pretraining config
    - *_ft.py : finetuning config
Notes on naming:
    - only_tiles     : default dataset from the competition (tiles of shape 512x512)
    - 9tiles_crop128 : each tile is merged with its 8 surrounding tiles, then padded and cropped to keep 128 pixels around the original tile (tiles of shape 768x768)
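
The 768x768 size follows from 512 + 2 × 128. The sketch below illustrates the merge-and-crop idea on plain 2D lists rather than images. It is not the repository's `merge_9tiles.py` or `crop128_9tiles.py`; the function name `merge_and_crop` is hypothetical, and filling missing neighbours with zeros is an assumption about how padding is done:

```python
def merge_and_crop(tiles, tile_size=512, margin=128, fill=0):
    """Merge a 3x3 neighbourhood of tiles and crop around the centre tile.

    tiles maps (row_offset, col_offset), each in {-1, 0, 1}, to a 2D list of
    tile_size x tile_size values; (0, 0) is the original tile. Neighbours that
    are absent are padded with `fill` (an assumption).
    """
    side = 3 * tile_size
    mosaic = [[fill] * side for _ in range(side)]
    for (dr, dc), tile in tiles.items():
        top = (dr + 1) * tile_size
        left = (dc + 1) * tile_size
        for r in range(tile_size):
            mosaic[top + r][left:left + tile_size] = tile[r]

    # Keep the centre tile plus `margin` pixels on every side.
    start = tile_size - margin
    stop = 2 * tile_size + margin
    return [row[start:stop] for row in mosaic[start:stop]]
```

With the default sizes the output has 512 + 2 × 128 = 768 rows and columns, with the original tile in the middle.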

Training only_tiles models (4 backbones: CoaT-Small, Swin-T, ConvNeXt-T, ConvNeXt-S)

Example: Training fold 0 Swin-T
- Step 1: Pretraining for fold 0
```
CUDA_VISIBLE_DEVICES=0 python tools/train.py hubmap_configs/only_tiles/swin_t/cascade_mask_rcnn_swin_t_1cls_ds1_w1l_pt.py
```
- Step 2: Run SWA on the pretraining checkpoints (saved in the workdir workdirs/only_tiles/swin_t/ds1_w1l_pt/)
```
python do_swa.py --workdir workdirs/only_tiles/swin_t/ds1_w1l_pt/
```
- Step 3: Finetune for fold 0 (remember to verify the pretrained checkpoint produced by SWA in step 2)
```
CUDA_VISIBLE_DEVICES=0 python tools/train.py hubmap_configs/only_tiles/swin_t/cascade_mask_rcnn_swin_t_1cls_ds1_w1l_ft.py
```
- Step 4: Run SWA on the finetuning checkpoints (saved in the workdir workdirs/only_tiles/swin_t/ds1_w1l_ft/)
```
python do_swa.py --workdir workdirs/only_tiles/swin_t/ds1_w1l_ft/
```
  Final weight of Swin-T fold 0: workdirs/only_tiles/swin_t/ds1_w1l_ft/swa_last.pth
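
SWA (stochastic weight averaging) combines several checkpoints by averaging their parameters. The internals of `do_swa.py`, including which checkpoints it selects, are not shown in this README, so the sketch below only illustrates the general idea. Plain dicts of float lists stand in for real state dicts, and `average_state_dicts` is a hypothetical name:

```python
def average_state_dicts(state_dicts):
    """Element-wise mean of checkpoints given as {param_name: [floats]} dicts."""
    if not state_dicts:
        raise ValueError("need at least one checkpoint")
    n = len(state_dicts)
    averaged = {}
    for name in state_dicts[0]:
        # Group the i-th value of this parameter across all checkpoints.
        columns = zip(*(sd[name] for sd in state_dicts))
        averaged[name] = [sum(values) / n for values in columns]
    return averaged
```
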

Training 9tiles_crop128 models (2 backbones: CoaT-Small, Swin-T)

Example: Training fold 1 Coat-Small
- Step 1: Pretraining for fold 1
```
CUDA_VISIBLE_DEVICES=0 python tools/train.py hubmap_configs/9tiles_crop128/coat_small/cascade_mask_rcnn_coat_small_1cls_crop128_ds1_w1r_pt.py
```
- Step 2: Run SWA on the pretraining checkpoints (saved in the workdir workdirs/9tiles_crop128/coat_small/ds1_w1r_pt/)
```
python do_swa.py --workdir workdirs/9tiles_crop128/coat_small/ds1_w1r_pt/
```
- Step 3: Finetune for fold 1 (remember to verify the pretrained checkpoint produced by SWA in step 2)
```
CUDA_VISIBLE_DEVICES=0 python tools/train.py hubmap_configs/9tiles_crop128/coat_small/cascade_mask_rcnn_coat_small_1cls_crop128_ds1_w1r_ft.py
```
- Step 4: Run SWA on the finetuning checkpoints (saved in the workdir workdirs/9tiles_crop128/coat_small/ds1_w1r_ft/)
```
python do_swa.py --workdir workdirs/9tiles_crop128/coat_small/ds1_w1r_ft/
```
  Final weight of CoaT-Small fold 1: workdirs/9tiles_crop128/coat_small/ds1_w1r_ft/swa_last.pth
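
Both examples follow the same four-step pattern, differing only in the config and workdir paths. A small sketch that generates the commands for one fold, assuming the `_pt`/`_ft` suffix convention shown above holds for the other configs as well; `training_commands` is a hypothetical helper, not part of the repository:

```python
def training_commands(config_stem, workdir_stem, gpu=0):
    """Return the four shell commands for one fold: pretrain, SWA, finetune, SWA.

    config_stem is the config path without the "_pt.py" / "_ft.py" suffix;
    workdir_stem is the workdir path without the "_pt/" / "_ft/" suffix.
    """
    commands = []
    for stage in ("pt", "ft"):
        commands.append(
            f"CUDA_VISIBLE_DEVICES={gpu} python tools/train.py {config_stem}_{stage}.py"
        )
        commands.append(f"python do_swa.py --workdir {workdir_stem}_{stage}/")
    return commands
```

The helper only returns the command strings rather than executing them, so the pretrained checkpoint can still be checked by hand before the finetuning step is started.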

Inference

The inference and ensemble code can be found here.

References

https://github.com/open-mmlab/mmdetection
