This repo is the official project repository of the paper [SFPNet] (ECCV 2024) and dataset [S.MID].
SFPNet: For the backbone of LiDAR segmentation task, the three main challenges it needs to address are sparsity, large scale, and non-uniform changes in point cloud density. Previous works introduced inductive bias (special partition or special window and position encoding) to a single type of LiDAR (usually mechanical spinning LiDAR) to solve the above three challenges . This can limit model generalizability to other kinds of LiDAR and make hyperparameter tuning more complex. Therefore, we propose the pipeline SFPNet with favorable properties (See more in our paper) to address the three main challenges in various type of LiDAR point cloud data while avoiding the introduction of special inductive bias. Our core module SFPM can replace window transformer block.
Comparison of field of view: Comparison of cumulative point clouds:
S.MID: We used an industrial robot (equipped with Livox Mid-360) to collect a total of 38904 frames of hybrid-solid LiDAR data in different substations. We annotated 25 categories under professional guidance and merged them into 14 classes for single-frame segmentation task.
Examples of labeled cumulative point clouds in S.MID:
The S.MID dataset is published under CC BY-NC-SA 4.0 license, which means that anyone can use this dataset for non-commercial research purposes.
You can quickly get an overview of our work through a short video.
Download and create environment.
git clone https://github.com/Cavendish518/SFPNet.git
conda create --name sfpn python=3.9
conda activate sfpn
Install dependencies.
pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install torch_scatter==2.0.9
pip install torch_geometric==1.7.2
pip install spconv-cu114==2.1.21
pip install torch_sparse==0.6.12 cumm-cu114==0.2.8 torch_cluster==1.5.9
pip install tensorboard timm termcolor tensorboardX
pip install numpy==1.25.0
Note: The versions of some libraries are not very strict and can be adjusted according to the requirements file.
Demo for S.MID:
python demo_SMID.py --config config/demo/Mid_demo.yaml
Model weights library
dataset | Val mIoU (tta) | Download |
---|---|---|
S.MID_v1.3 | 72.8 (new baseline) | S.MID baseline |
S.MID_beta_v1.2 | 71.9 (original paper version) | S.MID Weight2 |
nuScenes | 80.1 | nuScenes Weight |
SemanticKITTI | 69.2 | SemanticKITTI Weight |
If your are using a newer GPU, please refer to issue#4.
To facilitate the use of LiDAR semantic segmentation in downstream tasks, we provide ROS tools.
Please note that our work focuses on the representational capabilities of the network design itself. Therefore, related powerful training techniques (novel data augmentation, post-processing and distillation) have not been used in the experiments. But these are important parts of research. So we encourage you to try related techniques. We will be happy if you train better model weights. If you are willing to make the model weights and codes public, please contact us.
Download S.MID from [S.MID]. Details about dataset can be found in the webpage and our paper. An optimized and denoised version v1.3 has been released (see issue#1).
For constructing the code for dataloader, you can refer to the implementation in the demo.
Additional dependencies for dataset tools.
pip install matplotlib
pip install open3d
Statistics of label distribution.
python Mid_label_distribution.py /path_to_dataset/SMID_v1_3/sequences/01
Visualization of ground truth (single frame).
python Mid_vis_cloud.py --vis_type ground_truth --pt_path /path_to_dataset/SMID_v1_3/sequences/01/hybrid/001000.bin --gt_path /path_to_dataset/SMID_v1_3/sequences/01/labels/001000.label
Visualization of prediction (your prediction in ground truth format). You can modify the code to fit your storage format.
python Mid_vis_cloud.py --vis_type prediction --pt_path /path_to_dataset/SMID_v1_3/sequences/01/hybrid/001000.bin --pd_path /path_to_prediction/001000.label
Visualization of difference (your prediction in ground truth format). You can modify the parameters for better view.
python Mid_vis_cloud.py --vis_type difference --pt_path /path_to_dataset/SMID_v1_3/sequences/01/hybrid/001000.bin --gt_path /path_to_dataset/SMID_v1_3/sequences/01/labels/001000.label --pd_path /path_to_prediction/001000.label
- Release model code of SFPNet;
- Release S.MID and instruction in our dataset webpage;
- Release code of Demo (validation);
- Release dataset tools;
- Release model weights;
- S.MID
- nuScenes
- SemanticKITTI
- Release code of training;
- Release ROS tools.
We would like to thank all the pioneers SemanticKITTI, FocalNet, SphereFormer, Cylinder3D and nuScenes.
If your like our paper, codes or dataset, please cite us and give this repo a star.
@inproceedings{wang2025sfpnet,
title={Sfpnet: Sparse focal point network for semantic segmentation on general lidar point clouds},
author={Wang, Yanbo and Zhao, Wentao and Cao, Chuan and Deng, Tianchen and Wang, Jingchuan and Chen, Weidong},
booktitle={European Conference on Computer Vision},
pages={403--421},
year={2025},
organization={Springer}
}