-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP][Feature] Support TransFusion #2547
base: dev-1.x
Are you sure you want to change the base?
Conversation
@wep21 Thanks for your awesome contribution, and we're excited to invite you to join core developer community of mmdet3d. Just add our assistant using the WeChat ID: openmmlabwx. When sending the friend request, remember to include the remark "mmsig + Github ID". We will invite you to MMDet3D developer group. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## dev-1.x #2547 +/- ##
========================================
Coverage 47.26% 47.26%
========================================
Files 277 277
Lines 23415 23415
Branches 3655 3655
========================================
Hits 11067 11067
Misses 11635 11635
Partials 713 713
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
|
||
|
||
@TASK_UTILS.register_module() | ||
class HeuristicAssigner3D(BaseAssigner): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could reuse the implementation in BEVFusion and del this file?
from ..BEVFusion.bevfusion.utils import HungarianAssigner3D
from ..BEVFusion.bevfusion.transformer import TransformerDecoderLayer # noqa F401
from ..BEVFusion.bevfusion.utils import TransFusionBBoxCoder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Is it better to merge these changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JingweiZhang12 I pushed the code based on the original repository. I'm currently working on using BEVFusion implementation here, but loss_bbox doesn't decrease compared to the original one.
I guess I made a mistake when integrating your code, so I appreciate it if you could give me some advice to resolve this issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wep21 Hi, the original BEVFusion has a potential bug: mit-han-lab/bevfusion#419. This will cause the loss_bbox can not decrease when both using lidar and camera modalities. We have fixed it in the latest branch. Also, the BEVfusion has been merged. You can pull the latest dev-1.x branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JingweiZhang12 I found out the changes from original code cause problem when training transfusion. I reverted the changes here and then loss_bbox and matched ious become sane.
Do you have any thoughts about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, this is because the voxelization output order of the BEVFusion is different from ours. BEVFusion: xyz, Ours: zyx. Here is easily manifested: https://github.com/open-mmlab/mmdetection3d/blob/main/projects/BEVFusion/bevfusion/sparse_encoder.py#L15
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Is it better to adapt the pillar/voxelization output of the TransFusion to the coordinate of BEVFusion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may not have an influence on the accuracy of the model. The style of the original Transfusion is the same as ours.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JingweiZhang12 I would like to use the same coordinate as original, so is it better to create another head implementation in transfusion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wep21 Yep, it's better to use the original head implements of TransFusioin to align the training precision easily since the BEVFusion head may modify something and have some potential incompatibilities.
Could you update your progress of implementation of TransFusion according to the bellow items:
|
fc37092
to
2ad5e4a
Compare
|
||
@MODELS.register_module() | ||
class TransFusionDetector(MVXTwoStageDetector): | ||
"""Base class of Multi-modality VoxelNet.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update this docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated at fd47bf7
# NOTE: fix | ||
draw_heatmap_gaussian(heatmap[gt_labels_3d[idx]], | ||
center_int[[1, 0]], radius) | ||
# draw_heatmap_gaussian(heatmap[gt_labels_3d[idx]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove lines 729-731 since it's unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_base_ = ['../../../configs/_base_/default_runtime.py'] | ||
custom_imports = dict( | ||
imports=['projects.TransFusion.transfusion'], allow_failed_imports=False) | ||
point_cloud_range = [-51.2, -51.2, -5.0, 51.2, 51.2, 3.0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename this file as transfusion_lidar_pillar02_second_secfpn_8xb2-cyclic-20e_nus-3d.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed config at 615cb1d.
backend_args = None | ||
|
||
model = dict( | ||
type='TransFusionDetector', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type='TransFusionDetector', | |
type='TransFusion', |
Remind to modify the corresponding name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename detector class name at 5608d19
@wep21 Hi, have you aligned the inference precision of TransFusion(only LiDAR modality)? |
004e436
to
18bf4de
Compare
I'm sorry, but not yet. I've tested with only nuscenes-mini. Currently, I don't have enough gpu environment, it may take a time to train and test with nuscenes full dataset. |
That's OK. Suggest you can make sure the results of one forward are the same as that of the official implementation firstly. |
fd47bf7
to
4ad9944
Compare
@wep21 Hi, any progress on TransFusion so far? |
Signed-off-by: wep21 <[email protected]>
Signed-off-by: wep21 <[email protected]>
Signed-off-by: wep21 <[email protected]>
Signed-off-by: wep21 <[email protected]>
Signed-off-by: wep21 <[email protected]>
Signed-off-by: wep21 <[email protected]>
Signed-off-by: wep21 <[email protected]>
Signed-off-by: wep21 <[email protected]>
This reverts commit 2ad5e4a.
Signed-off-by: wep21 <[email protected]>
Signed-off-by: wep21 <[email protected]>
Signed-off-by: wep21 <[email protected]>
4ad9944
to
79415f6
Compare
@sunjiahao1999 I'm sorry for late reply. This is the evaluation result with nuscenes dataset after 20 epoch training.
|
@wep21 Please fix the lint. |
Signed-off-by: wep21 <[email protected]>
@JingweiZhang12 I fixed lint at 2cb5f78. |
@JingweiZhang12 @sunjiahao1999 friendly ping |
fe25f7a
to
2cb5f78
Compare
I was looking for this transfusion code. Thank you. Just to double-check, are the 54.4 and 64.7 values in the table in the paper you posted (#2547 (comment)), 2nd row, left column, the ones you're comparing to (#2547 (comment))? In other words, is it correct that your experiment yielded higher numbers than the paper (just for pillar based, lidar-only)? Also, I know that the original Transfusion paper used a "fading strategy", and I was wondering if this was applied in your code or in your experiments. Thanks the great work! |
I just checked that the fading strategy is implemented as a custom hook. But what you implemented doesn't seem to use a 2-stage learning method: pretrain only the LiDAR model first and then further train this model on additional modality with camera input added. |
Thanks for your contribution and we appreciate it a lot. The following instructions would make your pull request more healthy and more easily get feedback. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.
Motivation
Please describe the motivation of this PR and the goal you want to achieve through this PR.
import TransFusion as project
Modification
Please briefly describe what modification is made in this PR.
add transfusion implementation
BC-breaking (Optional)
Does the modification introduce changes that break the back-compatibility of the downstream repos?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.
Use cases (Optional)
If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.
Checklist