
Train on your own 512 x 512 size image #32

Open
zts12 opened this issue Sep 21, 2024 · 12 comments

zts12 commented Sep 21, 2024

For 512×512 images, do I need to modify the config settings to train on my own dataset? (I am using the SpaceNet configuration.) Also, do you have any advice on the training batch? The results after 30 epochs are not ideal. Thank you very much for your work on the SAM-Road project; I look forward to your answer whenever your schedule allows.

htcr (Owner) commented Sep 24, 2024

I think the CityScale setup defaults to a 512×512 patch size. Can you try that? Batch size depends on your GPU memory; I would start with the largest batch size you can fit, then tune the learning rate properly to make sure training converges. It may need some trial and error.
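
(For illustration, a common heuristic when changing batch size is to scale the learning rate linearly with it; the base values below are hypothetical placeholders, not SAM-Road defaults.)

```python
# Linear LR scaling heuristic: grow the learning rate by the same
# factor as the batch size. Base values are assumed placeholders.
BASE_BATCH_SIZE = 16   # batch size the base LR was tuned for (assumed)
BASE_LR = 1e-4         # learning rate at BASE_BATCH_SIZE (assumed)

def scaled_lr(batch_size: int) -> float:
    """Scale the learning rate linearly with the actual batch size."""
    return BASE_LR * batch_size / BASE_BATCH_SIZE

print(scaled_lr(64))  # 4x the batch -> 4x the LR, i.e. 4e-04
```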

zts12 (Author) commented Sep 24, 2024

Thank you very much for your answer. This is the result of my current road extraction on my own 0.5 m-resolution imagery using the SpaceNet configuration, and it is not very good. Do you have any suggestions?
[bj000035_mask_road_mask.png]

zts12 (Author) commented Sep 24, 2024

[bj000036_mask_road_mask.png]
[bj000041_mask_road_mask.png]
[bj000046_mask_road_mask.png]
[bj000049_mask_road_mask.png]
[bj000056_mask_road_mask.png]

htcr (Owner) commented Sep 24, 2024

I think our released model takes 1.0 m/pixel images. Can you try resizing your images to that resolution? Also, have you fine-tuned on your own dataset? How large is your dataset?
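
(A minimal sketch of that resampling with Pillow; the file names are hypothetical. Downsampling a 0.5 m/pixel image by 2× yields 1.0 m/pixel.)

```python
from PIL import Image

# A 0.5 m/pixel tile downsampled 2x becomes 1.0 m/pixel: the same
# ground extent is covered by half as many pixels per side.
img = Image.open("bj000035.png")  # hypothetical 512x512 tile at 0.5 m/pixel
img_1m = img.resize((img.width // 2, img.height // 2), Image.BILINEAR)
img_1m.save("bj000035_1m.png")    # 256x256 at 1.0 m/pixel
```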

htcr (Owner) commented Sep 24, 2024

Also, did you correctly load the pre-trained SAM checkpoints?
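
(For reference, loading an original checkpoint with the upstream segment-anything package looks like the sketch below; the path is a placeholder, and this is the upstream API rather than SAM-Road's own loading code.)

```python
from segment_anything import sam_model_registry

# Load the ViT-B SAM backbone from the official checkpoint (download
# sam_vit_b_01ec64.pth from the segment-anything releases first).
sam = sam_model_registry["vit_b"](checkpoint="checkpoints/sam_vit_b_01ec64.pth")
print(sum(p.numel() for p in sam.parameters()))  # sanity check: roughly 94M
```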

zts12 (Author) commented Sep 24, 2024

Thank you again for your suggestions. I cropped my 0.5 m-resolution imagery into 512×512 tiles, loaded the pre-trained SAM checkpoint, adjusted the learning rate, and re-trained, but the results are far from the good results on the two datasets in the original paper. The dataset has 3,065 images in total: 2,453 for training, 459 for testing, and 153 for validation, split following the SpaceNet proportions; each image is 512×512 and has the corresponding graph data. If I convert the imagery to 1 m resolution, will the final result improve? The test results are as follows, and I am also unsure whether the keypoint and road thresholds should be modified:
======= Finding best thresholds ======
======= keypoint ======
Best threshold 0.01090240478515625, P=0.0 R=0.0 F1=nan
======= road ======
Best threshold 0.0965576171875, P=0.0 R=0.0 F1=nan
======= topo ======
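
(For context, F1 = 2PR/(P+R), so P = R = 0 gives 0/0 = nan. A threshold sweep along the lines of the hypothetical sketch below is what produces lines like these.)

```python
import numpy as np

# Hypothetical best-threshold sweep: scan thresholds over a predicted
# probability map and keep the one with the best F1 against the GT mask.
# When the model predicts nothing, P = R = 0 and F1 = 0/0 = nan, which
# is exactly what the log above shows.
def best_threshold(prob: np.ndarray, gt: np.ndarray, n_steps: int = 64):
    best_t, best_f1 = 0.0, -1.0
    for t in np.linspace(prob.min(), prob.max(), n_steps):
        pred = prob > t
        tp = float(np.logical_and(pred, gt).sum())
        p = tp / pred.sum() if pred.sum() else 0.0
        r = tp / gt.sum() if gt.sum() else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else float("nan")
        if f1 == f1 and f1 > best_f1:  # f1 == f1 filters out nan
            best_t, best_f1 = t, f1
    return best_t, best_f1
```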

htcr (Owner) commented Sep 24, 2024

Hi, if you are fine-tuning from the original SAM checkpoint (not the ones I released), resolution is less crucial. How do the images look in general? The numbers you show suggest the model did not converge at all. The dataset size sounds reasonable; can you try the following:

  1. Debug the label-generation logic. Do the GT masks look reasonable?
  2. See if the model can overfit just one example (a minimal sketch follows this list). If not, some hyperparameters are probably wrong.
  3. Try different batch sizes / learning rates.
  4. Apply some data augmentation. In the SAM-Road paper, random cropping and rotation were applied.
  5. Try zeroing individual loss terms to find which one is exploding.
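
(A sketch of the overfit-one-example sanity check from item 2; the tiny conv net and random data are placeholders, so swap in your real model and one real training example.)

```python
import torch
import torch.nn as nn

# Overfit a single (image, mask) pair: if the loss does not fall toward
# zero within a few hundred steps, the labels, loss wiring, or LR are
# likely broken. Model and data below are placeholders.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 1, 3, padding=1))
image = torch.rand(1, 3, 512, 512)                  # one fixed input tile
mask = (torch.rand(1, 1, 512, 512) > 0.9).float()   # its road GT mask

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(300):
    optimizer.zero_grad()
    loss = loss_fn(model(image), mask)
    loss.backward()
    optimizer.step()
    if step % 50 == 0:
        print(step, loss.item())  # should steadily decrease toward 0
```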

Good luck with your experiments!

zts12 (Author) commented Sep 24, 2024

Sorry for the late reply, and thank you for your suggestions; I will run the experiments accordingly. I am a graduate student, and my current research direction is road extraction from high-resolution remote sensing imagery. Thank you for the discussion. Could we connect on WeChat? My WeChat ID is 18837621961; I would be honored.

EchoQiHeng commented
I trained SAM on the DeepGlobe dataset and the results were convincing, so I believe SAM is robust. Please carefully check your code.

zts12 (Author) commented Oct 8, 2024

Thank you for sharing your work. I also used DeepGlobe for training and testing: I cropped it into 512×512 images, then trained and tested, but the result is not very good, and even clearly visible roads are extracted incompletely. Could I ask about your config settings and how you split the dataset? Or is there some other modification or configuration work that I have missed? Thanks for your answer.


EchoQiHeng commented

I have shown the visualization results on the DeepGlobe validation set, and I believe the model has converged and is functioning as expected. I did not make any DeepGlobe-specific configuration changes. Of course, modifications to the SatMapDataset were necessary; my process primarily involved cropping and augmentation. Please carefully check your RGB images and the corresponding GT masks.
[iou: IoU curve during training]
[pred: prediction visualization]
Additionally, I have shown the IoU during the training process. Please provide more details and results from your experiments to facilitate further debugging.
[rgb: input RGB image]
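
(For illustration, a paired crop-and-augment step of the kind a SatMapDataset modification might involve; this is an assumed sketch, not EchoQiHeng's actual code.)

```python
import random
import torch
import torchvision.transforms.functional as TF

# Assumed sketch: apply the SAME random crop and right-angle rotation to
# an RGB tile and its road mask so the pair stays pixel-aligned.
def crop_and_augment(image: torch.Tensor, mask: torch.Tensor, size: int = 512):
    # image: (3, H, W) float tensor; mask: (1, H, W) float tensor
    _, h, w = image.shape
    top = random.randint(0, h - size)
    left = random.randint(0, w - size)
    image = TF.crop(image, top, left, size, size)
    mask = TF.crop(mask, top, left, size, size)
    angle = random.choice([0, 90, 180, 270])  # lossless right-angle rotations
    return TF.rotate(image, angle), TF.rotate(mask, angle)
```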

immarshmellow commented

Hi, I would like to ask about the exact steps for training and testing with DeepGlobe. I don't understand much of this yet and would appreciate any help.
