Inference problem on the new trained weights #23

Open
zahrabsh74 opened this issue Oct 12, 2023 · 4 comments

Comments

@zahrabsh74

zahrabsh74 commented Oct 12, 2023

First, thank you for releasing your code. I wanted to fine-tune your model on my own dataset, but after fine-tuning finished I could not use the trained weights for testing and inference. I also could not retrain on your dataset and use the resulting weights. Although training finished successfully, main_test.py failed at inference with the following error.
ERROR:
Screenshot from 2023-10-12 19-45-32

For reference, these are the arguments I used to run the code.
For training, I ran:

CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node 2 main_train.py --savedir RNGDet --dataroot ./dataset/ --batch_size 8 --ROI_SIZE 128 --nepochs 50 --backbone resnet101 --eos_coef 0.2 --lr 9e-5 --lr_backbone 9e-5 --weight_decay 1e-5 --noise 8 --image_size 2048\
  --candidate_filter_threshold 30 --logit_threshold 0.75 --extract_candidate_threshold 0.55 --alignment_distance 5 --instance_seg --multi_scale --multi_GPU --frozen_weights pretrain_cityscale/RNGDet_multi_ins/RNGDetPP_best.pt

For testing, I ran:

CUDA_VISIBLE_DEVICES=0,1 python main_test.py --savedir RNGDet --image_size 2048\
 --dataroot ./dataset/ --ROI_SIZE 128 --backbone resnet101 --checkpoint_dir RNGDetNet_0.pt\
 --candidate_filter_threshold 30 --logit_threshold 0.7 --extract_candidate_threshold 0.7 --alignment_distance 5 --batch_size 8 \
 --instance_seg --multi_scale --process_boundary

I am kindly asking for your help to solve this problem.

@zahrabsh74
Author

I think the problem is in main_train.py or agent.py, because every epoch reports zero extracted_candidate_initial_vertices, which is odd. I am therefore looking into why no vertices are found in any epoch, even though your existing pre-trained checkpoints are able to extract initial_vertices in each image.

@TonyXuQAQ
Owner

Sorry for the late reply.

You may want to check the predicted segmentation map that is used to generate the candidate initial vertices, which is visualized here.

If no candidate initial vertices are predicted, it means the segmentation map is all zero. Since you are using data from your own dataset, this may be caused by differences in dataset properties. You may want to adjust some parameters so that the segmentation network works properly.
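A minimal sketch of that check, assuming the predicted segmentation map can be obtained as a 2D NumPy array of probabilities in [0, 1]. The function name `summarize_segmentation` and the `demo` array are hypothetical and are not part of the RNGDet code; the two thresholds are the `--extract_candidate_threshold` values from the commands above.

```python
import numpy as np


def summarize_segmentation(prob_map: np.ndarray, threshold: float) -> dict:
    """Report basic statistics for a predicted segmentation probability map.

    prob_map is assumed to be a 2D float array with values in [0, 1];
    how RNGDet actually stores its prediction may differ.
    """
    above = prob_map > threshold
    return {
        "max": float(prob_map.max()),
        "mean": float(prob_map.mean()),
        "fraction_above": float(above.mean()),
        "any_above": bool(above.any()),
    }


# Hypothetical stand-in for a network output: an empty map with one faint road.
demo = np.zeros((128, 128), dtype=np.float32)
demo[64, :] = 0.6

# The two --extract_candidate_threshold values used in this thread.
for thr in (0.55, 0.7):
    print(thr, summarize_segmentation(demo, thr))
```

If `any_above` is false at the threshold in use, no pixel qualifies as a candidate, which would be consistent with zero extracted initial vertices. A low `max` points to the segmentation output itself rather than to the threshold.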

@zahrabsh74
Author

zahrabsh74 commented Oct 19, 2023

Thanks for your reply. I figured out that the problem was the small number of images in the dataset: the model could not produce usable checkpoints from them. I had assumed that the size of the dataset did not matter much for fine-tuning, but maybe I was wrong.
Could you please give me a hint on how to fine-tune the model from the pretrained checkpoints that you have made publicly available?
Also, do you have any idea of the minimum number of images needed for fine-tuning?

@TonyXuQAQ
Owner

Sorry, RNGDet/RNGDet++ is trained with conventional supervised learning, which does not consider a pretrain-finetune pipeline. The provided checkpoints were not trained on enough data, so directly fine-tuning them may not produce satisfactory results. I think you might need to enlarge your dataset and train the network from scratch instead of fine-tuning on a small amount of data.
