
Inconsistent Evaluation Results on Slake1.0 and PathVQA Datasets #9

taindp98 opened this issue Sep 17, 2024 · 2 comments


taindp98 commented Sep 17, 2024

Hello,

I attempted to replicate the evaluation results reported in the paper on two datasets, Slake1.0 and PathVQA, using the data released at the URL in #6 (comment). However, my results do not match those reported in the paper. Below are the details:

  1. Slake1.0 dataset: the provided checkpoint appears to be fine-tuned without pretraining on the MedTrinity-25M dataset, since my results closely match those of LLaVA-Med++ (Ours, w/o) in Table 3 of the paper (see the checksum sketch after this list).
  2. PathVQA dataset: on the Closed set I was able to replicate the reported accuracy, but on the Open set my recall is significantly lower than the published value.
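In case it helps narrow down the checkpoint question, below is a generic checksum sketch I could run against a reference digest from the maintainers, to confirm we are looking at the same file. The checkpoint path is hypothetical and not from this repository:

```python
import hashlib

def sha256sum(path: str, chunk_size: int = 1 << 20) -> str:
    # Stream the file in 1 MiB chunks so large checkpoints
    # do not need to fit in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical path; compare the digest with one published by the maintainers.
print(sha256sum("checkpoints/slake_finetune.bin"))
```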
[Attached images: slake1.0_results, pathvqa_results]

To help diagnose these issues, I have attached the two images above, one for the evaluation run on each dataset.

Could you kindly verify whether the provided fine-tuning checkpoint for Slake1.0 is correct? Additionally, it would be helpful to understand any specific steps necessary to replicate the reported recall values for the PathVQA Open set.
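
For reference, my current assumption is that the Open-set recall is computed as token-level recall of the ground-truth answer against the prediction, as in LLaVA-Med-style evaluation scripts. Here is a minimal sketch of that assumption (the function names are illustrative, not taken from this repository):

```python
import re

def normalize(text: str) -> list[str]:
    # Lowercase, strip punctuation, and split on whitespace.
    return re.sub(r"[^\w\s]", "", text.lower()).split()

def open_set_recall(prediction: str, ground_truth: str) -> float:
    # Fraction of ground-truth tokens that appear anywhere
    # in the predicted answer.
    gt_tokens = normalize(ground_truth)
    pred_tokens = set(normalize(prediction))
    if not gt_tokens:
        return 0.0
    return sum(tok in pred_tokens for tok in gt_tokens) / len(gt_tokens)

# Recall is 1.0 here: both ground-truth tokens occur in the prediction.
print(open_set_recall("the lungs are clear", "lungs clear"))
```

If the paper's numbers use a different normalization or matching rule, knowing that would likely explain the gap.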

Thank you in advance for your assistance!

@yunfeixie233 (Contributor) commented

Hi @taindp98,

I apologize for any inconvenience. I will review the issues shortly.

@jinghaoliu commented


Hi Yunfei,

Thank you for your great work. I was just wondering whether this issue has been resolved yet, as I am getting results similar to @taindp98's.
