Struggling with Overfitting in Vision Transformer for Food Image Classification #1103
magnifiques asked this question in Q&A (Unanswered)

Hi everyone,

I'm currently training a Vision Transformer (ViT) on a food image dataset. I've initialized the model with ImageNet2k weights and used the default model transforms. However, I'm running into an overfitting issue: while the training loss decreases steadily across 5-10 epochs, the test loss remains stagnant.

Here's what I've tried so far:

- Weight decay: added to control the complexity of the model
- Data augmentation: applied standard transformations like random cropping and flipping
- Dropout: added dropout with a rate of 0.5 to regularize the model

Despite these efforts, I haven't seen any improvement in the test loss. Has anyone faced similar issues, or can someone suggest additional techniques to combat overfitting in ViTs?

Any advice or insights would be greatly appreciated!
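For concreteness, here is a minimal sketch of the kind of setup described above. It is illustrative rather than the poster's actual code: torchvision's ViT-B/16 with its ImageNet-1k checkpoint stands in for the pretrained weights, and the class count, dropout rate, and weight decay value are assumptions.

```python
# Illustrative sketch only: pretrained ViT fine-tuning with the regularizers
# mentioned above (augmentation, dropout in the head, weight decay).
import torch
import torch.nn as nn
from torchvision import models, transforms

# Standard augmentation: random resized crop + horizontal flip, followed by
# the normalization expected by ImageNet-pretrained models.
train_tfms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Pretrained backbone with a fresh classification head for the food dataset.
model = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)
num_classes = 101  # assumed; set to the number of food categories
model.heads.head = nn.Sequential(
    nn.Dropout(p=0.5),                         # dropout on the pooled features
    nn.Linear(model.hidden_dim, num_classes),
)

# Weight decay applied through the optimizer (values are assumptions).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
```

Note that dropout placed in a freshly initialized head like this only regularizes the classifier; freezing most of the backbone or using a lower learning rate for the pretrained layers is another common way to keep a ViT from memorizing a small dataset.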
Replies: 2 comments

-
If you are training from scratch, Transformers require a substantial amount of data to converge and produce a well-fitted model. Try switching the optimizer to AdamW with weight decay and adding a learning-rate scheduler such as step decay (StepLR) or cosine annealing with warm restarts. It might show a little improvement.
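A minimal sketch of what that suggestion looks like in PyTorch; the learning rate, weight decay, and restart period T_0 are assumed values, and the stand-in model is just a placeholder for the actual ViT.

```python
# Illustrative only: AdamW with weight decay plus a cosine-annealing-with-warm-restarts
# schedule, as suggested above. All hyperparameters here are assumptions.
import torch
import torch.nn as nn

model = nn.Linear(768, 101)  # stand-in for the ViT being trained

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.05)

# Restart the cosine schedule every 10 epochs (T_0); StepLR would give plain
# step decay instead.
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=10)

num_epochs = 30
for epoch in range(num_epochs):
    # ... forward pass, loss.backward(), and optimizer.step() for each batch ...
    scheduler.step()  # advance the schedule once per epoch
```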
-
Thanks! I will try AdamW to see if it improves the performance.