Struggling with Overfitting in Vision Transformer for Food Image Classification #1103
magnifiques asked this question in Q&A (Unanswered)

Hi everyone,

I'm currently training a Vision Transformer (ViT) on a food image dataset. I've initialized the model with ImageNet2k weights and used the default model transforms. However, I'm running into an overfitting issue: while the training loss decreases steadily across 5-10 epochs, the test loss remains stagnant.

Here's what I've tried so far:

- Weight decay: added to control the complexity of the model
- Data augmentation: applied standard transformations like random cropping and flipping
- Dropout: added dropout with a rate of 0.5 to regularize the model

Despite these efforts, I haven't seen any improvement in the test loss. Has anyone faced similar issues, or can someone suggest additional techniques to combat overfitting in ViTs?

Any advice or insights would be greatly appreciated!
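For concreteness, here is a minimal sketch of the kind of setup described above. It is illustrative rather than the poster's actual code: torchvision's ViT-B/16 with its ImageNet-1k checkpoint stands in for the pretrained weights, and the class count, dropout rate, and weight decay value are assumptions.

```python
# Illustrative sketch only: pretrained ViT fine-tuning with the regularizers
# mentioned above (augmentation, dropout in the head, weight decay).
import torch
import torch.nn as nn
from torchvision import models, transforms

# Standard augmentation: random resized crop + horizontal flip, followed by
# the normalization expected by ImageNet-pretrained models.
train_tfms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Pretrained backbone with a fresh classification head for the food dataset.
model = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)
num_classes = 101  # assumed; set to the number of food categories
model.heads.head = nn.Sequential(
    nn.Dropout(p=0.5),                         # dropout on the pooled features
    nn.Linear(model.hidden_dim, num_classes),
)

# Weight decay applied through the optimizer (values are assumptions).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
```

Note that dropout placed in a freshly initialized head like this only regularizes the classifier; freezing most of the backbone or using a lower learning rate for the pretrained layers is another common way to keep a ViT from memorizing a small dataset.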
Replies: 2 comments

-
If you are training from scratch, Transformers require a substantial amount of data to converge and produce a well-fitted model. Try switching the optimizer to AdamW with weight decay and adding a learning-rate scheduler such as step decay (StepLR) or cosine annealing with warm restarts. It might show a little improvement.
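A minimal sketch of what that suggestion looks like in PyTorch; the learning rate, weight decay, and restart period T_0 are assumed values, and the stand-in model is just a placeholder for the actual ViT.

```python
# Illustrative only: AdamW with weight decay plus a cosine-annealing-with-warm-restarts
# schedule, as suggested above. All hyperparameters here are assumptions.
import torch
import torch.nn as nn

model = nn.Linear(768, 101)  # stand-in for the ViT being trained

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.05)

# Restart the cosine schedule every 10 epochs (T_0); StepLR would give plain
# step decay instead.
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=10)

num_epochs = 30
for epoch in range(num_epochs):
    # ... forward pass, loss.backward(), and optimizer.step() for each batch ...
    scheduler.step()  # advance the schedule once per epoch
```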
-
Thanks! I will try AdamW to see if it improves the performance.