
Is the weight_decay needed? #24

Open
wzn0828 opened this issue Apr 28, 2021 · 2 comments

Comments

@wzn0828

wzn0828 commented Apr 28, 2021

Hi, thanks for your wonderful work.

While using your AdaptiveLossFunction, I found that alpha does not decrease; it stays at its highest value throughout training.

So I applied weight_decay to alpha and scale. However, I think weight_decay should not be applied to these two parameters.

What's your opinion?
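For reference, excluding specific parameters from weight decay is usually done with optimizer parameter groups. A minimal sketch in plain PyTorch, where the two extra parameters are stand-ins for the loss function's alpha and scale (not the library's actual attribute names):

```python
import torch

# A model whose weights we do want to regularize.
model = torch.nn.Linear(4, 1)

# Stand-ins for the adaptive loss's learned parameters (alpha, scale),
# which we exclude from weight decay.
loss_params = [torch.nn.Parameter(torch.zeros(1)),  # stand-in for alpha
               torch.nn.Parameter(torch.ones(1))]   # stand-in for scale

# Two parameter groups: weight decay applies only to the model's weights.
optimizer = torch.optim.Adam([
    {"params": model.parameters(), "weight_decay": 1e-4},
    {"params": loss_params, "weight_decay": 0.0},
], lr=1e-3)
```

This keeps any regularization on the network weights without shrinking alpha or scale toward zero.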

@jonbarron
Owner

If the alpha value stays large throughout optimization, it sounds like your data doesn't have very many outliers, in which case you'll probably get optimal performance by just allowing alpha to be large. Regularizing alpha to be small does not make much sense to me unless you have a prior belief on the outlier distribution of your data. If you want to control the shape of the loss function, I'd just use the general formulation in general.py, and set alpha to whatever value you want.
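To illustrate the suggestion of fixing alpha by hand, here is a sketch of the general robust loss from "A General and Adaptive Robust Loss Function" (Barron, CVPR 2019) for the generic case only; the limits at alpha ∈ {0, 2, ±∞} need separate expressions, which general.py in the repository handles properly:

```python
import torch

def general_loss(x, alpha, scale):
    """Generic-case robust loss rho(x, alpha, c) =
    (|alpha - 2| / alpha) * (((x / c)^2 / |alpha - 2| + 1)^(alpha / 2) - 1).
    Assumes alpha is not in {0, 2}, where the limit forms are required."""
    b = abs(alpha - 2.0)
    d = (x / scale) ** 2 / b + 1.0
    return (b / alpha) * (d ** (alpha / 2.0) - 1.0)

x = torch.tensor([0.0, 1.0])
# With alpha = 1 this reduces to the Charbonnier (pseudo-Huber) loss,
# sqrt((x / c)^2 + 1) - 1.
loss = general_loss(x, alpha=1.0, scale=1.0)
```

Setting alpha to a fixed value like this gives direct control over the loss shape instead of learning it.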

@wzn0828
Author

wzn0828 commented Apr 28, 2021

OK, your answer resolves my confusion. Thank you very much.
