Why not choose the breakpoint with lowest test perplexity? #5
Comments
The criterion for convergence is that the perplexity has stabilized, not that it has reached its lowest point.
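To make the distinction concrete, here is a minimal sketch of a "has it stabilized?" check next to a "lowest value" lookup. This is not the repository's code: the function name `has_stabilized`, the `window` and `rel_tol` parameters, and the perplexity values are all assumptions made up for illustration.

```python
from typing import Sequence


def has_stabilized(history: Sequence[float], window: int = 4, rel_tol: float = 0.02) -> bool:
    """Return True when the last `window` values span at most `rel_tol` of their mean."""
    if len(history) < window:
        return False
    recent = history[-window:]
    mean = sum(recent) / window
    return (max(recent) - min(recent)) / mean <= rel_tol


# Made-up perplexity curve (NOT the values from this run):
# an early dip followed by a plateau at a higher value.
ppl = [120.0, 60.0, 30.0, 26.0, 33.0, 36.5, 37.0, 37.2, 37.1]

lowest = min(ppl)             # the early dip
stable = has_stabilized(ppl)  # checks only the most recent evaluations
print(f"lowest={lowest}, last={ppl[-1]}, stabilized={stable}")
```

In this toy curve the minimum occurs during the early dip, before the curve flattens, so "lowest point" and "stabilized" pick different steps.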
Recently, I've been reproducing the paper's results using the original data and this code.
The picture is a visualization from TensorBoard, with the test perplexity recorded every 500 steps. From this picture, I noticed that the last step (ppl: 37.76) does not have the lowest test perplexity (ppl: 25.11). However, the value at the last step is consistent with the paper's result (ppl: 36.9).
So, why not choose the breakpoint with the lowest test perplexity? Or, what is the criterion for convergence of the model?