I'm trying to reproduce the NarrativeQA results by directly running the command with the provided .yml configuration files. Below are the scores measured with ROUGE-L-Max:
- PPO with supervision: 0.581 and 0.588 at epochs 0 and 99, respectively.
- NLPO with supervision: 0.217 and 0.213 at epochs 0 and 99, respectively.
I'm wondering why the NLPO result doesn't match the number reported in the paper.
I also tried using the PPO config with only the RL algorithm changed to NLPO, and got the same result as above.
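For reference, the change I made was along these lines (a sketch only — the exact key names follow the usual RL4LMs config layout and may differ in the actual .yml file):

```yaml
# Hypothetical fragment of the PPO config, with only the algorithm id
# switched; every other hyperparameter was left untouched.
alg:
  id: nlpo   # changed from "ppo"
```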
Please let me know if I'm missing something or if it's some other issue. Thanks!