This repository has been archived by the owner on Oct 5, 2023. It is now read-only.
In the code generation view, currently the output code and expected code are displayed.
However, in most code generation datasets, such as HumanEval or Odex, evaluation is performed by running the generated code and checking whether the resulting output is correct. Based on this:
At the very least, it should be possible to view whether the generated code was judged as correct by showing a "correct/incorrect" label.
Even better would be the functionality to view:
Expected code
Predicted code
Output of expected code
Output of predicted code
Correctness/incorrectness value
This probably requires the data structure for code output_column and data_column to not be str, but a different data structure that includes the code, output, and correctness value.
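To make the proposal concrete, here is a minimal sketch of what such a structure could look like. Every name in it (`CodeExecutionResult`, `run_snippet`, `judge`) is hypothetical and not part of the existing code base, and comparing printed output stands in for whatever correctness check a given dataset actually uses. The snippet calls `exec` with no sandboxing, so it is only suitable for trusted code.

```python
import contextlib
import io
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CodeExecutionResult:
    """Hypothetical replacement for a plain `str` value in a code column."""

    code: str                       # the source code itself
    output: str                     # what the code printed, or an error summary
    correct: Optional[bool] = None  # None until the code has been judged


def run_snippet(code: str) -> str:
    """Execute `code` and return everything it wrote to stdout.

    Assumption: the code is trusted. A real evaluator would need a sandbox.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})
    return buffer.getvalue()


def judge(
    expected_code: str, predicted_code: str
) -> Tuple[CodeExecutionResult, CodeExecutionResult]:
    """Run both snippets and mark the prediction correct if the outputs match."""
    expected_out = run_snippet(expected_code)
    try:
        predicted_out = run_snippet(predicted_code)
        is_correct = predicted_out == expected_out
    except Exception as exc:
        # A crashing prediction is recorded as incorrect, with the error as output.
        predicted_out = f"{type(exc).__name__}: {exc}"
        is_correct = False
    expected = CodeExecutionResult(expected_code, expected_out, True)
    predicted = CodeExecutionResult(predicted_code, predicted_out, is_correct)
    return expected, predicted


if __name__ == "__main__":
    expected, predicted = judge("print(1 + 1)", "print(3)")
    label = "correct" if predicted.correct else "incorrect"
    print(f"expected output:  {expected.output!r}")
    print(f"predicted output: {predicted.output!r}")
    print(f"label: {label}")
```

With a structure along these lines, the view could show the code, the execution output, and the correct/incorrect label for each example from a single value rather than from a bare string.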
neubig changed the title from "Display: For code generation view, add "correct/incorrect" labels (and potentially model outputs)" to "Display: For code generation view, add "correct/incorrect" labels (and potentially execution outputs)" on May 31, 2023