This repository has been archived by the owner on Oct 5, 2023. It is now read-only.

Display: For code generation view, add "correct/incorrect" labels (and potentially execution outputs) #813

Open
neubig opened this issue May 31, 2023 · 1 comment


neubig commented May 31, 2023

The code generation view currently displays only the output code and the expected code.

However, most code generation datasets, such as HumanEval or Odex, are evaluated by executing the generated code and checking whether its output is correct. Based on this:

  1. At the very least, it should be possible to view whether the generated code was judged as correct by showing a "correct/incorrect" label.
  2. Even better would be the functionality to view:
    1. Expected code
    2. Predicted code
    3. Output of expected code
    4. Output of predicted code
    5. Correctness/incorrectness value
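To make the five fields above concrete, here is a rough sketch of how they could be produced for one example. It assumes correctness means "the predicted code prints the same thing as the expected code", which is a simplification (benchmarks may instead run test cases). The names `run_snippet` and `build_view_row` are hypothetical and not part of any existing codebase, and `exec` is used only for illustration; real evaluation would need sandboxing.

```python
import contextlib
import io
from typing import Dict, Optional, Tuple


def run_snippet(code: str) -> Tuple[str, Optional[str]]:
    """Run a code string; return (captured stdout, error message or None)."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {})  # illustration only: untrusted code needs a sandbox
    except Exception as exc:
        return buffer.getvalue(), f"{type(exc).__name__}: {exc}"
    return buffer.getvalue(), None


def build_view_row(expected_code: str, predicted_code: str) -> Dict[str, object]:
    """Collect the five fields listed above, plus an optional error message."""
    expected_output, _ = run_snippet(expected_code)
    predicted_output, error = run_snippet(predicted_code)
    return {
        "expected_code": expected_code,
        "predicted_code": predicted_code,
        "expected_output": expected_output,
        "predicted_output": predicted_output,
        # Assumption: "correct" means identical stdout and no exception.
        "correct": error is None and predicted_output == expected_output,
        "error": error,
    }


row = build_view_row("print(1 + 1)", "print(2)")
print("correct" if row["correct"] else "incorrect")
```

Each row then carries everything the view would need to render a "correct/incorrect" label next to the two code snippets and their outputs.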

This probably requires that the values stored for the code output_column and data_column no longer be plain str, but a richer data structure that includes the code, its execution output, and the correctness value.
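As a minimal sketch of such a structure, something like the following dataclass could work. The class name `CodeCell`, its field names, and the `from_raw` helper are all hypothetical, chosen for illustration; the `error` field is an optional extra. Accepting a plain str in `from_raw` is one possible way to keep existing string-only data loadable.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class CodeCell:
    """Hypothetical value type for a code column: code plus execution results."""

    code: str
    output: Optional[str] = None       # captured execution output, if available
    is_correct: Optional[bool] = None  # None means "not evaluated"
    error: Optional[str] = None        # optional error message from execution

    @classmethod
    def from_raw(cls, value: Union[str, dict, "CodeCell"]) -> "CodeCell":
        """Accept a legacy plain string, a dict of fields, or a CodeCell."""
        if isinstance(value, cls):
            return value
        if isinstance(value, str):
            return cls(code=value)
        return cls(**value)

    def label(self) -> str:
        """Text for the correct/incorrect label in the view."""
        if self.is_correct is None:
            return "unknown"
        return "correct" if self.is_correct else "incorrect"


legacy = CodeCell.from_raw("print('hi')")
rich = CodeCell.from_raw({"code": "print(2)", "output": "2\n", "is_correct": True})
print(legacy.label(), rich.label())
```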

@neubig neubig changed the title Display: For code generation view, add "correct/incorrect" labels (and potentially model outputs) Display: For code generation view, add "correct/incorrect" labels (and potentially execution outputs) May 31, 2023

neubig commented Jun 21, 2023

Here is an example of what the outputs look like now:

[Screenshot: Screen Shot 2023-06-21 at 1 00 53 PM]

It would be nice to also be able to display the correctness/incorrectness value and the error message.
