Fix inline completion bug, add test cases #1002
base: main
Conversation
packages/jupyter-ai/jupyter_ai/tests/completions/test_handlers.py
Thank you for your contribution! It's been a week since we've last heard from you, so I'm closing this PR due to inactivity. Please feel free to re-open this PR after addressing the review feedback.
Well, to be fair, 3 days to address a review is not that much time if people are contributing in their free time. CC @ellisonbg
Working on the issue... will open this again as soon as I resolve it 😃
Yeah, I agree with @krassowski. Jupyter doesn't have any formal standards on this, so each repo and subproject tends to handle it differently. At the same time, I think waiting at least a month or more is pretty normal across Jupyter. At times, some repos have even given people much longer (>1 year), but that quickly makes it difficult to navigate the PR queue. I want to be clear - I am not making any decisions here, just reflecting how Jupyter has handled this in the past.
Thanks @hiCXK for your contributions! We do want to go out of our way to welcome and encourage contributors. I will let others reopen the PR, but we look forward to seeing this PR reviewed and merged :-)
Sorry for the direct ping. I just see that many PR reviews include words like "after discussing internally with the team", so when I saw a pattern of closing PRs based on a one-week threshold I wanted to raise this with the wider team in case this was also discussed internally :) To be clear, I am not complaining about having internal reviews of features in addition to public reviews; I think it's awesome that you can do this! I think it could make sense for this repo to decide on a short PR iteration time, but it would be great to have:
:)
Thanks @krassowski for bringing this up, this is really important. Obviously, we (at AWS) do discuss things internally, but you are right that we shouldn't ever make decisions in that way or even appear to do so. All information and discussions that are used to make decisions about any Jupyter subproject or repo should be public and on GitHub. In terms of decision making, we are bound to follow the Jupyter Decision Making Guide. However, for smaller repos (at least smaller than Lab, Notebook, Hub, etc.) like Jupyter AI it would be helpful to add information to the CONTRIBUTING.md with a summary of how that is meant to play out in this particular project, to help people understand the rough process and timelines for review, merging, etc. I know @dlqqq is working hard to keep up with the fast pace of PRs in this repo and want to acknowledge his amazing work to keep the PR queue moving; this is an underappreciated part of open source work. I can't believe that this repo has already seen almost 500 PRs merged; for a relatively small effort that is fantastic.
@dlqqq please re-open the PR. I have corrected the test cases, which should now resolve the CI issue.
@krassowski Yes, I agree. 3 days is definitely too short, and this was a mistake on my end. I was looking at when the PR was opened, not when your review was given. Thank you for calling out my error and the need for community involvement in establishing contributing standards. 👍 (cc @hiCXK) Can open source contributors re-open PRs on their own, or do they require permission from a maintainer to do so? When I closed this PR, I had thought that this would be a harmless way of reducing the number of open PRs, which helps me identify the ones that require my review more quickly. I certainly didn't mean to "shut the door" on contributors who didn't respond within a certain time. Hope this clarifies my intention here. 🤗
@krassowski Thank you for addressing the "elephant in the room" here directly. I think it's really wonderful that we respect & trust each other enough to be able to discuss these hard questions directly & publicly. I fully acknowledge that we can do more to be transparent & earn trust with the open source community. Given the project's substantial growth recently, there have been situations where I/we decided outcomes internally instead of seeking consensus in the interest of time, hoping that our conclusion would be generally accepted. However, this doesn't always work (as proven here 😁), and as @ellisonbg called out, this just isn't the right way to run an open source project. I will meet with Brian and apply his feedback on how to make Jupyter AI more open and transparent in its decision-making. This may take some time given that I'm balancing code review, code contribution, and project governance all at once, so please feel free to open an issue or ping me if you feel that progress isn't being made quickly enough. Again, hope this clarifies things, and thank you for your feedback!
@krassowski (regarding this PR) Can you review this since you're more familiar with the inline completion logic? You should have permissions to approve & merge as a member of the `jupyterlab/` org, so feel free to do so if you're able.
I think the 'reopen' button wasn't visible to me; it might only be visible to members. As for closing this PR, I understood your intentions 🤗 I should have responded while working on it. Thank you.
("```python\n\n\n# load json file\nimport\n```", "# load json file\nimport"),
("```python\n # load json file\nimport\n```", "# load json file\nimport"),
("```\n# load json file\nimport\n```", "# load json file\nimport"),
(" ```\n# load json file\nimport\n```", "# load json file\nimport"),
("``` \n# load json file\nimport\n```", "# load json file\nimport"),
(
    "\n```python\n# load json file\nimport\n```",
    "# load json file\nimport",
),
These test cases are not representative of the example that I included in #1001. The point was that the user types an instruction like:
# load json file
and the model should follow by responding with something like:
\nimport json
\nwith open('path-to-file', 'r') as f:
\n    content = json.load(f)
What we want to test is that the initial line has the `\n` preserved if it is included in the `expected_suggestion`. I do not think that these test cases test this; instead they repeat the previous tests just with different text content.
@hiCXK does it make sense to you?
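For reference, here is a minimal sketch of the kind of post-processing being discussed. It is an illustration only, not the actual `post_process_suggestion` in `completion_utils.py`: it strips the Markdown fences, removes the prefix only when the model echoes it, and deliberately avoids a `.strip()` so a leading `\n` in the continuation survives.

```python
# Illustration only -- not the jupyter-ai implementation. Assumes the model
# may wrap its suggestion in a Markdown fence and may echo the user's prefix
# (e.g. a comment containing an instruction). A real implementation would
# also need to handle extra blank lines and indentation inside the fence
# (see the test cases above); this sketch only shows the fence/prefix/newline logic.
import re


def strip_fences_keep_newline(response: str, prefix: str = "") -> str:
    suggestion = response
    # Remove a leading ``` fence (optionally with a language tag) and a trailing fence.
    suggestion = re.sub(r"^\s*```[^\n]*\n", "", suggestion)
    suggestion = re.sub(r"\n?```\s*$", "", suggestion)
    # Drop the prefix only if the model repeated it. Note there is no .strip()
    # here, so a leading "\n" in the remaining continuation is preserved.
    if prefix and suggestion.startswith(prefix):
        suggestion = suggestion[len(prefix):]
    return suggestion


# The user typed "# load json file"; the model responds with a fenced block
# that repeats the comment. The leading newline of the continuation survives.
response = "```\n# load json file\nimport json\nwith open('path-to-file','r') as f:```"
print(repr(strip_fences_keep_newline(response, prefix="# load json file")))
# -> "\nimport json\nwith open('path-to-file','r') as f:"
```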
It seems I'm having trouble understanding the usage of `response` and `expected_suggestion`. `response` is the generated text from the mock LLM and `expected_suggestion` is the required correct text.
So, can this be a valid test case?
("```\nimport json\nwith open('path-to-file','r') as f:```", "\nimport json\nwith open('path-to-file','r') as f:")
This way we can check if the initial line's `\n` character is preserved.
If my approach is incorrect, please correct it and suggest a test case 🙏
Yes, something along these lines. Sometimes it will include the comment prefix, sometimes it won't, so two cases are needed here.
Ok, so I am adding these test cases:
("```\n# load json file\nimport json\nwith open('path-to-file','r') as f:```", "\n# load json file\nimport json\nwith open('path-to-file','r') as f:"),
("```\nimport json\nwith open('path-to-file','r') as f:```", "\nimport json\nwith open('path-to-file','r') as f:")
Are these correct? The comment prefix is included. We can also shorten them, because we only have to check the preservation of the first `\n` character:
("```\n# load json file\n```", "\n# load json file"),
("```\ndef function():\n pass\n```", "\ndef function():\n pass"),
Also, can you elaborate more on the inclusion of the comment prefix if my solution is incorrect? I have also gone through the community forum and documentation to find more about this.
The proposed shortened tests are fine here. Ideally we would have a separate test which checks that:
response = "```\n# load json file\nimport json\nwith open('path-to-file','r') as f:```"
generated for prefix:
# load json file
gives
result = "\nimport json\nwith open('path-to-file','r') as f:"
This would check the code in line 46.
Although the test cases seem correct this time and the function `post_process_suggestion` is correct, the test cases are failing the unit tests again. There is an `AssertionError` for each test case I added: `\n` characters are still being stripped off despite changing the function. I must be missing something.
Please suggest a way to resolve them.
You need to write a new test function rather than add cases to this one. This is because this test function defines `prefix = ""`, but here we are concerned with a situation where the user gets a suggestion after some prefix (like a comment with an instruction). For an empty `""` prefix, `"any-text".startswith(prefix)` will always be true.
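Concretely, this is why the existing test function (with an empty prefix) cannot catch the problem:

```python
# With an empty prefix, every suggestion "starts with" the prefix, so the
# prefix-handling branch is satisfied trivially and is never really exercised.
assert "any-text".startswith("")  # always True

# Only a non-empty prefix distinguishes the two situations under discussion:
assert "# load json file\nimport json".startswith("# load json file")
assert not "\nimport json".startswith("# load json file")
```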
@krassowski Please have a look at this function, which uses a parameterized test:
@pytest.mark.parametrize(
    "prefix,response,expected_result",
    [
        ("", "```\n# load json file\n```", "\n# load json file"),
        ("", "```\ndef function():\n pass\n```", "\ndef function():\n pass"),
        ("# load json file\n", "```\n# load json file\nimport json\nwith open('path-to-file','r') as f:```", "\nimport json\nwith open('path-to-file','r') as f:"),
    ],
)
async def test_newline_preservation_and_prefix_handling(prefix, response, expected_result):
    inline_handler = MockCompletionHandler(
        lm_provider=MockProvider,
        lm_provider_params={
            "model_id": "model",
            "responses": [response],
        },
    )
    dummy_request = InlineCompletionRequest(
        number=1, prefix=prefix, suffix="", mime="", stream=False
    )
    await inline_handler.handle_request(dummy_request)
    assert len(inline_handler.messages) == 1
    suggestions = inline_handler.messages[0].list.items
    assert len(suggestions) == 1
    result = suggestions[0].insertText
    assert result == expected_result
    if prefix:
        assert not result.startswith(prefix)  # result should not include the prefix
When I added this function to `test_handlers.py`, it still produces an assertion error. How are `test_handlers.py` and `completion_utils.py` connected (how is the `post_process_suggestion` function accessed)? Do we need to change some other file, like `default.py` or the one where the `InlineCompletionRequest` class is defined?
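One way to narrow this down (a sketch only; the import path and the `(response, prefix)` argument order of `post_process_suggestion` are guesses based on the file names mentioned in this thread, so both may need adjusting) is to unit-test the utility function directly, bypassing the handler and the mock LLM:

```python
# Sketch only: the module path and the (response, prefix) signature below are
# assumptions inferred from this thread, not the verified jupyter-ai API.
import pytest

from jupyter_ai.completions.completion_utils import post_process_suggestion


@pytest.mark.parametrize(
    "response,prefix,expected",
    [
        # Fenced block, no prefix: the leading newline should be preserved.
        ("```\n# load json file\n```", "", "\n# load json file"),
        # The model echoes the prefix: only the continuation (including its
        # leading newline) should remain.
        (
            "```\n# load json file\nimport json\nwith open('path-to-file','r') as f:```",
            "# load json file",
            "\nimport json\nwith open('path-to-file','r') as f:",
        ),
    ],
)
def test_post_process_suggestion(response, prefix, expected):
    assert post_process_suggestion(response, prefix) == expected
```

If the function is actually a method on the handler rather than a free function, the same cases can instead be folded into the handler-level test above.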
for more information, see https://pre-commit.ci
Fixes #1001
@krassowski Please check if the edits are correct.