Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

potential lookahead lexer bug? #158

Open
yaacovCR opened this issue Nov 10, 2024 · 2 comments
Open

potential lookahead lexer bug? #158

yaacovCR opened this issue Nov 10, 2024 · 2 comments

Comments

@yaacovCR
Copy link

Revisiting graphql/graphql-js#2764

I'm not quite sure the reference implementation has a bug, see failing test when implementing the fix as advised:

graphql/graphql-js#4293

@yaacovCR
Copy link
Author

Which would imply that the implementation here has a potential bug, as it assumed that the reference implementation was flawed:

// restore these since both `positionAfterWhitespace` & `readBlockString`
// can potentially modify them and commment for `lookahead` says no lexer modification.
// (the latter is true in the canonical js lexer also and is likely a bug)
line = savedLine
lineStart = savedLineStart

@yaacovCR
Copy link
Author

Basically, when the comment says the lexer start is not changed, I think it means with respect to the current token only, but not with respect to other internal state.

In fact, I think there is an optimization that means additional lexer state can and must change.

Specifically, when we look ahead, we save the next token within the linked list, even though we don't modify the current token pointer, such that when we advance() after calling lookahead() we don't have to reprocess the next token, we just read from token.next. (advance() works by calling lookahead() and then specifically advancing the current token pointer.)

Considering that we never re-lex a token, the line and start position of the line within the body must be permanently advanced by lookahead() otherwise the lexer will think that is on the old line. From what I can tell, parsing will continue as normal, because we lex from the end position of the last token, and that won't be incorrect, but every token from that point on will have the wrong line number and presumably the wrong column.

I don't see the corresponding lexSecond tests from graphql-js in https://github.com/GraphQLSwift/GraphQL/blob/main/Tests/GraphQLTests/LanguageTests/LexerTests.swift but I assume that's where you would find the bug, if not in more involved tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant