Fix a token incompatibility for Prism::Translation::Parser::Lexer
This PR fixes a token incompatibility between the Parser gem and `Prism::Translation::Parser` for the left parenthesis.

## Parser gem (Expected)

Returns a `tLPAREN2` token:

```console
$ bundle exec ruby -Ilib -rparser/ruby33 \
-ve 'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "foo(:bar)"; p Parser::Ruby33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master eb144ef91e) [x86_64-darwin23]
[[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 0...3>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 3...4>]],
[:tSYMBOL, ["bar", #<Parser::Source::Range example.rb 4...8>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 8...9>]]]
```

## `Prism::Translation::Parser` (Actual)

Previously, the parser returned a `tLPAREN` token when parsing the following:

```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "foo(:bar)"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master eb144ef91e) [x86_64-darwin23]
[[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 0...3>]], [:tLPAREN, ["(", #<Parser::Source::Range example.rb 3...4>]],
[:tSYMBOL, ["bar", #<Parser::Source::Range example.rb 4...8>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 8...9>]]]
```

After the update, the parser returns a `tLPAREN2` token for the same input:

```console
$ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "foo(:bar)"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
ruby 3.4.0dev (2024-09-01T11:00:13Z master eb144ef91e) [x86_64-darwin23]
[[:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 0...3>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 3...4>]],
[:tSYMBOL, ["bar", #<Parser::Source::Range example.rb 4...8>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 8...9>]]]
```

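The `PARENTHESIS_LEFT` token in Prism is classified as either `tLPAREN` or `tLPAREN2` in the Parser gem.
Previously it was always translated to `tLPAREN`. It is now translated to `tLPAREN2` by default, and to `tLPAREN` only when the parenthesis is the first token or follows one of a fixed list of token types.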
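
The rule can be sketched in isolation. The snippet below is a minimal standalone illustration, not the code in Prism: `lparen_type` is a hypothetical helper, and the list is copied from the `LPAREN_CONVERSION_TOKEN_TYPES` constant added in this commit.

```ruby
# Token types after which "(" is reported as :tLPAREN
# (copied from the constant added in this commit).
LPAREN_CONVERSION_TOKEN_TYPES = %i[
  kBREAK kCASE tDIVIDE kFOR kIF kNEXT kRETURN kUNTIL kWHILE tAMPER tANDOP tBANG tCOMMA tDOT2 tDOT3
  tEQL tLPAREN tLPAREN2 tLSHFT tNL tOP_ASGN tOROP tPIPE tSEMI tSTRING_DBEG tUMINUS tUPLUS
].freeze

# Hypothetical helper (not part of Prism): picks the Parser gem token type for
# a left parenthesis from the type of the token emitted just before it.
# previous_type is nil when "(" is the first token.
def lparen_type(previous_type)
  if previous_type.nil? || LPAREN_CONVERSION_TOKEN_TYPES.include?(previous_type)
    :tLPAREN
  else
    :tLPAREN2
  end
end

p lparen_type(:tIDENTIFIER) # "(" right after an identifier, as in foo(:bar)
p lparen_type(nil)          # "(" as the very first token
p lparen_type(:tUPLUS)      # "(" right after a unary plus
```

Under these assumptions the first call yields `:tLPAREN2` and the other two yield `:tLPAREN`, which matches the expected Parser gem output shown above for `foo(:bar)`.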

With this change, the following special case can be removed from `test/prism/ruby/parser_test.rb`:

```diff
-          when :tLPAREN
-            actual_token[0] = expected_token[0] if expected_token[0] == :tLPAREN2
```
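
To illustrate what dropping that branch means for the tests, here is a simplified sketch of the remaining type normalization. `normalize_types` is a hypothetical stand-in for the logic inside `assert_equal_tokens`, and token values are reduced to plain strings for brevity (real tokens also carry a source range).

```ruby
# Hypothetical, simplified stand-in for the type normalization inside
# assert_equal_tokens. Only the :kDO and :tPOW cases remain after this commit.
def normalize_types(expected_tokens, actual_tokens)
  actual_tokens.each_with_index.map do |(type, value), index|
    expected_type = expected_tokens[index]&.first
    case type
    when :kDO
      type = expected_type if %i[kDO_BLOCK kDO_LAMBDA].include?(expected_type)
    when :tPOW
      type = expected_type if expected_type == :tDSTAR
    end
    [type, value]
  end
end

expected = [[:tIDENTIFIER, "foo"], [:tLPAREN2, "("], [:kDO_BLOCK, "do"]]
actual   = [[:tIDENTIFIER, "foo"], [:tLPAREN, "("], [:kDO, "do"]]

normalized = normalize_types(expected, actual)
p normalized             # :kDO is rewritten, :tLPAREN is left alone
p normalized == expected # a tLPAREN/tLPAREN2 mismatch is no longer masked
```

In this sketch a `tLPAREN`/`tLPAREN2` difference now surfaces as a real mismatch, so the lexer itself has to produce the same type as the Parser gem.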
koic committed Sep 7, 2024
1 parent 8a3fa9f commit 9ddaf19
Showing 4 changed files with 78 additions and 22 deletions.
13 changes: 11 additions & 2 deletions lib/prism/translation/parser/lexer.rb
@@ -134,7 +134,7 @@ class Lexer
 MINUS_GREATER: :tLAMBDA,
 NEWLINE: :tNL,
 NUMBERED_REFERENCE: :tNTH_REF,
-PARENTHESIS_LEFT: :tLPAREN,
+PARENTHESIS_LEFT: :tLPAREN2,
 PARENTHESIS_LEFT_PARENTHESES: :tLPAREN_ARG,
 PARENTHESIS_RIGHT: :tRPAREN,
 PERCENT: :tPERCENT,
@@ -187,7 +187,14 @@ class Lexer
 EXPR_BEG = 0x1 # :nodoc:
 EXPR_LABEL = 0x400 # :nodoc:

-private_constant :TYPES, :EXPR_BEG, :EXPR_LABEL
+# The `PARENTHESIS_LEFT` token in Prism is classified as either `tLPAREN` or `tLPAREN2` in the Parser gem.
+# The following token types are listed as those classified as `tLPAREN`.
+LPAREN_CONVERSION_TOKEN_TYPES = [
+  :kBREAK, :kCASE, :tDIVIDE, :kFOR, :kIF, :kNEXT, :kRETURN, :kUNTIL, :kWHILE, :tAMPER, :tANDOP, :tBANG, :tCOMMA, :tDOT2, :tDOT3,
+  :tEQL, :tLPAREN, :tLPAREN2, :tLSHFT, :tNL, :tOP_ASGN, :tOROP, :tPIPE, :tSEMI, :tSTRING_DBEG, :tUMINUS, :tUPLUS
+]
+
+private_constant :TYPES, :EXPR_BEG, :EXPR_LABEL, :LPAREN_CONVERSION_TOKEN_TYPES

 # The Parser::Source::Buffer that the tokens were lexed from.
 attr_reader :source_buffer
@@ -268,6 +275,8 @@ def to_a
   value.chomp!(":")
 when :tLCURLY
   type = :tLBRACE if state == EXPR_BEG | EXPR_LABEL
+when :tLPAREN2
+  type = :tLPAREN if !tokens.last || LPAREN_CONVERSION_TOKEN_TYPES.include?(tokens.last[0])
 when :tNTH_REF
   value = parse_integer(value.delete_prefix("$"))
 when :tOP_ASGN
1 change: 1 addition & 0 deletions test/prism/fixtures/unparser/corpus/literal/unary.txt
@@ -6,3 +6,4 @@
 -a
 +a
 -(-a).foo
++(+a).foo
2 changes: 0 additions & 2 deletions test/prism/ruby/parser_test.rb
@@ -271,8 +271,6 @@ def assert_equal_tokens(expected_tokens, actual_tokens)
 case actual_token[0]
 when :kDO
   actual_token[0] = expected_token[0] if %i[kDO_BLOCK kDO_LAMBDA].include?(expected_token[0])
-when :tLPAREN
-  actual_token[0] = expected_token[0] if expected_token[0] == :tLPAREN2
 when :tPOW
   actual_token[0] = expected_token[0] if expected_token[0] == :tDSTAR
 end
84 changes: 66 additions & 18 deletions test/prism/snapshots/unparser/corpus/literal/unary.txt
