Skip to content

nightly-2024-10-08: feat: handwritten parser (#6180)

Pre-release
Pre-release
Compare
Choose a tag to compare
@github-actions github-actions released this 08 Oct 02:29
· 356 commits to master since this release
c4273a0
# Description

Resolves https://github.com/noir-lang/noir/issues/853

## Problem

There are some issues with the current parser:
- it leads to "stack overflow" with small programs that should probably
compile (like a program with a chanin of 17 `if-else-if`)
- it leads to some of us not being able to run the noric_frontend tests
because of linker errors, and sometimes making changes to the parser
will lead to linker errors that we have to workaround
- it's (very) slow

## Summary

This PR implements a hand-written parser. It parses any program with
just one look-ahead token. It has very good error-recovery.

I tested the parser's performance by copying the contents of
`noir-contracts/contracts/avm_test_contract/src/main.nr` in
Aztec-Packages 100 times to a single file. That ends up with a file of
about 57K lines. The times:
- chumsky parser: 1.52 **seconds**
- handwritten parser: 52.97 **milliseconds**

Some other benefits:
- The linker errors are gone!
- Compiling noirc_frontend is slightly faster
- Macro code also becomes faster (`quote { ... }.as_expr()`, etc, invoke
the parser). For example
`test_programs/noir_test_success/comptime_expr/src/main.nr` takes around
one second with chumsky and 140ms with the handwritten parser (to do
`nargo compile`)
- Even though the parser is handwritten, I think the parsing code is
relatively simple. It's just "check if we get this token, then do this"
or sometimes "check if we get this token followed by this one (or not
followed by this one). Also also the `impl`s and `traits` that we needed
for chumsky (and the lifetimes, and passing parsers around, and cloning
them, and calling `boxed()`, etc.) are gone, which I believe make the
code much simpler. That said, chumsky has great helpers to be able to
parse things separated by, say, a comma, and this PR at least has that
too (`parse_many`).
- Compiling an empty program is faster (goes from 650ms to 140ms)
- Compiing any program is much faster
- Tests run faster (it would become feasible to run tests locally before
pushing to CI to avoid CI cycles):
  - Running noirc_frontend tests:
    - before: 1:03 minute
    - after: 6 seconds
  - Running lsp tests: 
    - before: 55 seconds
    - after: 6 seconds
  - Running nargo_cli tests:
    - before: 2:47 minutes
    - after: 38 seconds
- CI runs faster (for example each of the four partitions take 1 minute
instead of 4 minutes
- Building the compiler is faster:
  - before: 1:29 minutes
- after: 1:19 minutes (so building noirc_frontend is 10 seconds faster
because that's the only thing changed in this PR)
- Better parsing recovery and more fine-grained control over the errors
we report

I tested this parser by running `./boostrap.sh` on the Aztec-Packages
contracts and they compile file (of course they didn't compile right
away, I had to fix some bugs to get there).

## Additional Context

## Documentation

Check one:
- [x] No documentation needed.
- [ ] Documentation included in this PR.
- [ ] **[For Experimental Features]** Documentation to be submitted in a
separate PR.

# PR Checklist

- [x] I have tested the changes locally.
- [x] I have formatted the changes with [Prettier](https://prettier.io/)
and/or `cargo fmt` on default settings.

---------

Co-authored-by: Tom French <[email protected]>
Co-authored-by: jfecher <[email protected]>