Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify generated CST #12

Open
wants to merge 17 commits into
base: main
Choose a base branch
from
Open

Simplify generated CST #12

wants to merge 17 commits into from

Conversation

bbannier
Copy link
Member

@bbannier bbannier commented Feb 10, 2023

Currently the generated CST contains many nodes which have the exact same range. Such nodes not only make the generated CST pretty cluttered, they also make it harder than needed to find good matchings between source locations and CST nodes (e.g., in a language server).

This PR somewhat flattens the generated CST so it contains less nodes which are mostly there for developer ergonomics while writing the parser. For that I make them either anonymous nodes, or simplify the grammar. While I was at it I also added explicit CST nodes which previously where not exposed directly.

To make the effects of this PR easier to understand I added a simple corpus. It is still hard to write minimal examples, but observing how the CST changes with each patch maybe makes the changes more transparent.

This is a cleaned up version of #1 which we closed at the time without merged.

@bbannier bbannier self-assigned this Feb 10, 2023
@bbannier
Copy link
Member Author

@ckreibich It looks like the test step is failing for me even locally when running the current main (without any of my patches), so they seem unrelated (regression due to Zeek version bump?). If the changes in this PR are interesting, I could look into that.

@bbannier bbannier marked this pull request as ready for review February 20, 2023 09:43
@bbannier bbannier requested a review from ckreibich February 20, 2023 09:43
@bbannier bbannier force-pushed the topic/bbannier/flatten branch from 8596344 to 2465478 Compare March 18, 2023 10:12
This allows users to use a uniform approach to match for variable
declarations, and also even allows distinguishing declarations `of` local
variables from assignments (in case they have an initializer).
@bbannier bbannier force-pushed the topic/bbannier/flatten branch from 2465478 to e50434e Compare March 18, 2023 10:37
@bbannier
Copy link
Member Author

bbannier commented Mar 18, 2023

I rebased this on a recent main.

The error I talked about previously was introduced by a patch originally in this PR which was making carriage return optional everywhere, i.e., use \r?\n instead of \r\n everywhere possible. This seems to cause different parsing so I dropped the patch instead of spending time investigating. I had not realized this breakage on my machine since the test script test/parse-zeek-tree.sh uses the user-wide tree-sitter configuration instead of something local to the current parser so it was always broken for me.

@bbannier bbannier requested review from ckreibich and removed request for ckreibich March 18, 2023 11:43
This gives the operations a name so they can be matched on by users. We
also fix the definitions so they have same precedence.
We also flatten the nide hierarchy for function-like constructs
(`function`, `hook`, `event`) so they are exposed directly as top-level
declarations.
This node has no additional semantic meaning beyond the `stmt_list` it
wraps.
The previous name sounded a lot like a class initializer which this is
not.
@bbannier bbannier force-pushed the topic/bbannier/flatten branch from e50434e to f0f0698 Compare March 30, 2023 09:38
@bbannier bbannier linked an issue Mar 30, 2023 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

zeekygen_next_comment attached to wrong scope
1 participant