Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Block's OpenBraceRange.Start differs if a proper block precedes it or not #708

Open
rcjsuen opened this issue Nov 6, 2024 · 2 comments
Open

Comments

@rcjsuen
Copy link

rcjsuen commented Nov 6, 2024

I have two different HCL blocks.

variable bad
{}
variable good {}
variable bad
{}

Although both of the bad blocks look similar, their Block.OpenBraceRange.Start value is quite different. The first one seems to believe it exists at the start of the file (essentially hcl.InitialPos) but the second one seems to have something more sane as a response.

{1 1 0}
{2 1 17}
package main

import (
	"fmt"

	"github.com/hashicorp/hcl/v2"
	"github.com/hashicorp/hcl/v2/hclsyntax"
)

func Parse(filename string, bytes []byte) *hclsyntax.Body {
	file, _ := hclsyntax.ParseConfig(bytes, filename, hcl.InitialPos)
	return file.Body.(*hclsyntax.Body)
}

func main() {
	body := Parse("", []byte("variable bad\n{}"))
	fmt.Println(body.Blocks[0].OpenBraceRange.Start)

	body = Parse("", []byte("variable good {}\nvariable bad\n{}"))
	fmt.Println(body.Blocks[1].OpenBraceRange.Start)
}
@rcjsuen
Copy link
Author

rcjsuen commented Nov 6, 2024

Actually now that I look at it some more it seems like both ranges are wrong. And from debugging I see all the ranges are identical to each other. :P

hcl/hclsyntax/parser.go

Lines 362 to 374 in d20d07f

return &Block{
Type: blockType,
Labels: labels,
Body: &Body{
SrcRange: ident.Range,
EndRange: ident.Range,
},
TypeRange: ident.Range,
LabelRanges: labelRanges,
OpenBraceRange: ident.Range, // placeholder
CloseBraceRange: ident.Range, // placeholder
}, diags

@apparentlymart
Copy link
Contributor

When there's no brace at the end of the header, the parser performs recovery and then returns a very stubby fake Block to try to at least give the caller something to analyze to find the block type and labels that were already successfully parsed.

As far as the parser is concerned, this block has no braces at all (the brace is on the next line, but the parser only uses one token of lookahead so it's currently "looking at" the newline) and so it's using the ident range as a placeholder so that at least there's something to return. Since the variable identifier is at the start of the file, it seems like this range is what this code was intended to return.

It could perhaps instead have returned the position of the newline token as a way to say where the brace ought to have been -- tok.Range, as the code is currently written -- but really anything it reports here is going to be some sort of placeholder approximation because the block isn't valid enough to parse properly.

p.recoverAfterBodyItem should hopefully have left the scanner's cursor just after the closing brace, since that recovery logic works by trying to find a balanced number of bracket-type tokens, so CloseBraceRange could also then potentially be populated with p.PrevRange() to make a best effort to mark where the closing brace appears to be. Again that can't guarantee to be 100% reliable because the recovery logic is really just a pile of heuristics, but I think it would at least typically be something more useful than marking a zero-length span at the block's identifier.

Unlike with #702, because this is error recovery behavior rather than ranges reported by a successful parse I don't think it's reasonable to expect it to return something sensible in all cases and so I dunno whether it's actually worth trying to improve this, but above are my ideas on how to improve it if there's a strong enough reason to justify doing it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants