What do we want error handling to finally look like? #616
-
I've so far identified three distinct major paradigms of error handling when writing a parser.
These three do not cover the whole question of how we want error handling to finally look; illustrating them with BTest examples of what seems intuitively correct should move this discussion forward. But &try is a huge advance and very welcome. That alone makes it timely to at least make sure we have the syntactic support we want for the three paradigms above.
-
There's also the related question of whether we want lower-level exception primitives (throw/catch-style), even independent of parsing errors.
-
What is the intended supported syntax when &try and foreach both follow a member field that is a vector? Can the two appear in either order? And is the logical fallback position for that single &try repositioned with each iteration?
-
Yes, the order of attributes doesn't matter. The fallback position will be the following field (i.e., subsequent to the container). It might in principle work to associate the
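To make the container case concrete, here is a minimal Python model (not Spicy, and not the actual implementation) of a `&try` field that is a vector. It assumes that a failing element abandons the whole container and that the input is rewound to where the container started before parsing continues with the following field. The names `Backtrack`, `parse_vector_with_try`, and `parse_unit` are made up for illustration.

```python
class Backtrack(Exception):
    """Raised by an element parser to request backtracking (models backtrack())."""


def parse_uint8_below(data, pos, limit):
    """Parse one byte; request backtracking if it is missing or not below `limit`."""
    if pos >= len(data):
        raise Backtrack("out of input")
    value = data[pos]
    if value >= limit:
        raise Backtrack(f"unexpected value {value}")
    return value, pos + 1


def parse_vector_with_try(data, pos, count, limit):
    """Model a container field carrying `&try`.

    ASSUMPTION: if any element backtracks, the whole container is abandoned
    and the position is rewound to where the container started.
    """
    start = pos
    elements = []
    try:
        for _ in range(count):
            value, pos = parse_uint8_below(data, pos, limit)
            elements.append(value)
    except Backtrack:
        return None, start
    return elements, pos


def parse_unit(data):
    """Parse the `&try` container, then the following field (here: all remaining bytes)."""
    items, pos = parse_vector_with_try(data, 0, count=3, limit=10)
    rest = data[pos:]  # the field subsequent to the container
    return {"items": items, "rest": rest}
```

With input `bytes([1, 200, 3, 99])` the second element triggers backtracking, so `items` stays unset and the following field sees the input from the container's start position, not from the failing element.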
-
Something like this syntax:
-
The current &try/backtrack isn't quite the right mechanism for (3) IMO, or at least I think it should not be our primary mechanism there, because I think we can do better (in the sense of: more automated, more powerful). I was trying to keep the discussion of recovery separate, but I'm realizing it's all pretty closely related, so that probably isn't very helpful. So, we definitely want to support what your example shows: if one element of an array fails to parse, skip ahead to the next one and continue. The old prototype supported that differently; the best summary is in the Spicy paper, see Section 2.6 here: http://www.icir.org/robin/papers/acsac16-spicy.pdf. The old tests show more on how to use this: https://github.com/rsmmr/hilti/tree/master/tests/spicy/synchronize.
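As a rough illustration of "skip ahead to the next element and continue", here is a small Python model. It is not the prototype's synchronization mechanism; it only assumes a hypothetical record format (a `REC` magic, one length byte, then the payload) in which the magic doubles as the point to resynchronize on. All names are invented for this sketch.

```python
MAGIC = b"REC"  # hypothetical record marker, used as the synchronization point


class ParseError(Exception):
    """Models a parse error raised while parsing one element."""


def parse_record(data, pos):
    """Parse one record at `pos`; return (payload, position after the record)."""
    if data[pos:pos + len(MAGIC)] != MAGIC:
        raise ParseError(f"missing magic at offset {pos}")
    len_pos = pos + len(MAGIC)
    if len_pos >= len(data):
        raise ParseError("truncated length")
    start = len_pos + 1
    end = start + data[len_pos]
    if end > len(data):
        raise ParseError("truncated payload")
    return data[start:end], end


def parse_all(data):
    """Parse records; after a failure, resume at the next MAGIC in the input."""
    records, errors = [], []
    pos = 0
    while pos < len(data):
        try:
            payload, pos = parse_record(data, pos)
            records.append(payload)
        except ParseError as exc:
            errors.append(str(exc))
            nxt = data.find(MAGIC, pos + 1)  # search beyond the failure point
            if nxt == -1:
                break  # nothing left to synchronize on
            pos = nxt
    return records, errors
```

Parsing `b"REC\x02hi" + b"xx" + b"REC\x03abc"` yields both payloads plus one recorded error for the garbage in between. Note that this naive search would also match a `REC` that happens to occur inside a payload, which is one reason the choice of synchronization points deserves care.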
-
Sorry for joining this discussion so late. Since my response became pretty long, here's a TL;DR:
I wonder whether, if we keep the premise that backtracking is orthogonal to error handling, we might be able to make the latter more powerful by extending the current hooks and allowing parsers to raise errors. E.g., given a unit
type X = unit {
a: uint8;
b: uint8[] &size=2;
c: F[10]; // Unit (or with #604 field) type `F`.
on %done {};
on %error {}
};
at least the following seem hard:
Triggering errors

For issue (1) we would need a construct to raise errors from scripts. Currently, the runtime can trigger parse errors and logic errors, and I do not think distinguishing them from an error recovery perspective is too useful (e.g., writing error recovery for integer overflows in addition to parse errors seems like a reasonable ask). Depending on which additional tools we'd provide to inspect error state from error handlers, we might not need that distinction on the script-land triggering side either, and just raising a single error type might be good enough. So let's say we'd introduce a

Handling errors

The current error handling seems to be not granular enough. We allow

Interestingly, Spicy already provides other hook-like constructs, e.g., while-else,

local i: uint8 = 0;
while (True) {
++i;
break;
} else {
print(i);
}
or unit switch-if,
type X = unit {
a: uint8;
switch (self.a) {
0 -> b1: uint8[1],
1 -> b1: uint8[2],
} if (self.a < 2);
};
While-else acts like a special case of a

It seems these examples would also work if we were to allow hooks on other constructs, e.g., unit fields, blocks, and functions. With such hooks, the examples above could become:

// While-else.
local i: uint8 = 0;
while (True) {
++i;
break;
} on %done {
print(i);
}
// or even:
while (True) {
++i;
break;
} on %init {
local i: uint8 = 0;
} on %done {
print(i);
}
// Unit-switch. Two approaches with different semantics.
// Enforce a global invariant on `X.a`.
type X = unit {
a: uint8 on %done { if ($$ >= 2) raise("impossible value %d for 'a'" % $$); }
// Even today with less explicit syntax:
//
// a: uint8 { if ($$ >= 2) ... };
switch (self.a) {
0 -> b1: uint8[1],
1 -> b1: uint8[2],
}
};
// Conditionally enable unit switch block.
type X = unit {
a: uint8;
switch (self.parse_b) {
0 -> b1: uint8[1],
1 -> b1: uint8[2],
} on %init {
if (self.a >= 2)
continue; // IMAGINARY: Skip preceding block.
}
};
// Even C++-style function-try-block.
function foo(x: uint8) : uint8 {
return x*x;
} on %error {
print("failure to foo x: %s" % $error); // See below for `$error`.
return 0; // Required return value for finalizing hooks like `%done` and `%error`.
// Alternatively: raising error requires no return.
}
While we would only need field-level

Accessing error state

With these imaginary extensions we could rewrite the original example,

type F;
type X = unit {
a: uint8 on %done {
if ($$ % 2) raise("'a' must be even");
}
b: uint8[] &size=2;
c: F[10] on %error {
// IMAGINARY SYNTAX `$error`.
print("recovering after failure to parse entry %d of 'c': %s" % (|$$|, $error));
// Prevent propagation of error.
};
on %done {
local sum: uint64 = 0;
for (b in self.b) { sum += b; }
if (self.a != sum) { raise("sum invariant violated: %d vs %d" % (self.a, sum)); }
};
on %error {
// Not strictly needed anymore, but usable to e.g., decorate upstream errors.
// IMAGINARY SYNTAX `$error`.
raise("failure to parse 'X': %s" % $error);
}
};
We seem to be able to validate preconditions and handle errors locally, and trigger errors from scripts. What is still missing is a way to access the current error in an error handler, for which I used the magic imaginary accessor $error in the examples above.

Relation to synchronization

Issue #23 is about adding synchronization from the original prototype. There this was achieved by adding unit-scope attributes to indicate what content to skip, e.g., from the original paper
This works, but I wonder whether allowing these constructs at block scope in addition to unit scope would make the feature more powerful, e.g.,
I could imagine that block-scope synchronization primitives would also be useful in, e.g., the imaginary field hooks I mentioned above. It seems possible, for instance, to implement some of the features of backtracking (#606) with that.
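To check that the proposed hook semantics hang together, here is a small Python model of the rewritten `X` example. It is only a sketch under assumptions: a field-level error handler swallows the error and keeps whatever was parsed so far, while the unit-level handler decorates and re-raises. `ParseError`, `parse_uint8`, and `parse_x` are invented names, and `c` is simplified to three `uint8` elements.

```python
class ParseError(Exception):
    """Models both runtime parse errors and errors raised from script code."""


def parse_uint8(data, pos):
    """Parse one byte at `pos`; return (value, new position)."""
    if pos >= len(data):
        raise ParseError(f"need 1 byte at offset {pos}")
    return data[pos], pos + 1


def parse_x(data):
    """Model unit X with a field-level and a unit-level error handler."""
    log = []
    try:
        # a: uint8 on %done { if ($$ % 2) raise(...); }
        a, pos = parse_uint8(data, 0)
        if a % 2:
            raise ParseError("'a' must be even")

        # c: uint8[3] on %error { ... }  (ASSUMPTION: handler stops propagation)
        c = []
        try:
            for _ in range(3):
                value, pos = parse_uint8(data, pos)
                c.append(value)
        except ParseError as exc:
            log.append(
                f"recovering after failure to parse entry {len(c)} of 'c': {exc}"
            )
        return {"a": a, "c": c}, log
    except ParseError as exc:
        # Unit-level on %error: decorate the upstream error and re-raise.
        raise ParseError(f"failure to parse 'X': {exc}") from exc
```

Here a short `c` is handled locally and parsing of `X` still succeeds, whereas an odd `a` has no local handler and surfaces as a decorated unit-level error.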
-
We have a PR open that will add manual backtracking through &try/backtrack(). That's a straight port of a feature from the old Spicy research prototype, but it's certainly not the last word on error handling. Curious to hear thoughts on use cases we should support, and proposals for syntax to do so.

(Note that there's also a separate topic of automatic error recovery. I'm planning to eventually port the approach of the research prototype over for that as well, which performs automatic re-synchronization at some later point in the input stream where we can continue parsing after an earlier error; this is tracked by #23.)