-
-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Returning progress when parsing fails #41
Comments
If we can come up with something that is decently ergonomic, then I wouldn't mind the change. All my libs will not have a 1.0 release before zig 1.0 so let's just break things while we can. |
I think the change would require moving out of pub fn Result(comptime T: type) type {
return struct {
pub const Value = union(enum) {
Ok: T,
Err: Error, // using the pub const Error
};
value: Value,
rest: []const u8 = "",
};
} But that would mean rewriting all combinators and their respective functions |
maybe it would be even better to support in what state the parser left, as there are some recovery strategies that can be implemented, to report as many errors as possible for example: char* str = "aaaaa
"; is invalid but most parsers would assume that the string was just improperly terminated, although the input is invalid, the parser would continue assuming that the string is valid: char* str = "aaaaa"
";"; erroring with unterminated string x2 and missing semicolon x2 or even some smarter error recovery strategy. Like having a special rule for multiline strings that fails as it is not valid in the C grammar, but continuing with parsing as the parser can still find more errors. Failing at the first error and not knowing what the error is: "ParserError", is not sufficient for recovery. If there could be either a "panic mode" (discard characters until a "safe character" is found, like }, end of a block) or fail with continuation, so the implementor could modify the state to get it into a safe state. For example if string literal fails, continue by inserting a ghost ", then continue with parsing. |
I have starting to test the "continuation" tactic. The pub fn Result(comptime T: type) type {
return struct {
pub const Value = union(enum) {
Ok: T,
InternalError: Error, // Parser failed irrecoverably
ParserError: Parser(T), // Parser failed at this state
};
value: Value,
rest: []const u8 = "",
};
} and for example the pub fn string(str: []const u8) Parser(void) {
return struct {
fn func(_: mem.Allocator, s: []const u8) Void {
const diff = mem.indexOfDiff(u8, s, str); // find the first characters where input differs
if (diff) |diff_index| { // if there is one
if (diff_index == str.len) // if the first difference is the ending of str, the parse was successful
return Void{ .value = .{ .Ok = {} }, .rest = s[str.len..] };
// else create a new string, that expects the rest of the string to be parsed
// for example: if s = "strang" and str = "string"
// => string("ing") and rest = "ang"
const resume_string = string(str[diff_index..]);
return Void{ .value = .{ .ParserError = resume_string }, .rest = s[diff_index..] };
} else return Void{ .value = .{ .Ok = {} }, .rest = s[str.len..] };
}
}.func;
} @Hejsil please let me know what you think! Does it seem interesting to you? Should I continue with the implementation?
|
Yes correct, this will not work. I'm also not sure I understand why we would want |
Okay maybe I haven't thought this through properly, sorry. I have been thinking about this for a few hours, and parser combinators aren't really suitable for what I had in mind. It would work for LL(1) parsers, but you cannot really inspect combinators like you can inspect Finite State Machines with explicit transition tables. The I'm really sorry that I have bothered you with this, I should've thought about it much longer before posting this. Maybe the original |
Yea, I think there is a usecase for the Ok/Err thing. Being able to see how far the parser got furthest can be quite useful for simple error reporting, but for things more complicated, I think that is outside this libraries scope. |
Okay, I'll try to get it done soon'ish and make a PR!
You are right, I think there could be some combinators for opt and error message, or combine opt with map, but otherwise the library is already featurefull enough. Once again I'm sorry for all the other stuff I suggested, it was dumb. |
There is a use case for this library, where the parser should report, where it failed. Since that would allow for some "graceful" failing. Instead of crashing, there is some merit in reporting where. It would suffice to report the position/rest, as everything else can be calculated if that happens, but now I think there is no way to report progress.
This would essentially mean moving the Error inside the ParserResult, which isn't ideal and is an architecture change. I understand if such modification is out of scope, and I need to implement it myself.
The text was updated successfully, but these errors were encountered: