Diagnostics are too chatty #442

davidanthoff · 2024-06-26T18:05:35Z

I've started implementing a linter that will eventually power the LS. For now all it does is use the diagnostic messages from JuliaSyntax. https://github.com/davidanthoff/StringBuilders.jl/pull/96/files is an example. There is one syntax error, a missing ) on line 14 and the error message for that is nice.

But then all the other error messages after it I feel ideally would not appear at all. In my mind once the parser recognizes that the error on line 14, it should continue parsing pretending that the error was fixed and then not report all these knock-off errors further down in the file.

This is actually probably a reason to not surface the diagnostics in the VS Code extension yet. CSTParser doesn't have a nice error message, but it just stops processing the file after this first error, which at the end of the day is probably a better user experience because the error list is not populated by lots of false-positive like things.

FYI @pfitzseb

The text was updated successfully, but these errors were encountered:

KristofferC · 2024-06-26T19:32:34Z

You can chose what to propagate, or? Analogous to #344

davidanthoff · 2024-06-26T21:43:38Z

That is true, we can easily go back to the CSTParser situation by just reporting the first error. Having said that, it would be nice if the error recovery here would be better :)

davidanthoff · 2024-06-29T13:48:18Z

dotnet/roslyn#16151 (comment) is interesting. I would imagine that in the case I linked to, we detect that a closing ) is missing, and then bubble up one level and continue parsing normally, as if that ) was there, right?

davidanthoff · 2024-07-17T22:26:20Z

I started to surface JuliaSyntax.jl errors and warnings in the VS Code extension now (and also in a new GitHub lint action). And I'd really like to not just show the first diagnostics. If there are multiple errors in a file, it is really useful to show them all, so I think for now I'm going to err on the side of showing too many errors, hoping that we can gradually improve the error recovery here in JuliaSyntax.

c42f · 2024-07-18T01:21:37Z

Yeah, we don't recover well from this kind of thing. Hence #344 :-(

Recovery is a whole thing I haven't worked on. Partly because I don't believe it's possible to do it well with a big hand-written pile of heuristics. At least, not without a heroic amount of engineering effort.

I'd like to do it in a data-driven way instead because I think that's a good long term solution. ("Data driven" == a user-contributed database of common errors and their fixes along with human-readable explanations. Then use that in conjunction with a pluggable recovery system - initially perhaps a hand-written pattern matcher, but I think a learned model is the ultimate solution to this.)

davidanthoff · 2024-07-18T21:19:21Z

Take everything that follows with a grain of salt because I'm still trying to understand JuliaSyntax and might mix things up :) But here we go.

If I understand what is happening in this case of a missing closing parenthesis for a function, then

JuliaSyntax.jl/src/parser.jl

Line 144 in f99b76f

function bump_closing_token(ps, closing_kind, alternative_closer_hint=nothing)

handles that. And the logic seems to be that it if there isn't a ) where it expects that, then it just keeps looking for the next ) and marks everything in between as an error node. Is that right?

If that is so, then the issue here doesn't seem to be a pattern matching system that sits on top of the nodes, but instead I think the actual nodes that are generated would have to look differently and ideally follow the strategy that is described in the C#/Roslyn issue I linked to above. If they don't find a closing parenthesis, the first thing they try is see whether the next token can be successfully parsed by the "context" above the current one, and if so continue in that way. So here the context that is missing the ) is the function arguments, and the context above would be the statements in the function body, and so Roslyn would just check whether the next thing that isn't the closing ) is a valid statement (presumably), and if so, contineu with that.

I'm not sure how that would be implemented here, though, as I looks to me as if inside bump_closing_token there is no info what the next higher context would be, so it seems hard to test whether the next token would be valid there, right?

c42f · 2024-07-19T03:22:44Z

And the logic seems to be that it if there isn't a ) where it expects that, then it just keeps looking for the next ) and marks everything in between as an error node. Is that right?

Yes, it's pretty much the simplest possible recovery algorithm. It's not good 😆

I'm not sure how that would be implemented here, though, as I looks to me as if inside bump_closing_token there is no info what the next higher context would be, so it seems hard to test whether the next token would be valid there, right?

Right. My understanding is that we'd have to manually push onto a stack of parent contexts as we recursively call the parsing functions. Essentially we're keeping a "shadow stack" which mirrors the actual program stack but is accessible at the source level.

It's not the worst thing ever and maybe could even be automated with a macro to wrap a push/pop around every parse_* function call body. But it's also not lovely. IIRC the rust-analyzer parser also does this kind of thing. And it seems like a reasonable amount of engineering effort to get something a lot better than what we have now.

I still don't think that kind of heuristic can ever be "really great", but if it's fairly easy it could be a worthwhile win.

c42f · 2024-07-19T03:29:29Z

dotnet/roslyn#16151 (comment) is interesting. I would imagine that in the case I linked to, we detect that a closing ) is missing, and then bubble up one level and continue parsing normally, as if that ) was there, right?

Btw if someone has the energy to make an experimental prototype of this and demonstrate what it could give us - and what the cost is in code complexity - that would be amazing. Doesn't have to be fancy - just a demo to give a taste.

I'm really busy right now so I can't write the code but I could look at a prototype.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Diagnostics are too chatty #442

Diagnostics are too chatty #442

davidanthoff commented Jun 26, 2024 •

edited

Loading

KristofferC commented Jun 26, 2024

davidanthoff commented Jun 26, 2024

davidanthoff commented Jun 29, 2024

davidanthoff commented Jul 17, 2024

c42f commented Jul 18, 2024 •

edited

Loading

davidanthoff commented Jul 18, 2024

c42f commented Jul 19, 2024

c42f commented Jul 19, 2024 •

edited

Loading

Diagnostics are too chatty #442

Diagnostics are too chatty #442

Comments

davidanthoff commented Jun 26, 2024 • edited Loading

KristofferC commented Jun 26, 2024

davidanthoff commented Jun 26, 2024

davidanthoff commented Jun 29, 2024

davidanthoff commented Jul 17, 2024

c42f commented Jul 18, 2024 • edited Loading

davidanthoff commented Jul 18, 2024

c42f commented Jul 19, 2024

c42f commented Jul 19, 2024 • edited Loading

davidanthoff commented Jun 26, 2024 •

edited

Loading

c42f commented Jul 18, 2024 •

edited

Loading

c42f commented Jul 19, 2024 •

edited

Loading