Unify CSS parser and lexer #432

Open
davesnx opened this issue Mar 7, 2024 · 3 comments
Labels
parser Parsing or lexing issues

Comments

@davesnx
Owner

davesnx commented Mar 7, 2024

css_parser.mly parses stylesheets, selectors, declarations, properties, and values.

Parser.re currently parses (and type-checks) properties and values. It also contains code that might make it possible to parse stylesheets, selectors, and everything else, but that code isn't used.

Parser.re uses the value.rec ppx, which generates the parser functions for each property, while css_parser.mly uses menhir. I'm inclined toward either a hand-written parser or pushing for menhir's Incremental API so that it can call the parsers generated by value.rec, but it remains to be seen whether that's a good idea.

The driver calls one or the other depending on the stage of parsing. To succeed with #429, we need more control over the parsing phase.

Resources

Mix the lexer and parser like https://github.com/mnxn/eon, and take a look at how Sedlexing is used here: https://github.com/FStarLang/FStar/pull/2203/files

@davesnx davesnx added the parser Parsing or lexing issues label Mar 7, 2024
@davesnx davesnx changed the title from "Unify CSS parsers" to "Unify CSS parser and lexer" Mar 13, 2024
@davesnx
Owner Author

davesnx commented Mar 13, 2024

Currently we use 2 lexers that live inside Css_lexer, and run separate test suites: Tokenizer_test and css_lexer_test.

We need to unify those: expose a single API to tokenize a string of CSS, report errors as an exception (Css_lexer.Error), since menhir needs that, and run a single test suite.
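To make the target shape concrete, here is a minimal sketch of a single tokenizing entry point that reports failures through one exception. Everything in it is an assumption made for illustration: the module name `Lexer_sketch`, the tiny token set, the `loc` record, and the hand-written scanning loop are not the project's actual `Css_lexer` API.

```ocaml
(* Hypothetical sketch: module name, token set and loc record are
   assumptions for illustration, not the project's Css_lexer API. *)
module Lexer_sketch = struct
  type token =
    | Ident of string
    | Number of string
    | Colon
    | Semicolon
    | Whitespace

  (* Start (inclusive) and end (exclusive) byte offsets in the input. *)
  type loc = { start_pos : int; end_pos : int }

  (* The single error channel: every lexing failure raises this. *)
  exception Error of loc * string

  let is_ident_char c =
    match c with 'a' .. 'z' | 'A' .. 'Z' | '-' | '_' -> true | _ -> false

  let is_digit c = match c with '0' .. '9' -> true | _ -> false
  let is_space c = c = ' ' || c = '\n' || c = '\t'

  (* Advance from [i] while [pred] holds; return the first index where it fails. *)
  let rec scan pred input i =
    if i < String.length input && pred input.[i] then scan pred input (i + 1)
    else i

  (* The one public entry point: a string in, a located token list out. *)
  let tokenize (input : string) : (token * loc) list =
    let len = String.length input in
    let rec go i acc =
      if i >= len then List.rev acc
      else
        let c = input.[i] in
        let emit tok j = go j ((tok, { start_pos = i; end_pos = j }) :: acc) in
        if c = ':' then emit Colon (i + 1)
        else if c = ';' then emit Semicolon (i + 1)
        else if is_space c then emit Whitespace (scan is_space input i)
        else if is_digit c then
          let j = scan is_digit input i in
          emit (Number (String.sub input i (j - i))) j
        else if is_ident_char c then
          let j = scan is_ident_char input i in
          emit (Ident (String.sub input i (j - i))) j
        else
          raise
            (Error
               ( { start_pos = i; end_pos = i + 1 },
                 Printf.sprintf "unexpected character %C" c ))
    in
    go 0 []
end
```

With one entry point and one exception, a single test suite can cover both the success path (compare the token list) and the failure path (match on `Error` and check the reported location).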

This was referenced Mar 13, 2024
@davesnx
Owner Author

davesnx commented Jul 5, 2024

After the unification of lexers, we have only one. We still have 2 separate parsers with 2 separate techniques (menhir and rules with combinators and the ppx).

It would be nice to join them into the same package. There are a few benefits to having them together:

  • No more source_of_loc
  • Reuse locations, errors and machinery

@davesnx
Owner Author

davesnx commented Jul 15, 2024

There's a bit of work left to unify the lexers, namely the API: we have from_string and tokenize, where one returns a recursive token structure and the other a list of tokens. Both contain locations, so it's a matter of using only one of them.
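As a rough illustration of why keeping a single representation is feasible, the sketch below derives a recursive structure from a flat, located token list and flattens it back. The types `flat` and `nested`, the functions `nest` and `flatten`, and the brace-only nesting are all hypothetical; the real return types of from_string and tokenize may look quite different.

```ocaml
(* Hypothetical shapes: the real from_string/tokenize types may differ. *)
type loc = { start_pos : int; end_pos : int }

(* Flat form: one token after another. *)
type flat =
  | Word of string
  | Open_brace
  | Close_brace

(* Recursive form: a block owns the tokens between its braces. *)
type nested =
  | Leaf of string * loc
  | Block of nested list * loc

exception Unbalanced of loc

(* Build the recursive form from the flat list. *)
let nest (tokens : (flat * loc) list) : nested list =
  (* Collect items until a closing brace or the end of input. Returns the
     items, the location of the closing brace (if any), and the remaining
     tokens. *)
  let rec items acc = function
    | [] -> (List.rev acc, None, [])
    | (Close_brace, l) :: rest -> (List.rev acc, Some l, rest)
    | (Word w, l) :: rest -> items (Leaf (w, l) :: acc) rest
    | (Open_brace, l) :: rest -> (
      match items [] rest with
      | children, Some close, rest' ->
        let span = { start_pos = l.start_pos; end_pos = close.end_pos } in
        items (Block (children, span) :: acc) rest'
      | _, None, _ -> raise (Unbalanced l))
  in
  match items [] tokens with
  | result, None, _ -> result
  | _, Some l, _ -> raise (Unbalanced l)

(* Recover the flat list; assumes each brace is exactly one byte wide. *)
let rec flatten (trees : nested list) : (flat * loc) list =
  List.concat_map
    (function
      | Leaf (w, l) -> [ (Word w, l) ]
      | Block (children, l) ->
        [ (Open_brace, { start_pos = l.start_pos; end_pos = l.start_pos + 1 }) ]
        @ flatten children
        @ [ (Close_brace, { start_pos = l.end_pos - 1; end_pos = l.end_pos }) ])
    trees
```

Since both forms carry locations, either one can be derived from the other under these assumptions, so the lexer only needs to expose one of them and callers that want the other shape can convert.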
