Add debug support and documentation for internal graph output
Showing 2 changed files with 86 additions and 0 deletions.
@@ -0,0 +1,85 @@
# Debugging

Use this section to gain deeper insight into your code's behavior.

## Visualizing Logos Graph

Logos works by building a graph derived from the tokens you define. This graph describes how the lexer moves between states as it processes input.

Visualizing this graph can therefore help during debugging, because it shows how Logos will match the various tokens.

Take this example:
```rust,no_run,noplayground
use logos::Logos;

#[derive(Debug, Logos, PartialEq)]
enum Token {
    // Tokens can be literal strings, of any length.
    #[token("fast")]
    Fast,

    #[token(".")]
    Period,

    // Or regular expressions.
    #[regex("[a-zA-Z]+")]
    Text,
}

fn main() {
    let input = "Create ridiculously fast Lexers.";
    let mut lexer = Token::lexer(input);
    while let Some(token) = lexer.next() {
        println!("{:?}", token);
    }
}
```

Logos actually constructs a graph that contains the logic for matching tokens:
```
graph = {
    1: ::Fast,
    2: ::Period,
    3: ::Text,
    4: {
        [A-Z] ⇒ 4,
        [a-z] ⇒ 4,
        _ ⇒ 3,
    },
    7: [
        ast ⇒ 8,
        _ ⇒ 4*,
    ],
    8: {
        [A-Z] ⇒ 4,
        [a-z] ⇒ 4,
        _ ⇒ 1,
    },
    9: {
        . ⇒ 2,
        [A-Z] ⇒ 4,
        [a-e] ⇒ 4,
        f ⇒ 7,
        [g-z] ⇒ 4,
    },
}
```
This graph shows how our patterns are matched, which can help us track down why a token is not matched the way we expect.

Let's start by working out how Logos matches the `.` character, which we've tokenized as `Token::Period`.

We can begin our search at number `9`, looking for the character `.`. If Logos matches a `.`, it jumps (`⇒`) to number `2`. Following that, `2` resolves to our `::Period` token.

Logos will then continue to look for any matches past our `.` character, in case there is a potential continuation after it. In the _input_ we provided, however, there are no additional characters, since `.` is the end of our input.

We can also trace how the token `fast` is matched. Starting at `9` again, `f` causes Logos to jump to `7`. Number `7` then matches the remaining letters of the word _fast_, `ast`, and jumps to `8`. Number `8` checks for a potential continuation (`[A-Z] ⇒ 4` or `[a-z] ⇒ 4`, which would lead to `::Text` instead). Since our _input_ does not include alphabetic characters after the word "fast", but rather a whitespace, the `_ ⇒ 1` branch is taken and the token `::Fast` is recognized.

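The two walkthroughs above can be sketched as a small hand-written state machine. This is only an illustration built on the standard library: `match_one` and the redeclared `Token` enum are hypothetical names, not code that Logos generates. The handling of `_ ⇒ 4*` (continue at `4` on any mismatch while matching `ast`) is an assumption about what the `*` means.

```rust
/// The tokens from the example, redeclared without the Logos derive
/// so that this sketch compiles with the standard library only.
#[derive(Debug, PartialEq)]
enum Token {
    Fast,
    Period,
    Text,
}

/// Hypothetical helper: walks the graph shown above by hand.
/// Returns the matched token and the number of bytes consumed,
/// or `None` if nothing matches at the start of `input`.
fn match_one(input: &[u8]) -> Option<(Token, usize)> {
    let mut pos = 0;
    let mut node = 9; // the walkthrough starts at number 9
    loop {
        let byte = input.get(pos).copied();
        match node {
            9 => match byte {
                Some(b'.') => { pos += 1; node = 2; }
                Some(b'f') => { pos += 1; node = 7; }
                // [A-Z], [a-e] and [g-z] all lead to 4
                Some(b) if b.is_ascii_alphabetic() => { pos += 1; node = 4; }
                _ => return None,
            },
            7 => {
                if input[pos..].starts_with(b"ast") {
                    pos += 3;
                    node = 8;
                } else {
                    // Assumed meaning of `_ ⇒ 4*`: on a mismatch, continue at 4
                    node = 4;
                }
            }
            8 => match byte {
                // A letter after "fast" means this is ordinary text
                Some(b) if b.is_ascii_alphabetic() => { pos += 1; node = 4; }
                _ => node = 1,
            },
            4 => match byte {
                Some(b) if b.is_ascii_alphabetic() => { pos += 1; }
                _ => node = 3,
            },
            1 => return Some((Token::Fast, pos)),
            2 => return Some((Token::Period, pos)),
            3 => return Some((Token::Text, pos)),
            _ => unreachable!("no such number in the graph"),
        }
    }
}
```

Under these assumptions, `match_one(b"fast lexers")` follows `9 ⇒ 7 ⇒ 8 ⇒ 1` and yields `Token::Fast` after 4 bytes, while `match_one(b"faster")` continues from `8` into `4` and ends at `::Text`.
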
### Enabling

To enable this debugging output, use the `debug` feature.

In your `Cargo.toml`, add it to the `logos` dependency:
```
[dependencies]
logos = { version = "1.2.3", features = ["debug"] }
```