-
Notifications
You must be signed in to change notification settings - Fork 5
/
README
221 lines (175 loc) · 8.58 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
Demystify
A Magic: The Gathering Parser
1) ABOUT
Demystify is an attempt to make it possible for a computer to understand what
general Magic: The Gathering cards do. It is currently written in a combination
of ANTLR 3.5 and Python 3.2.
2) INSTALLATION
Demystify depends on the following libraries and/or programs, which you will
need to have installed in order to build the parser and run Demystify.
See INSTALL for more specific instructions.
- Java 1.6.0 (or later)
- ANTLR 3.5 (or later 3.5 release)
- ANTLR 3 Python3 runtime [available at github.com/antlr/antlr3]
- jpackage-utils 1.7.5 (or later)
- Python 3.2
- python-progressbar2
3) BUILDING
To build the parser generator, you need to have the macro.g and Words.g grammar
files up-to-date. If they don't exist, or demystify/keywords.py has changed,
run
$ python3 demystify/keywords.py
to regenerate them.
Then, you need to run antlr3 on the grammar. If you've followed the instructions
in INSTALL and gotten yourself an antlr3 script, all you need to do is:
$ cd demystify/grammar/
$ antlr3 Demystify.g
If not, your commandline will look something like this (supposing your classpath
is properly set):
$ cd demystify/grammar/
$ java org.antlr.Tool Demystify.g
(Note that cd instruction; if you "antlr3 demystify/grammar/Demystify.g"
instead, you'll get the .py output files where they belong, but the .tokens
files are put in your working directory.)
This takes a little while.
If it worked (and it worked if all the output you received were of the form
"warning(138): ... no start rule ...", and not "error(12345)" etc.),
demystify/grammar/ should now contain a series of Demystify*.py
files. These will be used by the main demystify.py script.
If it doesn't work, and you're sure you installed antlr3 and java correctly,
check that you followed instructions in INSTALL again. If antlr3 is actually
giving grammar errors, please check the Issues tab in github for the issue
or file a new bug.
4) RUNNING
The main entry point is the demystify.py script in the demystify/ folder.
It currently offers two modes of running:
load
Loads the card data from the Scryfall data file in demystify/data/cache/
(may download it from Scryfall if necessary), and performs preprocessing steps
necessary before any lexing and parsing is done. With -i, opens an interactive
prompt afterward, at which functions in demystify.py and card.py can be called.
This is useful for actually invoking the parser, as well as doing special card
searches using the utility functions in card.py.
test
Runs the tests (or a specific test) in demystify/tests/.
These can be run from within demystify with:
$ python3 demystify.py load -i
Add -h or --help for more information:
$ python3 demystify.py -h
$ python3 demystify.py test -h
5) LEXING AND PARSING
The intention of this project is to eventually generate full parse trees for
every line of text on every card. However, as this is a low priority personal
project, Demystify is far from this goal. At the moment, all that you can do is
test the lexer or parser.
First, load the environment.
$ python3 demystify.py load -i
Processing cards for card names...
Innocent Blood [##########################] 14715 of 14715 Time: 0:00:01
>>> cards = card.get_cards()
get_cards() returns all the cards loaded. If you like, you can select a smaller
set of cards via card.get_card_set(), perform a search (note that searches
return matching text rather than matching cards), or get an individual card.
>>> karn = card.get_card('Karn, Silver Golem')
The lexer is essentially done (modulo any new words or symbols in sets not yet
added to the project), but it can still be tested, with test_lex or lex_card.
lex_card will print the preprocessed text followed by a table of lex symbols.
(You can also get at the preprocessed text by simply printing the card object.)
>>> lex_card(karn)
whenever SELF blocks or becomes blocked, it gets -4/+4 until end of turn.
{1}: target non-creature artifact becomes an artifact creature with...
1 0 0 whenever WHEN
1 9 2 SELF SELF
1 14 4 blocks BLOCK
1 21 6 or OR
1 24 8 becomes BECOME
1 32 10 blocked BLOCKED
1 39 11 , COMMA
...
1 72 30 . PERIOD
2 0 32 1 MANA_SYM
2 3 33 : COLON
...
Note that test_lex is very slow, but very effective at finding new vocabulary.
The parse_all function is meant to eventually parse rules text, but at the
moment it parses only mana cost and type lines. It takes a list of cards
to parse, and adds the parse tree to each card individually.
>>> parse_all(cards)
Innocent Blood [##########################] 14715 of 14715 Time: 0:00:25
0 total errors.
>>> karn.parsed_cost
<antlr3.tree.CommonTree instance (COST (MANA 5))>
>>> karn.parsed_typeline
<antlr3.tree.CommonTree instance (TYPELINE (SUPERTYPES legendary)
(TYPES artifact creature) (SUBTYPES golem))>
The other parse functions use a regex to narrow down the text they attempt
to parse. You may still pass all the cards.
>>> parse_triggers(cards)
Whippoorwill [############################] 3702 of 3702 Time: 0:00:02
181 total errors.
68 unique cases missing.
>>> parse_keyword_lines(cards)
Wasp Lancer [############################] 5085 of 5085 Time: 0:00:02
1 total errors.
1 unique cases missing.
>>> parse_ability_costs(cards)
Helm of Obedienc [############################] 4369 of 4369 Time: 0:00:02
2 total errors.
2 unique cases missing.
Each parse function reports how many times it couldn't parse a line of text,
and (attempts to) group them together into cases. These are all logged to the
LOG file, which is useful for development.
The parse results themselves are again attached to the cards, but as lists
of strings rather than antlr tree objects.
>>> karn.parsed_costs
['(COST (MANA 1))']
>>> karn.parsed_triggers
['(TRIGGER (EVENT (SUBSET SELF) (OR (BECOME BLOCKING) (BECOME blocked))))']
>>> akroma = card.get_card('Akroma, Angel of Wrath')
>>> akroma.parsed_keywords
['(KEYWORDS flying FIRST_STRIKE vigilance trample haste
(protection (PROPERTIES black) (PROPERTIES red)))']
You can test individual rules with arbitrary text by calling test_parse, or by
creating a parsing unittest in test/.
6) MISCELLANEOUS
Dependency visualization:
The deps/deps.py script reads the grammar files and outputs to stdout a
dot format graph file, which describes how parser rules in the grammar call
other parser rules, as well as how the grammar files reference other grammar
files. You'll want to use graphviz's dot program to visualize this, e.g. with:
$ python3 deps/deps.py | dot -Tpng -o gdeps.png
graphviz can be installed via your package management tool, or downloaded from
http://www.graphviz.org.
7) CONTRIBUTIONS
Are welcome! Please use github forks and pull requests. Bugs without fixes can
be reported as well.
http://github.com/Zannick/demystify/
8) LICENSE AND DISCLAIMER
The files in this repository fall into four categories:
A) antlr3/antlr3
B) Python code in demystify/ and ANTLRv3 code in demystify/grammar/
C) deps/deps.py
D) Card data in demystify/data/ and demystify/tests/
(A) is based on the antlr3 script that was installed on one of my machines
when I installed ANTLRv3 via yum. It is therefore (to the best of my
knowledge) covered until ANTLRv3's license (BSD),
which is included as antlr3/LICENSE.
(B) is the Demystify program itself, and the core of this repository. It
is licensed under the Lesser GNU Public License version 3. See COPYING
for the GPL v3 and COPYING.LESSER for the LGPL v3.
(C) is also licensed under the LGPL v3 but it is not a part of Demystify
(though it runs by reading parts of Demystify) which is why I've set it apart.
(D) consists of Magic: The Gathering card data, in the form of full card
records, slightly modified to fix errors (data/), or in the form of test cases,
which may include actual card rules or possible card rules. All card
information is copyrighted by Wizards of the Coast.
Demystify is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
Demystify is not produced, endorsed, supported,
or affiliated with Wizards of the Coast.
9) THANKS
Thanks go to Terence Parr and the ANTLR project (without which this project
would be much more difficult), and to Wizards of the Coast (for making
Magic: The Gathering, without which this project would not exist).