Implement an attention model that takes an image of a PDF math formula, and outputs the characters of the LaTeX source that generates the formula.
This is a TensorFlow implementation of the HarvardNLP paper "What You Get Is What You See: A Visual Markup Decompiler".
The model graphic is here:
An example input is a rendered LaTeX formula:
The goal is to infer the LaTeX formula that can render such an image:
d s _ { 1 1 } ^ { 2 } = d x ^ { + } d x ^ { - } + l _ { p } ^ { 9 } \frac { p _ { - } } { r ^ { 7 } } \delta ( x ^ { - } ) d x ^ { - } d x ^ { - } + d x _ { 1 } ^ { 2 } + \; \cdots \; + d x _ { 9 } ^ { 2 }
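The target above is a sequence of space-separated LaTeX tokens, so the decoder predicts one token per step. A minimal sketch of how such a line could be split into tokens and mapped to integer ids follows. The helper names `tokenize` and `build_vocab` are hypothetical and are not taken from this repo's scripts.

```python
# A shortened version of the formula above, stored as space-separated tokens.
formula = r"d s _ { 1 1 } ^ { 2 } = d x ^ { + } d x ^ { - }"

def tokenize(latex_line):
    """Split a normalized formula on whitespace into a list of tokens."""
    return latex_line.split()

def build_vocab(formulas):
    """Map each distinct token to an integer id (hypothetical helper)."""
    distinct = sorted({tok for f in formulas for tok in tokenize(f)})
    return {tok: idx for idx, tok in enumerate(distinct)}

tokens = tokenize(formula)
vocab = build_vocab([formula])
ids = [vocab[t] for t in tokens]

# Invert the mapping to turn predicted ids back into a LaTeX string.
inverse = {idx: tok for tok, idx in vocab.items()}
decoded = " ".join(inverse[i] for i in ids)
```

Joining the decoded tokens with spaces gives back the original line, which is the form a renderer such as pdflatex would consume.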
Most of the code is written in TensorFlow, with Python for preprocessing.
The preprocessing for this dataset exactly reproduces that of the original Torch implementation by the HarvardNLP group.
Python
- Pillow
- numpy
Optional: Node.js and KaTeX are used for preprocessing (Installation)
pdflatex Installation
pdflatex is used for rendering LaTeX during evaluation.
ImageMagick convert Installation
convert is used for rendering LaTeX during evaluation.
- linux
sudo apt install imagemagick
- linux setup webpage
- Mac
brew install imagemagick
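To illustrate how pdflatex and convert can work together during evaluation, here is a sketch that wraps a formula in a small LaTeX document and builds the two command lines. The template, file names, and the `render_commands` helper are assumptions for illustration only. The repo's evaluation scripts may use different settings. The commands are only constructed here, not executed.

```python
import os

# Hypothetical minimal document used to render a single formula.
TEMPLATE = r"""\documentclass[12pt]{article}
\pagestyle{empty}
\usepackage{amsmath}
\begin{document}
\begin{displaymath}
%s
\end{displaymath}
\end{document}
"""

def render_commands(formula, workdir, name="formula"):
    """Return the LaTeX source and the commands that would render it to PNG."""
    tex_path = os.path.join(workdir, name + ".tex")
    pdf_path = os.path.join(workdir, name + ".pdf")
    png_path = os.path.join(workdir, name + ".png")
    tex_source = TEMPLATE % formula
    commands = [
        # Compile the .tex file to PDF without stopping on errors.
        ["pdflatex", "-interaction=nonstopmode",
         "-output-directory", workdir, tex_path],
        # Rasterize the PDF to a PNG with ImageMagick.
        ["convert", "-density", "200", pdf_path, "-quality", "100", png_path],
    ]
    return tex_source, commands

tex_source, commands = render_commands(r"\frac { a } { b }", "out")
```

After writing `tex_source` to the `.tex` path, each command list could be passed to `subprocess.run` once both tools are installed.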
Webkit2png Installation
Webkit2png is used for rendering HTML during evaluation.
Code directory:
cd data
For more details, see the readme.md in this folder
Once the dataset is ready, save it in npy format:
train_buckets.npy, valid_buckets.npy, test_buckets.npy can be generated using the **build_imglatex_data.py** script
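As a rough illustration of the npy round trip, the sketch below saves and reloads a bucketed dataset with NumPy. The bucket layout (image size mapped to a list of image name and token id pairs) is an assumption made for this example. The actual structure is whatever **build_imglatex_data.py** writes.

```python
import os
import tempfile

import numpy as np

# Hypothetical layout: (width, height) -> list of (image file, token ids).
buckets = {
    (120, 50): [("a.png", [4, 7, 9]), ("b.png", [4, 2])],
    (200, 50): [("c.png", [1, 3, 3, 8])],
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "train_buckets.npy")
    # A dict is stored as a 0-d object array, which requires pickling.
    np.save(path, buckets, allow_pickle=True)
    # .item() unwraps the 0-d object array back into the dict.
    loaded = np.load(path, allow_pickle=True).item()

sizes = sorted(loaded.keys())
```

Grouping images of the same size into one bucket lets each batch be stacked into a single tensor without padding the images.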
python3 train_model.py
Default hyperparameters used:
- BATCH_SIZE = 32
- EMB_DIM = 80
- ENC_DIM = 256
- DEC_DIM = ENC_DIM*2
- D = 512 (channels in feature grid)
- V=len(vocab)+3 = (vocab size)+3
- NB_EPOCHS = 50
- H = 20 (Maximum height of feature grid)
- W = 50 (Maximum width of feature grid)
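The relationships between these values can be written out as a short sketch. The toy `vocab` list and the channels-last layout of the feature grid are assumptions for illustration. The README does not say what the three extra vocabulary entries are; they are presumably reserved for special tokens.

```python
# Toy vocabulary, for illustration only; the real one comes from the dataset.
vocab = ["\\frac", "{", "}", "^", "_", "x"]

BATCH_SIZE = 32
EMB_DIM = 80            # token embedding size
ENC_DIM = 256           # encoder hidden size
DEC_DIM = ENC_DIM * 2   # decoder hidden size
D = 512                 # channels in the feature grid
V = len(vocab) + 3      # vocabulary size plus three extra entries
NB_EPOCHS = 50
H = 20                  # maximum height of the feature grid
W = 50                  # maximum width of the feature grid

# Assumed channels-last shape of one batch of feature grids.
feature_grid_shape = (BATCH_SIZE, H, W, D)

# Upper bound on the number of grid cells the attention can look at.
max_positions = H * W
```

With these defaults the decoder state has 512 units, and the attention covers at most 1000 grid positions per image.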
The train NLL drops to 0.08 after 18 epochs of training on a 24GB Nvidia M40 GPU.
- python3 predict_to_img.py
attention.py scores the training and validation sets after each epoch, measuring mean train NLL and perplexity.
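The two metrics are closely related: perplexity is the exponential of the mean negative log-likelihood. The sketch below assumes the NLL is averaged per token and uses the natural logarithm. The function names are hypothetical and are not taken from attention.py.

```python
import math

def mean_nll(correct_token_probs):
    """Mean negative log-likelihood of the probabilities given to the true tokens."""
    total = -sum(math.log(p) for p in correct_token_probs)
    return total / len(correct_token_probs)

def perplexity(nll):
    """Perplexity is exp(mean NLL) when the NLL uses the natural log."""
    return math.exp(nll)

# A model that gives probability 0.5 to every correct token.
nll = mean_nll([0.5, 0.5, 0.5, 0.5])
ppl = perplexity(nll)
```

Under these assumptions, a mean NLL of 0.08 corresponds to a perplexity of about 1.08.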
- Printed style https://zenodo.org/record/56198#.XA4GjfYzZZj
- handwriting http://lstm.seas.harvard.edu/latex/data/
backup_predict_to_img.py
Test program for the network architecture from the original repository
- OpenAI's Requests for Research problem (source of the question)
- Seq2Seq for LaTeX generation
- Original model repo (network model in TensorFlow)
- Another model repo (network model in TensorFlow)
- Explanation on Zhihu
- Original dataset repo (dataset creation)