See the README for the LSTM sequence model:
https://github.com/UoA-CS760/autocomplete-predictor/blob/master/models/seq/lstm/README.md
The raw input is the set of ASTs from the 150k Python Dataset (py150): https://www.sri.inf.ethz.ch/py150
This data needs to be preprocessed first.
Convert each AST into another form of AST with more nodes (by splitting certain values out into nodes of their own).
There are two variants: one is path based, the other is traversal based.
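
As an illustration of the AST conversion noted above, here is a minimal sketch. It assumes the py150 layout, where each program is a flat JSON list of nodes with a "type", an optional "value", and an optional "children" list of indices. It also assumes that "splitting" means moving each value into a child node of its own. The function name `split_value_nodes` is hypothetical and is not taken from the repo's scripts.

```python
def split_value_nodes(ast):
    """Return a new flat AST in which every node that has a value becomes
    a type-only node followed by an extra child node holding the value."""
    # offset[i] = number of value nodes inserted before old node i
    offset = []
    inserted = 0
    for node in ast:
        offset.append(inserted)
        if "value" in node:
            inserted += 1

    new_ast = []
    for i, node in enumerate(ast):
        # Re-point child indices at their shifted positions.
        children = [c + offset[c] for c in node.get("children", [])]
        if "value" in node:
            # The new value node sits directly after its parent type node.
            value_index = i + offset[i] + 1
            new_ast.append({"type": node["type"],
                            "children": [value_index] + children})
            new_ast.append({"value": node["value"]})
        else:
            new_node = {"type": node["type"]}
            if children:
                new_node["children"] = children
            new_ast.append(new_node)
    return new_ast


# Toy tree for "x = 1" (assumed shape, not copied from the dataset).
example = [
    {"type": "Module", "children": [1]},
    {"type": "Assign", "children": [2, 3]},
    {"type": "NameStore", "value": "x"},
    {"type": "Num", "value": "1"},
]
converted = split_value_nodes(example)
```

The four input nodes become six here, because the two nodes that carried values each gain a child.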
- Run Generate new trees.py
- Run Generate data traversal.py
- Run Generate vocab.py
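
For orientation, the traversal and vocab steps above might look roughly like the sketch below. It assumes the traversal-based data is a pre-order, depth-first walk of the converted tree, and that the vocabulary is a frequency-ranked token list. The names `preorder_tokens` and `build_vocab`, and the `<pad>` and `<unk>` special tokens, are hypothetical; the repo's scripts may do this differently.

```python
from collections import Counter


def preorder_tokens(ast, root=0):
    """Pre-order, depth-first walk over a flat AST; one token per node."""
    tokens = []
    stack = [root]
    while stack:
        node = ast[stack.pop()]
        # Type nodes emit their type; value-only nodes emit their value.
        tokens.append(node.get("type", node.get("value")))
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(node.get("children", [])))
    return tokens


def build_vocab(sequences, max_size=100_000):
    """Map the most frequent tokens to integer ids (special tokens first)."""
    counts = Counter(tok for seq in sequences for tok in seq)
    vocab = ["<pad>", "<unk>"]  # assumed special tokens
    vocab += [tok for tok, _ in counts.most_common(max_size)]
    return {tok: idx for idx, tok in enumerate(vocab)}


# Toy converted tree for "x = 1" (assumed shape).
tree = [
    {"type": "Module", "children": [1]},
    {"type": "Assign", "children": [2, 4]},
    {"type": "NameStore", "children": [3]},
    {"value": "x"},
    {"type": "Num", "children": [5]},
    {"value": "1"},
]
sequence = preorder_tokens(tree)
vocab = build_vocab([sequence])
```

An explicit stack is used instead of recursion so that very deep trees do not hit Python's recursion limit.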