Last PyTorch 0.4.0 version
New in this release:
Multi-GPU training based on torch.distributed (acknowledgement to Fairseq)
Training length is now counted in steps instead of epochs (see opts.py)
Average Attention Network (AAN) for the Transformer (thanks @francoishernandez)
New fast beam search (see -fast in translate.py) (thanks @guillaumekln)
Sparse attention / sparsemax (thanks @bpopeters)
and many fixes.
This is the last version that supports PyTorch 0.4.0.
The next PyTorch version, 0.4.1, includes breaking changes.