With Transformer models, NLP underwent a paradigm shift: the starting point for training a model on a downstream task moved from a blank task-specific model to a general-purpose pretrained architecture. But creating these general-purpose architectures is an expensive and time-consuming process.
We present HuggingFace's Transformers library, making these developments available to the community by gathering state-of-the-art general-purpose pretrained models under a unified API together with an ecosystem of libraries, examples, tutorials and scripts targeting many downstream tasks.
The API follows a classic NLP pipeline: setting up configuration, processing data with a tokenizer and encoder, and using a model for either training or inference.
Steps: defining the model architecture, processing the text data, training the model, and performing inference in production.
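A minimal sketch of this pipeline at inference time, assuming the `transformers` and `torch` packages are installed; the model name `bert-base-uncased` and the output attribute names follow recent library releases and are illustrative only:

```python
# Minimal inference sketch: load a pretrained tokenizer and model,
# encode a sentence, and read out the last hidden states.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

# The tokenizer maps text to token ids (plus an attention mask) as tensors.
inputs = tokenizer("Hello, Transformers!", return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Shape: (batch_size, sequence_length, hidden_size).
print(outputs.last_hidden_state.shape)
```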
- Configuration: a configuration class instance stores the model and tokenizer parameters (vocabulary size, hidden dimensions, dropout rate, etc.). Configurations are agnostic to the deep learning framework used.
- Tokenizers: a tokenizer class is available for each model; it stores the vocabulary token-to-index map for the corresponding model and handles the encoding and decoding of input sequences according to the model's specific tokenization process.
- Model: base class implementing the model's computational graph from encoding through the series of self-attention layers and up to the last hidden states.
- Auto classes: a set of Auto classes provides a unified API that enables very fast switching between different models/configs/tokenizers.
- Optimizer: we provide a few optimization utilities as subclasses of PyTorch's `torch.optim.Optimizer`.
- Scheduler: additional learning rate schedulers are also provided as subclasses of PyTorch's `torch.optim.lr_scheduler.LambdaLR`; a combined sketch of these components follows this list.
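A combined sketch of these components for a sequence-classification fine-tuning setup; `bert-base-uncased`, the label count, and the learning-rate/step values are placeholders, and the optimizer shown is PyTorch's own `AdamW` (the library historically shipped an equivalent `torch.optim.Optimizer` subclass):

```python
# Sketch: Auto classes plus optimizer/scheduler utilities for a training setup.
from torch.optim import AdamW  # the library's own AdamW plays the same role
from transformers import (
    AutoConfig,
    AutoTokenizer,
    AutoModelForSequenceClassification,
    get_linear_schedule_with_warmup,
)

model_name = "bert-base-uncased"  # switching models only means changing this string

# Configuration, tokenizer and model are all resolved from the same identifier.
config = AutoConfig.from_pretrained(model_name, num_labels=2)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, config=config)

# Optimizer (a torch.optim.Optimizer) and a warmup + linear-decay scheduler
# (a torch.optim.lr_scheduler.LambdaLR subclass provided by the library).
optimizer = AdamW(model.parameters(), lr=5e-5)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=100, num_training_steps=1000
)
```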
- Language understanding benchmarks
- Language model fine-tuning
- Ecosystems
- Write with Transformer: an interactive interface that leverages the generative capabilities of pretrained architectures like GPT, GPT-2 and XLNet to suggest text, like an auto-completion plugin.
- Conversational AI
- Using in production: all the models in the library are compatible with TorchScript, an intermediate representation of a PyTorch model that can then be run in Python or in a high-performance environment such as C++ (see the sketch below).
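A hedged sketch of TorchScript export via tracing; the `torchscript=True` flag and the traced-input convention follow the library's documentation, while the model name and output file path are placeholders:

```python
# Sketch: trace a model's forward pass and serialize it for deployment.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# torchscript=True configures the model so its forward pass can be traced.
model = BertModel.from_pretrained("bert-base-uncased", torchscript=True)
model.eval()

inputs = tokenizer("TorchScript export example", return_tensors="pt")
example_inputs = (inputs["input_ids"], inputs["attention_mask"])

# The traced module can be reloaded from Python or from C++ (torch::jit::load).
traced_model = torch.jit.trace(model, example_inputs)
torch.jit.save(traced_model, "traced_bert.pt")
```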
Architectures for which reference implementations and pretrained weights are currently provided by Transformers include generative models (GPT, GPT-2, Transformer-XL, XLNet, XLM) and language understanding models (BERT, DistilBERT, RoBERTa, XLM):
- BERT
- RoBERTa
- DistilBERT
- GPT
- Transformer-XL
- XLNet
- XLM