This web app, built by the Hugging Face team, is the official demo of the
🤗/transformers
repository's text generation capabilities.
Models
🦄 GPT-2
The almighty king of text generation, GPT-2 comes in four sizes, only three of which have been made publicly available. Feared for its fake news generation capabilities,
it currently stands as the most syntactically coherent model. A direct successor to the original GPT, it reinforces the already established pre-training/fine-tuning killer duo.
From the paper: Language Models are Unsupervised Multitask Learners, by Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei and Ilya Sutskever.
💯 XLNet
Overcoming the unidirectional limit while maintaining an independent masking algorithm based on permutation, XLNet improves upon the state-of-the-art autoregressive model that is Transformer-XL. Using a bidirectional context while keeping its autoregressive approach, this model outperforms BERT on 20 tasks while retaining impressive generative coherence.
From the paper: XLNet: Generalized Autoregressive Pretraining for Language Understanding, by Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov and Quoc V. Le.
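To make the permutation idea concrete, here is a toy sketch in plain Python. It is not XLNet's actual implementation (the paper relies on attention masks and two-stream attention). It only shows how a factorization order decides which tokens each prediction may condition on. The function name `permutation_contexts`, the example sentence and the hand-picked order are all hypothetical.

```python
from typing import Dict, List


def permutation_contexts(order: List[int]) -> Dict[int, List[int]]:
    """Map each position to the positions it may condition on.

    A position can see every position that comes earlier in `order`,
    regardless of where those positions sit in the original sequence.
    """
    contexts: Dict[int, List[int]] = {}
    seen: List[int] = []
    for position in order:
        contexts[position] = sorted(seen)
        seen.append(position)
    return contexts


tokens = ["New", "York", "is", "a", "city"]
# Hypothetical factorization order, chosen by hand for illustration.
order = [2, 4, 0, 3, 1]

contexts = permutation_contexts(order)
for position in order:
    visible = [tokens[i] for i in contexts[position]]
    print(f"predict {tokens[position]!r} given {visible}")
```

With this order, "York" is predicted last and sees tokens on both its left and its right, yet every prediction still conditions only on tokens that came earlier in the order. That is the sense in which the model stays autoregressive while using bidirectional context.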
☠️ GPT
Released by OpenAI, this seminal architecture has shown that large gains on several NLP tasks can be achieved by generative pre-training a language model
on unlabeled text before fine-tuning it on a downstream task.
From the paper: Improving Language Understanding by Generative Pre-Training, by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever.
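For readers who want the two stages in symbols, the objectives can be written roughly as follows. This follows the formulation in the GPT paper as best recalled here: the corpus $\mathcal{U}$, context window $k$, labeled dataset $\mathcal{C}$ and weight $\lambda$ use the paper's notation and do not appear elsewhere on this page.

```latex
% Pre-training: maximize next-token likelihood over an unlabeled corpus U,
% with context window k and model parameters \Theta.
L_1(\mathcal{U}) = \sum_i \log P(u_i \mid u_{i-k}, \ldots, u_{i-1}; \Theta)

% Fine-tuning: maximize label likelihood over a labeled dataset C,
% where each example is a token sequence x^1, ..., x^m with label y.
L_2(\mathcal{C}) = \sum_{(x, y)} \log P(y \mid x^1, \ldots, x^m)

% Combined objective: language modeling kept as an auxiliary term with weight \lambda.
L_3(\mathcal{C}) = L_2(\mathcal{C}) + \lambda \, L_1(\mathcal{C})
```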
🐎 DistilGPT-2
The student of the now ubiquitous GPT-2 does not fall short of its teacher’s expectations.
Obtained by distillation, DistilGPT-2 weighs 37% less, and is twice as fast as its OpenAI counterpart, while keeping the same generative power.
Runs smoothly on an iPhone 7. The dawn of lightweight generative
transformers?
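What does "obtained by distillation" look like in practice? The page does not say which objective was used, so the sketch below assumes the generic soft-target recipe: the student is trained to match the teacher's temperature-softened output distribution. The function names, the toy logits and the temperature of 2.0 are illustrative assumptions, not DistilGPT-2's actual training setup.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Turn logits into probabilities, softened by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: List[float],
    teacher_logits: List[float],
    temperature: float = 2.0,  # assumed value, for illustration only
) -> float:
    """KL(teacher || student) over temperature-softened distributions."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return sum(t * (math.log(t) - math.log(s)) for t, s in zip(teacher, student))


# Toy next-token logits over a three-word vocabulary (made up).
teacher_logits = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 1.0, 4.0]

print(distillation_loss(close_student, teacher_logits))
print(distillation_loss(far_student, teacher_logits))
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, so minimizing it pulls the smaller model toward the larger one's predictions.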
🤓 Arxiv-NLP
Built on the OpenAI GPT-2 model: the Hugging Face team fine-tuned the small version on a tiny dataset (60MB of text) of arXiv papers.
The targeted subject is Natural Language Processing, resulting in generation that leans heavily toward linguistics and deep learning.
Do you want to contribute or suggest a new model checkpoint? Open an issue on
🤗/transformers 🔥.
“It is to writing what calculators are to calculus.”