For NLP classification, the current state-of-the-art approach is Universal Language Model Fine-tuning (ULMFiT). ULMFiT is an effective transfer learning method that can be applied to any task in NLP, but at this stage we have only studied its use in classification tasks. The approach is described and analyzed in the paper Universal Language Model Fine-tuning for Text Classification by fast.ai’s Jeremy Howard and Sebastian Ruder from the NUI Galway Insight Centre.
To learn how to use ULMFiT and access the open-source code we have provided, see the following resources:
- ULMFiT is discussed in depth in lesson 10 of fast.ai’s Cutting Edge Deep Learning for Coders. A gentler introduction is available in lesson 4 of Practical Deep Learning for Coders.
- The fastai library provides modules necessary to train and use ULMFiT models. In particular, you will want to use fastai.text and fastai.lm_rnn.
- The scripts used for the ULMFiT paper are available in the imdb_scripts folder in the fastai repository.
- The pre-trained WikiText-103 model and vocab are available here.
- The paper and code are being discussed in the fast.ai discussion forums. Feel free to join the discussion!