One core idea of prompt-learning is to wrap the input with additional context containing masked tokens, imitating the pre-training objectives of PLMs and thereby better stimulating these models (a short illustration follows the grouping below). Hence, the choice of PLM is crucial to the whole prompt-learning pipeline.
PLMs can be roughly divided into three groups according to their pre-training objectives:
- masked language modeling (MLM) -> BERT, RoBERTa
- autoregressive language modeling (LM) -> GPT-3
- sequence-to-sequence (Seq2Seq) -> T5, MASS, BART
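A minimal sketch of the core idea, using the Hugging Face transformers fill-mask pipeline rather than OpenPrompt: a cloze-style prompt reuses the MLM objective, and the label is read off from the prediction at the mask position (the model name and label words here are illustrative).

```python
# Cloze-style prompting with an MLM: classification becomes predicting
# which label word fills the masked slot of the wrapped input.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

review = "The movie was a waste of two hours."
prompt = f"{review} It was [MASK]."

# Restrict predictions to two candidate label words and compare their scores.
for candidate in fill_mask(prompt, targets=["terrible", "great"]):
    print(candidate["token_str"], round(candidate["score"], 4))
```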
OpenPrompt
OpenPrompt supports combinations of tasks (classification and generation), PLMs (MLM, LM, and Seq2Seq), and prompt modules (different templates and verbalizers). Each part of the prompting pipeline can be chosen independently (a PLM-loading sketch follows this list):
- templating strategy
- initialization strategy
- verbalizing strategy
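For the PLM part, OpenPrompt provides a unified loading helper; a minimal sketch based on the repository's documented usage (the concrete model identifiers are illustrative):

```python
# Loading PLMs with different pre-training objectives through a single
# interface; load_plm returns the model, its tokenizer, the model config,
# and a wrapper class that handles objective-specific input preparation.
from openprompt.plms import load_plm

# MLM-style PLM (e.g. BERT)
plm, tokenizer, model_config, WrapperClass = load_plm("bert", "bert-base-cased")

# Autoregressive LM (e.g. GPT-2):
# plm, tokenizer, model_config, WrapperClass = load_plm("gpt2", "gpt2")

# Seq2Seq PLM (e.g. T5):
# plm, tokenizer, model_config, WrapperClass = load_plm("t5", "t5-base")
```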
A Template class defines or generates textual or soft-encoding templates that wrap the original input. A template normally contains contextual tokens (textual or soft) and masked tokens. In OpenPrompt, all templates inherit from a common base class with universal attributes and abstract methods.
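A sketch of a manual (textual) template, following the template syntax shown in the OpenPrompt repository; the exact return value of the wrapping call may differ across versions:

```python
# A manual template wrapping an input example: the {"placeholder":"text_a"}
# slot is filled with the input text, and {"mask"} marks the position the
# PLM is asked to predict.
from openprompt.prompts import ManualTemplate
from openprompt.data_utils import InputExample

template = ManualTemplate(
    tokenizer=tokenizer,  # tokenizer returned by load_plm above
    text='{"placeholder":"text_a"} It was {"mask"}.',
)

example = InputExample(guid=0, text_a="The movie was a waste of two hours.")
print(template.wrap_one_example(example))  # inspect the wrapped input
```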
A Verbalizer projects the classification labels to words in the PLM's vocabulary. It extracts the logits of these label words and aggregates them into scores for the corresponding classes, and is therefore responsible for the loss calculation.
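A sketch of a manual verbalizer for a binary task, following the repository's documented usage (the label words are illustrative):

```python
# A manual verbalizer mapping each class to a set of label words; logits at
# the mask position are gathered for these words and aggregated per class.
from openprompt.prompts import ManualVerbalizer

verbalizer = ManualVerbalizer(
    tokenizer=tokenizer,
    num_classes=2,
    label_words=[["terrible", "bad"],   # class 0: negative
                 ["great", "good"]],    # class 1: positive
)
```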
A PromptModel is responsible for training and inference. It contains a PLM, a Template object, and (optionally) a Verbalizer object.
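Putting the pieces together, a minimal inference sketch based on the repository's documented usage (batch sizes, sequence lengths, and other hyperparameters are omitted or illustrative):

```python
# Assemble a classification prompt model and run inference: the data loader
# wraps and tokenizes examples with the template, and the model aggregates
# mask-position logits into class scores via the verbalizer.
import torch
from openprompt import PromptForClassification, PromptDataLoader

prompt_model = PromptForClassification(
    plm=plm,
    template=template,
    verbalizer=verbalizer,
)

data_loader = PromptDataLoader(
    dataset=[example],                     # InputExample objects (see above)
    template=template,
    tokenizer=tokenizer,
    tokenizer_wrapper_class=WrapperClass,  # returned by load_plm
)

prompt_model.eval()
with torch.no_grad():
    for batch in data_loader:
        logits = prompt_model(batch)            # [batch_size, num_classes]
        print(torch.argmax(logits, dim=-1))     # predicted class indices
```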
paper https://arxiv.org/pdf/2111.01998.pdf
implementation https://github.com/thunlp/OpenPrompt