
Masked Language Model

A model trained by randomly hiding (masking) some tokens in the input and predicting them from the surrounding context. BERT is the most well-known example. Because the model sees context on both sides of each masked token, this bidirectional training excels at understanding tasks such as text classification and named entity recognition.
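The masking step can be sketched in a few lines. This is a simplified illustration, not BERT's exact recipe (BERT masks about 15% of tokens and sometimes substitutes random tokens instead of the mask symbol); the `mask_tokens` helper, the `[MASK]` string, and the toy sentence are all illustrative choices.

```python
import random

def mask_tokens(tokens, mask_token="[MASK]", mask_prob=0.15, rng=None):
    """Randomly hide tokens; return the masked input and the prediction targets."""
    rng = rng or random.Random()
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(mask_token)  # hidden from the model
            labels.append(tok)         # the model must predict this original token
        else:
            masked.append(tok)
            labels.append(None)        # no prediction target at this position
    return masked, labels

tokens = "the cat sat on the mat".split()
masked, labels = mask_tokens(tokens, rng=random.Random(0))
```

During training, the model reads the full `masked` sequence (both left and right context) and is scored only on the positions where `labels` holds an original token.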

Related terms

Causal Language Model
Transformer
Natural Language Understanding (NLU)