
Transformer

A neural network architecture introduced in the 2017 paper "Attention Is All You Need". Transformers use self-attention mechanisms to process all tokens in a sequence in parallel rather than one at a time, as recurrent networks do, making them faster to train and effective at capturing long-range dependencies in language. They are the foundation of models such as GPT, BERT, and Claude.
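
To make the idea concrete, here is a minimal sketch of scaled dot-product self-attention, the core operation of a transformer, written in NumPy. All names (d_model, W_q, W_k, W_v, self_attention) are illustrative and not from any specific library; real implementations add multiple heads, masking, and learned layers around this.

```python
import numpy as np

def self_attention(x, W_q, W_k, W_v):
    """Scaled dot-product self-attention over one sequence.

    x: (seq_len, d_model) input token embeddings.
    W_q, W_k, W_v: (d_model, d_k) projection matrices.
    Returns: (seq_len, d_k) context-aware token representations.
    """
    q = x @ W_q  # queries: what each token is looking for
    k = x @ W_k  # keys: what each token offers to others
    v = x @ W_v  # values: the content that gets mixed together
    d_k = k.shape[-1]
    # Every token attends to every other token at once; this all-pairs
    # computation is the parallelism that distinguishes transformers
    # from recurrent networks.
    scores = q @ k.T / np.sqrt(d_k)  # (seq_len, seq_len)
    # Row-wise softmax turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Example: a 4-token sequence with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, W_q, W_k, W_v).shape)  # (4, 8)
```

Because the attention weights are computed for all token pairs in one matrix product, the whole sequence is processed in parallel on modern hardware, which is what makes training these models efficient at scale.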

Related terms

Large Language Model (LLM)
Attention Mechanism
Neural Network