
Transformer

A neural network architecture introduced in the 2017 paper "Attention Is All You Need". Transformers use self-attention mechanisms to process all tokens in a sequence in parallel rather than one at a time, as recurrent networks do, making them faster to train and effective at capturing long-range dependencies in language. They are the foundation of models such as GPT, BERT, and Claude.
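
To make the idea concrete, here is a minimal sketch of scaled dot-product self-attention, the core operation of a transformer, written in NumPy. All names (d_model, W_q, W_k, W_v, self_attention) are illustrative and not from any specific library; real implementations add multiple heads, masking, and learned layers around this.

```python
import numpy as np

def self_attention(x, W_q, W_k, W_v):
    """Scaled dot-product self-attention over one sequence.

    x: (seq_len, d_model) input token embeddings.
    W_q, W_k, W_v: (d_model, d_k) projection matrices.
    Returns: (seq_len, d_k) context-aware token representations.
    """
    q = x @ W_q  # queries: what each token is looking for
    k = x @ W_k  # keys: what each token offers to others
    v = x @ W_v  # values: the content that gets mixed together
    d_k = k.shape[-1]
    # Every token attends to every other token at once; this all-pairs
    # computation is the parallelism that distinguishes transformers
    # from recurrent networks.
    scores = q @ k.T / np.sqrt(d_k)  # (seq_len, seq_len)
    # Row-wise softmax turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Example: a 4-token sequence with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, W_q, W_k, W_v).shape)  # (4, 8)
```

Because the attention weights are computed for all token pairs in one matrix product, the whole sequence is processed in parallel on modern hardware, which is what makes training these models efficient at scale.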

Related terms

Large Language Model (LLM)
Attention Mechanism
Neural Network