Chapter 03.02: The Encoder

The Encoder in a transformer model is responsible for processing the input sequence and generating contextualized representations of each token, capturing both local and global dependencies within the sequence. It achieves this by employing self-attention mechanisms, which allow each token to attend to all other tokens in the input sequence, enabling the model to capture relationships and dependencies between tokens regardless of their positions. Additionally, the encoder includes position-wise feedforward networks to further refine the representations and incorporate positional information.

Lecture Slides

« Chapter 03.01: A universal deep learning architecture
Chapter 03.03: The Decoder »