Chapter 07.03: GPT-3 (2020) & X-shot learning

In this chapter, we’ll explore GPT-3 [1]. GPT-3 builds on its predecessors GPT and GPT-2, scaling the same decoder-only transformer architecture to 175 billion parameters and pre-training it on a large, diverse text corpus. Its key contribution is in-context learning: instead of being fine-tuned for each task, the model receives a task description and zero, one, or a few demonstrations directly in the prompt (zero-shot, one-shot, and few-shot learning, hence "X-shot") and solves the task without any gradient updates. GPT-3 thereby shows that scaling up transformer language models yields strong performance on many NLP tasks with little or no task-specific training data.
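To make the X-shot idea concrete, here is a minimal sketch of how a zero-, one-, or few-shot prompt could be assembled. The function name `build_prompt`, the `=>` separator, and the translation pairs are illustrative assumptions, not a format required by GPT-3 or any API; the only point is that the number of demonstrations `k` placed in the prompt is what distinguishes the settings.

```python
from typing import List, Tuple


def build_prompt(task: str, demos: List[Tuple[str, str]], query: str, k: int) -> str:
    """Assemble a k-shot prompt: task description, k demonstrations, then the query.

    Hypothetical helper for illustration; the layout is an assumption.
    """
    if k < 0 or k > len(demos):
        raise ValueError("k must be between 0 and len(demos)")
    lines = [task]
    # Each demonstration is an input/output pair shown to the model in context.
    for source, target in demos[:k]:
        lines.append(f"{source} => {target}")
    # The query is left open so the model completes the answer.
    lines.append(f"{query} =>")
    return "\n".join(lines)


# Illustrative demonstrations (assumed, not taken from any dataset).
demos = [("sea otter", "loutre de mer"), ("house", "maison"), ("book", "livre")]
task = "Translate English to French:"

zero_shot = build_prompt(task, demos, "cheese", k=0)  # no demonstrations
one_shot = build_prompt(task, demos, "cheese", k=1)   # one demonstration
few_shot = build_prompt(task, demos, "cheese", k=3)   # a few demonstrations

print(few_shot)
```

In all three cases the model's weights stay fixed; only the text of the prompt changes, and the resulting string would be passed to the language model as the context to complete.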

Lecture Slides

References

[1] Brown et al., 2020. Language Models are Few-Shot Learners.
