Chapters
-
Chapter 0: Machine Learning Basics
This chapter introduces the basic concepts of Machine Learning. For this, we rely on the excellent material from the I2ML Course, which already comes with videos and has been taught at LMU numerous times. The focus of these chapters is on introducing supervised learning, explaining the difference between regression and classification, showing how to evaluate and compare Machine Learning models, and formalizing the concept of learning in general. When taking our DL4NLP course, you do not necessarily have to re-watch all of the videos if you are already proficient in this area.
-
Chapter 1: Introduction to the course
In this chapter, you will dive into the fundamental principles of Deep Learning for Natural Language Processing (NLP). You will explore key concepts including learning paradigms, various tasks within NLP, the neural probabilistic language model, and the significance of embeddings.
-
Chapter 2: Deep Learning Basics
In this chapter we explore fundamental concepts like Recurrent Neural Networks (RNNs), the attention mechanism, ELMo embeddings, and tokenization. Each concept serves as a building block in understanding how neural networks can comprehend and generate human language.
-
Chapter 3: Transformer
The Transformer, as introduced in [1], is a deep learning architecture specifically designed for sequence-to-sequence tasks in natural language processing. It revolutionized NLP by replacing recurrent layers with self-attention mechanisms, which lets it process entire sequences in parallel and thus overcomes the sequential-processing limitations of traditional RNN-based models such as LSTMs. This architecture has become the foundation for state-of-the-art models in various NLP tasks such as machine translation, text summarization, and language understanding. In this chapter we first introduce the Transformer, explore its main components (the Encoder and the Decoder), and finally discuss ways to improve the architecture, such as Transformer-XL and Efficient Transformers.
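As a small taste of what this chapter covers, below is a minimal NumPy sketch of scaled dot-product self-attention, the core operation of the Transformer. The shapes, variable names, and single-head setup are illustrative only; the full model in [1] uses multiple heads, masking, and learned projections inside larger layers.

```python
import numpy as np

def scaled_dot_product_self_attention(X, W_q, W_k, W_v):
    """Single-head, unmasked self-attention over one sequence.

    X: (seq_len, d_model) input embeddings
    W_q, W_k, W_v: (d_model, d_k) projection matrices
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v           # project inputs to queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])       # pairwise similarities, scaled by sqrt(d_k)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)     # softmax over key positions
    return weights @ V                            # weighted sum of value vectors

# Illustrative shapes: a sequence of 5 tokens with d_model = 16 and d_k = 8
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))
W_q, W_k, W_v = (rng.normal(size=(16, 8)) for _ in range(3))
print(scaled_dot_product_self_attention(X, W_q, W_k, W_v).shape)  # (5, 8)
```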
-
Chapter 4: BERT
BERT (Bidirectional Encoder Representations from Transformers) [1] is a transformer-based model designed to generate deep contextualized word representations by considering bidirectional context, allowing it to capture complex linguistic patterns and context-dependent meanings. It achieves this by pretraining on large text corpora with masked language modeling and next sentence prediction objectives, learning rich word representations that incorporate both left and right context.
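To make the masked language modeling objective concrete, the sketch below corrupts a token sequence following the proportions reported for BERT's pretraining recipe (of the selected positions, 80% become [MASK], 10% a random token, 10% stay unchanged). The helper name, placeholder vocabulary, and whitespace tokenization are illustrative assumptions, not BERT's actual tokenizer or data pipeline.

```python
import random

MASK, VOCAB_SIZE = "[MASK]", 30000  # illustrative mask token and vocabulary size

def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Return (corrupted_tokens, labels) for one masked-LM training example."""
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)                    # the model must recover the original token
            r = rng.random()
            if r < 0.8:
                corrupted.append(MASK)            # 80%: replace with [MASK]
            elif r < 0.9:
                corrupted.append(f"tok_{rng.randrange(VOCAB_SIZE)}")  # 10%: random token
            else:
                corrupted.append(tok)             # 10%: keep the original token
        else:
            corrupted.append(tok)
            labels.append(None)                   # position is ignored by the loss
    return corrupted, labels

print(mask_tokens("the cat sat on the mat".split(), seed=3))
```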
-
Chapter 5: Post-BERT Era
This chapter introduces various types of models that build upon the core idea of BERT.
-
Chapter 6: Post-BERT Era 2 and using the Transformer
Here we further introduce models from the Post-BERT era, such as ELECTRA and XLNet. We also discuss how we can reformulate every task into a text-to-text format and finally introduce the T5 model.
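To give an idea of the text-to-text reformulation, here is a small sketch that casts a few classic tasks as plain (source, target) string pairs in the style of T5. The helper function and the exact prompt wording are our own illustration under that assumption, not the official T5 preprocessing code.

```python
def to_text_to_text(task, **fields):
    """Cast a labelled example into a (source, target) pair of plain strings."""
    if task == "translate_en_de":
        return f"translate English to German: {fields['text']}", fields["translation"]
    if task == "sentiment":
        return f"sst2 sentence: {fields['text']}", fields["label"]
    if task == "summarize":
        return f"summarize: {fields['text']}", fields["summary"]
    raise ValueError(f"unknown task: {task}")

print(to_text_to_text("translate_en_de", text="The house is wonderful.",
                      translation="Das Haus ist wunderbar."))
print(to_text_to_text("sentiment", text="a gripping, well-acted film", label="positive"))
```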
-
Chapter 7: Generative Pre-Trained Transformers
In this chapter we will walk you through the history of the GPT models, starting with GPT-1, then introducing GPT-2, and finally concluding with GPT-3.
-
Chapter 8: Large Language Models (LLMs)
Here we cover Large Language Models and related concepts such as Instruction Fine-Tuning and Chain-of-Thought Prompting.
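As a brief illustration of chain-of-thought prompting, the sketch below assembles a few-shot prompt whose exemplar contains worked-out reasoning steps before the answer. The example is adapted from the widely cited arithmetic example in the chain-of-thought literature; actually sending the string to an LLM API is omitted.

```python
# Few-shot chain-of-thought prompt: the exemplar's intermediate reasoning nudges the
# model to produce step-by-step reasoning before its final answer.
exemplar = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
    "How many balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n"
)
question = (
    "Q: The cafeteria had 23 apples. It used 20 and bought 6 more. "
    "How many apples are there now?\nA:"
)
prompt = exemplar + "\n" + question
print(prompt)  # this string would be sent as the prompt to a language model
```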
-
Chapter 9: Reinforcement Learning from Human Feedback (RLHF)
Here we cover the basics of RLHF and its related applications.