Topic 04: Optimization - Part II
This chapter covers advanced topics in neural network optimization: key challenges in training stability, weight initialization, momentum and adaptive learning rates, and activation functions. It addresses issues such as ill-conditioning, local minima, and exploding gradients, and discusses methods to improve convergence and performance, including practical weight initialization schemes, learning rate schedules, and specialized activations for hidden and output layers.
-
Chapter 04.01: Challenges in Optimization
In this subsection, we summarize several of the most prominent challenges in training deep neural networks, such as ill-conditioning, local minima, saddle points, cliffs, and exploding gradients.
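As a small illustration of the last two challenges, the NumPy sketch below (not taken from the lecture material; matrix size and clipping threshold are arbitrary) shows how gradients backpropagated through a deep linear chain can explode, and how gradient norm clipping, a common remedy, caps their magnitude.

```python
import numpy as np

rng = np.random.default_rng(0)
depth = 50
W = rng.normal(scale=1.5, size=(10, 10))   # spectral radius > 1 leads to growth

grad = np.ones(10)
for _ in range(depth):                     # backprop through the chain: g <- W^T g
    grad = W.T @ grad
print("unclipped gradient norm:", np.linalg.norm(grad))

# Gradient clipping: rescale the gradient if its norm exceeds a threshold.
threshold = 5.0
norm = np.linalg.norm(grad)
if norm > threshold:
    grad = grad * (threshold / norm)
print("clipped gradient norm:", np.linalg.norm(grad))
```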
-
Chapter 04.02: Advanced Optimization
In this section, we introduce several advanced techniques for optimizing neural networks, such as learning rate schedules, adaptive learning rates, and batch normalization.
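The following NumPy sketch illustrates two of these ideas on a toy ill-conditioned quadratic: an exponential learning rate schedule combined with an Adam-style adaptive update. The objective, constants, and variable names are illustrative assumptions, not the lecture's exact setup.

```python
import numpy as np

A = np.diag([1.0, 100.0])          # ill-conditioned quadratic f(theta) = 0.5 * theta^T A theta
grad = lambda theta: A @ theta

theta = np.array([1.0, 1.0])
m, v = np.zeros(2), np.zeros(2)    # first/second moment estimates (Adam)
beta1, beta2, eps = 0.9, 0.999, 1e-8
lr0, decay = 0.1, 0.01             # exponential schedule: lr_t = lr0 * exp(-decay * t)

for t in range(1, 501):
    g = grad(theta)
    lr = lr0 * np.exp(-decay * t)                 # learning rate schedule
    m = beta1 * m + (1 - beta1) * g               # momentum-like first moment
    v = beta2 * v + (1 - beta2) * g ** 2          # per-parameter second moment
    m_hat, v_hat = m / (1 - beta1 ** t), v / (1 - beta2 ** t)
    theta -= lr * m_hat / (np.sqrt(v_hat) + eps)  # adaptive step

print("final parameters:", theta)
```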
-
Chapter 04.03: Modern Activation Functions
In this subchapter, we explain optimization challenges related to activation functions and introduce activation functions for both hidden and output units.
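For reference, here is a hedged NumPy sketch (illustrative definitions, not the lecture's code) of common hidden-unit activations and the output-unit activations typically paired with binary and multi-class classification.

```python
import numpy as np

def relu(z):                      # hidden units: no saturation for z > 0
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):    # variant keeping a small gradient for z < 0
    return np.where(z > 0, z, alpha * z)

def sigmoid(z):                   # output unit for binary classification
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):                   # output unit for multi-class classification
    z = z - z.max(axis=-1, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

z = np.array([-2.0, 0.5, 3.0])
print(relu(z), sigmoid(z), softmax(z))
```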
-
Chapter 04.04: Network Initialization
In this part, we explain why initialization is crucial for neural network training. We introduce the concept of random weight initialization and describe initialization approaches for the biases of a neural network.
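A minimal sketch of random weight initialization is given below, assuming the standard Glorot/Xavier and He schemes together with zero bias initialization; layer sizes and function names are illustrative and not necessarily those used in the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def glorot_uniform(fan_in, fan_out):
    # Glorot/Xavier: keeps activation/gradient variance roughly constant (tanh-like units)
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def he_normal(fan_in, fan_out):
    # He initialization: variance scaled for ReLU hidden units
    return rng.normal(scale=np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

W1 = he_normal(784, 128)   # weights of a ReLU hidden layer
b1 = np.zeros(128)         # biases are commonly initialized to zero (or a small constant)
print(W1.std(), b1[:3])
```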