Topic 2: Optimization - Part I
This chapter introduces foundational concepts in training neural networks, including backpropagation and computational graphs, which provide a structured way to compute the gradients needed to optimize model parameters. It also discusses gradient descent and stochastic gradient descent and touches on strategies for effective learning, such as learning rate selection and weight initialization.
-
Chapter 02.01: Basic Training
This subchapter covers essential principles of neural network training, starting with empirical risk minimization (ERM) and gradient descent (GD). Additionally, it introduces stochastic gradient descent (SGD) as a computationally efficient alternative to GD.
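To make the GD/SGD contrast concrete, here is a minimal NumPy sketch (an illustration, not part of the course materials): both optimizers minimize the empirical risk of a hypothetical linear regression problem, but GD uses the full data set per step while SGD uses a random mini-batch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: linear regression with squared loss.
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

def grad(theta, X, y):
    """Gradient of the mean squared error w.r.t. theta."""
    return 2.0 / len(y) * X.T @ (X @ theta - y)

lr = 0.1
theta_gd = np.zeros(3)
theta_sgd = np.zeros(3)

for step in range(100):
    # GD: one gradient over the full data set per step.
    theta_gd -= lr * grad(theta_gd, X, y)
    # SGD: one gradient over a small random mini-batch per step,
    # which is far cheaper per step on large data sets.
    idx = rng.choice(len(y), size=16, replace=False)
    theta_sgd -= lr * grad(theta_sgd, X[idx], y[idx])

print(theta_gd, theta_sgd)  # both approach the true coefficients
```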
-
Chapter 02.02: Chain Rule and Computational Graphs
In this subsection, we explain the chain rule of calculus and show how computational graphs organize its application to composite functions.
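As a minimal sketch of the idea (the function f(x) = sin(x^2) is a chosen example, not taken from the course), the composite function is decomposed into graph nodes, each node is evaluated in a forward pass, and the chain rule multiplies the local derivatives along the path:

```python
import math

# Tiny computational graph for f(x) = sin(x^2):
#   x --(square)--> u --(sin)--> y
x = 1.5

# Forward pass: evaluate each node.
u = x * x
y = math.sin(u)

# Backward pass: chain rule dy/dx = dy/du * du/dx.
dy_du = math.cos(u)   # local derivative of sin at u
du_dx = 2 * x         # local derivative of the square at x
dy_dx = dy_du * du_dx

# Sanity check against a finite-difference approximation.
h = 1e-6
approx = (math.sin((x + h) ** 2) - math.sin((x - h) ** 2)) / (2 * h)
print(dy_dx, approx)  # the two values agree closely
```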
-
Chapter 02.03: Basic Backpropagation I
This subsection introduces forward and backward passes, the chain rule, and the details of backpropagation in deep learning.
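The sketch below illustrates these two passes on a hypothetical one-hidden-layer network with squared loss (an assumed setup, not course code): the forward pass stores intermediate values, and the backward pass reuses them to apply the chain rule layer by layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical single-hidden-layer network with squared loss.
x = rng.normal(size=4)            # one input example
t = 1.0                           # its target
W1 = rng.normal(size=(5, 4)) * 0.1
W2 = rng.normal(size=(1, 5)) * 0.1

# Forward pass: store intermediates for the backward pass.
z1 = W1 @ x
a1 = sigmoid(z1)
yhat = (W2 @ a1)[0]               # linear output layer
loss = 0.5 * (yhat - t) ** 2

# Backward pass: chain rule, layer by layer.
dloss_dz2 = np.array([yhat - t])          # output layer is linear
dloss_dW2 = np.outer(dloss_dz2, a1)
dloss_da1 = W2.T @ dloss_dz2
dloss_dz1 = dloss_da1 * a1 * (1 - a1)     # sigmoid'(z) = a * (1 - a)
dloss_dW1 = np.outer(dloss_dz1, x)

print(dloss_dW1.shape, dloss_dW2.shape)   # gradients match weight shapes
```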
-
Chapter 02.04: Basic Backpropagation II
In this subsection, we focus on the formalism of backpropagation and the recursive structure of its gradient computation.
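The recursion can be sketched for a generic fully connected network (the layer sizes and sigmoid activations below are assumptions for illustration): the output error delta is propagated down one layer at a time via delta_l = (W_{l+1}^T delta_{l+1}) * sigma'(z_l), yielding one gradient per weight matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Hypothetical fully connected net with layer sizes 4 -> 6 -> 6 -> 1.
sizes = [4, 6, 6, 1]
Ws = [rng.normal(size=(m, n)) * 0.1 for n, m in zip(sizes, sizes[1:])]

x, t = rng.normal(size=4), 0.0

# Forward pass, keeping the activation of every layer.
a, acts = x, [x]
for W in Ws:
    a = sigmoid(W @ a)
    acts.append(a)

# Backward recursion: start with the output error for squared loss,
# then push delta down one layer per iteration.
delta = (a - t) * a * (1 - a)
grads = []
for l in range(len(Ws) - 1, -1, -1):
    grads.append(np.outer(delta, acts[l]))   # gradient of Ws[l]
    if l > 0:
        sp = acts[l] * (1 - acts[l])          # sigma'(z_l) via stored activation
        delta = (Ws[l].T @ delta) * sp
grads.reverse()

print([g.shape for g in grads])               # one gradient per weight matrix
```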
-
Chapter 02.05: Hardware and Software
This subsection introduces GPU training for accelerating neural network learning, the software stack that supports such hardware, and common deep learning software platforms.
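As a small how-to sketch with PyTorch (one such platform; the model and data below are placeholders), training on a GPU amounts to selecting a device and moving both the model parameters and each batch onto it:

```python
import torch
import torch.nn as nn

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A small placeholder model; .to(device) moves its parameters there.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1)).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

# Dummy batch; inputs must live on the same device as the model.
x = torch.randn(32, 4, device=device)
t = torch.randn(32, 1, device=device)

# One SGD step: forward, backward, update.
opt.zero_grad()
loss = loss_fn(model(x), t)
loss.backward()
opt.step()
print(loss.item())
```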