Chapter 02.01: Basic Training
This subchapter covers the essential principles of neural network training. It begins with empirical risk minimization (ERM), which frames training as minimizing the average loss over the training data, and with gradient descent (GD) as the basic method for solving that minimization problem. It then introduces stochastic gradient descent (SGD), which estimates the gradient from individual samples rather than the full dataset, as a computationally efficient alternative to GD.
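The contrast between GD and SGD can be sketched on a toy ERM problem. The sketch below, with hypothetical data drawn from the line y = 3x, minimizes the mean squared error over one weight: GD computes the gradient over all samples per update, while SGD uses one randomly chosen sample per update.

```python
import random

# Hypothetical toy dataset generated from y = 3x (true weight is 3).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x for x in xs]
n = len(xs)

def full_gradient(w):
    """Gradient of the empirical risk (mean squared error) over all n samples."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def sample_gradient(w, i):
    """Gradient of the loss on a single sample i: the quantity SGD uses."""
    return 2.0 * (w * xs[i] - ys[i]) * xs[i]

# Gradient descent: each update touches every sample.
w_gd = 0.0
for _ in range(30):
    w_gd -= 0.05 * full_gradient(w_gd)

# Stochastic gradient descent: each update touches one random sample,
# so a single step costs 1/n the work of a GD step.
random.seed(0)
w_sgd = 0.0
for _ in range(200):
    i = random.randrange(n)
    w_sgd -= 0.02 * sample_gradient(w_sgd, i)
```

Both runs recover a weight close to 3; the learning rates and step counts here are illustrative choices, not recommendations from the text.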