Chapter 02.01: Basic Training
This subchapter covers the essential principles of neural network training. It begins with empirical risk minimization (ERM), which frames training as minimizing the average loss over the training data, and with gradient descent (GD) as the basic method for solving that minimization problem. It then introduces stochastic gradient descent (SGD), which estimates the gradient from individual samples rather than the full dataset, as a computationally efficient alternative to GD.
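The contrast between GD and SGD can be sketched on a toy ERM problem. The sketch below, with hypothetical data drawn from the line y = 3x, minimizes the mean squared error over one weight: GD computes the gradient over all samples per update, while SGD uses one randomly chosen sample per update.

```python
import random

# Hypothetical toy dataset generated from y = 3x (true weight is 3).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x for x in xs]
n = len(xs)

def full_gradient(w):
    """Gradient of the empirical risk (mean squared error) over all n samples."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def sample_gradient(w, i):
    """Gradient of the loss on a single sample i: the quantity SGD uses."""
    return 2.0 * (w * xs[i] - ys[i]) * xs[i]

# Gradient descent: each update touches every sample.
w_gd = 0.0
for _ in range(30):
    w_gd -= 0.05 * full_gradient(w_gd)

# Stochastic gradient descent: each update touches one random sample,
# so a single step costs 1/n the work of a GD step.
random.seed(0)
w_sgd = 0.0
for _ in range(200):
    i = random.randrange(n)
    w_sgd -= 0.02 * sample_gradient(w_sgd, i)
```

Both runs recover a weight close to 3; the learning rates and step counts here are illustrative choices, not recommendations from the text.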