Chapter 15.09: Weight decay and L2

In this section, we show that gradient descent on an L2-regularized risk is equivalent to weight decay, and we examine how weight decay changes the optimization trajectory.
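The equivalence can be checked numerically with a small sketch. It assumes the penalty is written as (lam / 2) * ||theta||^2, so that its gradient is lam * theta. It also assumes a squared-error risk on made-up toy data. The function names, the data, the step size `alpha`, and the penalty strength `lam` are all illustrative and not taken from the lecture.

```python
def risk_gradient(theta, X, y):
    """Gradient of the unregularized mean squared-error risk (1 / 2n) * sum of squared residuals."""
    n = len(y)
    grad = [0.0] * len(theta)
    for x_i, y_i in zip(X, y):
        residual = sum(t * x for t, x in zip(theta, x_i)) - y_i
        for j, x_ij in enumerate(x_i):
            grad[j] += residual * x_ij / n
    return grad


def l2_gradient_step(theta, grad, lam, alpha):
    """Gradient step on the regularized risk: the penalty adds lam * theta to the gradient."""
    return [t - alpha * (g + lam * t) for t, g in zip(theta, grad)]


def weight_decay_step(theta, grad, lam, alpha):
    """Shrink the weights by the factor (1 - alpha * lam), then take a plain gradient step."""
    return [(1.0 - alpha * lam) * t - alpha * g for t, g in zip(theta, grad)]


# Hypothetical toy data, chosen only for illustration.
X = [[1.0, 0.5], [0.3, -1.2], [-0.7, 0.8], [1.5, 1.0]]
y = [1.0, -0.5, 0.2, 2.0]
lam, alpha = 0.1, 0.05

theta_l2 = [0.0, 0.0]
theta_wd = [0.0, 0.0]
for _ in range(100):
    theta_l2 = l2_gradient_step(theta_l2, risk_gradient(theta_l2, X, y), lam, alpha)
    theta_wd = weight_decay_step(theta_wd, risk_gradient(theta_wd, X, y), lam, alpha)

# Largest coordinate-wise gap between the two trajectories after 100 steps.
max_diff = max(abs(a - b) for a, b in zip(theta_l2, theta_wd))
```

Both update rules are the same expression rearranged, so the two trajectories agree up to floating-point rounding. Note that this identity relies on the plain gradient descent update; it is not claimed here for other optimizers.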

Lecture video

Lecture slides