Chapter 15.09: Weight decay and L2

In this section, we show that gradient descent on an L2-regularized objective is equivalent to weight decay, and we examine how weight decay changes the optimization trajectory.
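The equivalence can be checked numerically with a small sketch. The setup below is an illustrative assumption, not taken from the text: a least-squares loss on synthetic data, a learning rate `lr`, and a regularization strength `lam`, with the penalty written as `lam / 2 * ||theta||^2`. One gradient step on the penalized loss is `theta - lr * (grad + lam * theta)`, which rearranges to `(1 - lr * lam) * theta - lr * grad`: the weights are first shrunk by a constant factor and then updated with the unregularized gradient.

```python
import numpy as np


def grad_loss(theta, X, y):
    """Gradient of the unregularized loss 1/(2n) * ||X @ theta - y||^2."""
    n = X.shape[0]
    return X.T @ (X @ theta - y) / n


def l2_step(theta, X, y, lr, lam):
    """One gradient descent step on loss + lam/2 * ||theta||^2."""
    return theta - lr * (grad_loss(theta, X, y) + lam * theta)


def weight_decay_step(theta, X, y, lr, lam):
    """Shrink the weights by (1 - lr * lam), then take a plain gradient step."""
    return (1.0 - lr * lam) * theta - lr * grad_loss(theta, X, y)


# Synthetic regression data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

lr, lam = 0.1, 0.5
theta_l2 = np.zeros(3)     # trajectory of GD on the L2-penalized loss
theta_wd = np.zeros(3)     # trajectory of GD with weight decay
theta_plain = np.zeros(3)  # trajectory of GD without any regularization

for _ in range(100):
    theta_l2 = l2_step(theta_l2, X, y, lr, lam)
    theta_wd = weight_decay_step(theta_wd, X, y, lr, lam)
    theta_plain = theta_plain - lr * grad_loss(theta_plain, X, y)

# The two regularized trajectories agree up to floating-point rounding.
print(np.allclose(theta_l2, theta_wd))
# Compare the parameter norms with and without weight decay.
print(np.linalg.norm(theta_wd), np.linalg.norm(theta_plain))
```

Comparing the two printed norms gives a first impression of how the shrinkage factor `(1 - lr * lam)` alters the path taken by the parameters. The identity checked here relies on the plain gradient descent update; the sketch makes no claim about other optimizers.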