We start this chapter by taking a look at the Gradient descent algorithm. It is an iterative, first-order optimization algorithm for finding a local minimum of a differentiable function.