In this section, we show how Gaussian processes are actually trained using maximum likelihood estimation and exploiting the fact that we can learn covariance functions’ hyperparameters on the fly.