Chapter 01.08: Components of a Learner

Nearly all supervised learning algorithms can be described in terms of three components: 1) the hypothesis space, 2) the risk, and 3) the optimization procedure. In this section, we explain how these components interact and why this decomposition is a useful way to think about many supervised learning approaches.
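To make the three components concrete, here is a minimal Python sketch (the function names and data below are illustrative, not taken from the lecture material): the hypothesis space is a family of linear models indexed by a parameter vector θ, the empirical risk is the average squared loss on the training data, and optimization is plain gradient descent on that risk.

```python
import numpy as np

# Hypothesis space: linear models f(x) = theta_0 + theta_1 * x,
# i.e. a parameterized family of curves indexed by theta.
def predict(theta, x):
    return theta[0] + theta[1] * x

# Risk: the empirical risk is the average (squared) loss over the training data.
def empirical_risk(theta, x, y):
    return np.mean((predict(theta, x) - y) ** 2)

# Optimization: gradient descent on the empirical risk.
# The learner takes training data (and hyperparameters) and returns
# one element of the hypothesis space, i.e. a fitted parameter vector.
def learner(x, y, lr=0.05, steps=500):
    theta = np.zeros(2)                                   # initial candidate theta[0]
    for _ in range(steps):
        residual = predict(theta, x) - y
        grad = np.array([2 * residual.mean(),             # d risk / d theta_0
                         2 * (residual * x).mean()])      # d risk / d theta_1
        theta = theta - lr * grad                         # step along the negative gradient
    return theta

# Illustrative labeled training data, as supervised learning requires.
rng = np.random.default_rng(0)
x_train = rng.uniform(-2, 2, size=100)
y_train = 3.0 + 1.5 * x_train + rng.normal(scale=0.2, size=100)

theta_hat = learner(x_train, y_train)
print("estimated parameters:", theta_hat)
print("empirical risk:", empirical_risk(theta_hat, x_train, y_train))
```

Swapping any one component, for example a richer parameterization (hypothesis space), a different loss (risk), or another optimizer, yields a different learner while the overall recipe stays the same.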

Lecture video

Lecture slides

Quiz

---
shuffle_questions: false
---

## Which statements are true?

- [x] For a given hypothesis space, different optimization procedures can be used to find the best model within it.
- [ ] Providing two different training data sets to a learner will result in the same optimal model.
- [x] The parameterization of a model defines its hypothesis space.
- [x] Supervised learning consists of three components: hypothesis space, risk, and optimization.

## Which statements are true?

- [x] If a hypothesis space can be understood as a parameterized family of curves, finding the optimal model is equivalent to finding the optimal set of parameter values.
- [x] Supervised ML requires having labeled data to train the model.
- [ ] A learner is a function that maps feature vectors to predicted target values.
- [ ] The risk function does not depend on the choice of the loss function.

## Which statements are true?

- [ ] The idea of Gradient Descent (GD) is to iteratively go from the current candidate θ[t] in the direction of the positive gradient, with learning rate α, to the next candidate θ[t+1].
- [x] Empirical risk minimization (ERM) leads to finding the model with the lowest average loss (in the absence of regularization).
- [ ] A learner outputs the best parameters and hyperparameters.
- [ ] Supervised ML is always about learning to predict, and never about learning to explain.

## Which statements are true?

- [ ] In supervised ML, there are two tasks: regression for categorical target variables, and classification for numerical ones.
- [x] An algorithm that, given some hypothesis space H, training data D, and hyperparameter control settings λ, returns one element of the hypothesis space H, is called a learner.
- [x] A hypothesis space H is a set that can have an infinite number of elements.
- [ ] The empirical risk function allows us to associate a quality score with each of our models: the higher the empirical risk, the better a model fits our training data.