Chapter 05.01: k-Nearest Neighbors (k-NN)

We demonstrate that distances in feature space are crucial in \(k\)-NN regression and classification, and show how predictions are formed by averaging the neighbors' target values (regression) or by majority vote (classification). In this sense, \(k\)-NN is a very local model that works without distributional assumptions. A minimal sketch of this procedure is given below.
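The following is a minimal, illustrative sketch of unweighted \(k\)-NN with Euclidean distance, not the course's reference implementation; all names (e.g. `knn_predict`) are chosen for this example.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k, task="regression"):
    """Predict the target for a single query point x (illustrative sketch)."""
    # Distances in feature space drive everything in k-NN.
    dists = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k closest training observations (the neighborhood of x).
    neighbors = np.argsort(dists)[:k]
    if task == "regression":
        # Regression: average the neighbors' target values.
        return y_train[neighbors].mean()
    # Classification: majority vote over the neighbors' class labels.
    return Counter(y_train[neighbors]).most_common(1)[0][0]
```

Note that the choice of distance metric (here Euclidean) is a design decision: with a different metric, the neighborhood and hence the prediction can change.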

Lecture video

Lecture slides

Quiz

---
shuffle_questions: false
---

## Which statements are true?

- [x] Choosing the distance metric is a crucial design decision for $k$-NN.
- [ ] $k$-NN can only be used for classification tasks.
- [x] $N_k(x)$ contains the subset of the feature space $\mathcal{X}$ that is at least as close to $x$ as the $k$-th closest neighbor of $x$ in the training data set.
- [x] 1-NN always 'predicts' perfectly on observations of the training data set (if there are no observations with equal features but different target values).
- [x] $k$-NN with $k = n$ always predicts the same target variable value for all possible inputs (if no weights are used).
- [ ] $k$-NN for classification is a probabilistic classifier.
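As a self-check, two of the marked statements can be verified numerically with the sketch above, using a small toy data set (purely illustrative, assuming the `knn_predict` function defined earlier):

```python
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 4.0, 9.0])

# 1-NN reproduces the training targets exactly
# (no observations share features but differ in target).
print([float(knn_predict(X, y, xi, k=1)) for xi in X])  # [0.0, 1.0, 4.0, 9.0]

# k = n (unweighted) averages over all training targets,
# so every input gets the same constant prediction.
print(float(knn_predict(X, y, np.array([-5.0]), k=4)))  # 3.5
print(float(knn_predict(X, y, np.array([10.0]), k=4)))  # 3.5
```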