Chapter 05.01: k-Nearest Neighbors (k-NN)
We demonstrate that distances in feature space are crucial in $k$-NN regression and classification and show how predictions are formed by averaging the neighbors' targets (regression) or by majority vote (classification). As such, $k$-NN is a very local model that works without distributional assumptions.
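To make the prediction rule concrete, here is a minimal sketch of $k$-NN in Python (the function `knn_predict` and the toy data are illustrative, not part of the lecture material); it assumes Euclidean distance and shows both aggregation rules:

```python
# Minimal k-NN sketch (illustrative): predictions come from the k nearest
# training points under Euclidean distance -- mean of targets for
# regression, majority vote for classification.
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3, task="classification"):
    # Euclidean distances from the query point x to all training points.
    dists = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k closest training observations, i.e. N_k(x).
    idx = np.argsort(dists)[:k]
    neighbors = y_train[idx]
    if task == "regression":
        return neighbors.mean()                      # average of targets
    return Counter(neighbors).most_common(1)[0][0]   # majority vote

# Toy usage: two clusters in a 2D feature space.
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([0.95, 1.0]), k=3))  # -> 1
```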
Lecture video
Lecture slides
Quiz
---
shuffle_questions: false
---
## Which statements are true?
- [x] Choosing the distance metric is a crucial design decision for $k$-NN.
- [ ] $k$-NN can only be used for classification tasks.
- [x] $N_k(x)$ contains the subset of the feature space $\mathcal{X}$ that is at least as close to $x$ as the $k$-th closest neighbor of $x$ in the training data set.
- [x] 1-NN always 'predicts' perfectly on observations of the training data set (if there are no observations with equal features but different target values).
- [x] $k$-NN with $k = n$ always predicts the same target variable value for all possible inputs (if no distance weighting is used).
- [ ] $k$-NN for classification is a probabilistic classifier.
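Two of the statements above (perfect 1-NN 'predictions' on the training data and the constant prediction for $k = n$) can be checked numerically; a minimal sketch, assuming scikit-learn is available (the random data and the model setup are illustrative only):

```python
# Numerical check of two quiz statements (sketch, assumes scikit-learn).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))           # continuous features: no duplicates
y = np.array([0] * 8 + [1] * 12)       # class 1 is the majority class

# 1-NN: each training point is its own nearest neighbor (distance 0),
# so the training labels are reproduced exactly.
knn1 = KNeighborsClassifier(n_neighbors=1).fit(X, y)
assert (knn1.predict(X) == y).all()

# k = n (unweighted): every query uses the whole training set as its
# neighborhood, so every input gets the same prediction (majority class).
knn_n = KNeighborsClassifier(n_neighbors=len(X)).fit(X, y)
preds = knn_n.predict(rng.normal(size=(5, 2)))
assert len(set(preds)) == 1
```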