Chapter 08.03: Stochastic Decoding & CS/CD
In this chapter you will learn about decoding methods that go beyond simple deterministic strategies. We introduce sampling with temperature, where a temperature parameter rescales the logits inside the softmax; top-k [1] and top-p [2] sampling, where the next token is sampled from a restricted set of the most probable tokens; and finally contrastive search [3] and contrastive decoding [4].
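To make the sampling methods concrete, here is a minimal sketch in plain Python of temperature scaling, top-k filtering, and top-p (nucleus) filtering applied to a single next-token distribution. The function names, the toy logits, and the chosen values of `temperature`, `k`, and `p` are illustrative assumptions, not taken from the lecture material or from any particular library.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing them by the temperature.

    A temperature below 1 sharpens the distribution, above 1 flattens it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most probable token ids and renormalize their probabilities."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = order[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def top_p_filter(probs, p):
    """Keep the smallest set of most probable token ids whose cumulative
    probability reaches p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def sample(dist, rng=random):
    """Draw one token id from a {token_id: probability} mapping."""
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy example: four-token vocabulary with made-up logits.
logits = [2.0, 1.0, 0.5, -1.0]
probs = softmax_with_temperature(logits, temperature=0.7)
next_token_top_k = sample(top_k_filter(probs, k=2))
next_token_top_p = sample(top_p_filter(probs, p=0.9))
```

In a real decoder these steps would run once per generated token on the model's output logits; the sketch only shows how each method reshapes or truncates one distribution before sampling.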
Lecture Slides
References
- [1] Fan et al., 2018
- [2] Holtzman et al., 2019
- [3] Su et al., 2022
- [4] Li et al., 2023