Search Results
2 results for “decoding”
Foundations
Beam Search
Beam search is a heuristic search algorithm used in sequence generation that keeps a fixed number of the most promising partial sequences at each step, balancing output quality against computational cost.
4 min read · Updated June 2026
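The Beam Search summary above can be illustrated with a minimal sketch. It is not taken from the linked article: the `beam_search` function, the `toy_model` scorer, and its probabilities are all hypothetical, chosen only to show how keeping more than one partial sequence can beat a greedy choice.

```python
import math
from typing import Callable, Dict, List, Tuple

Sequence = Tuple[str, ...]
# Hypothetical model interface: maps a prefix to next-token probabilities.
NextProbs = Callable[[Sequence], Dict[str, float]]


def beam_search(next_probs: NextProbs, beam_width: int, max_len: int,
                eos: str = "<eos>") -> List[Tuple[Sequence, float]]:
    """Return up to beam_width (sequence, log-probability) pairs, best first."""
    beams: List[Tuple[Sequence, float]] = [((), 0.0)]
    for _ in range(max_len):
        candidates: List[Tuple[Sequence, float]] = []
        for seq, score in beams:
            if seq and seq[-1] == eos:
                # Finished sequences are carried forward unchanged.
                candidates.append((seq, score))
                continue
            for token, prob in next_probs(seq).items():
                candidates.append((seq + (token,), score + math.log(prob)))
        # Keep only the beam_width highest-scoring partial sequences.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
        if all(seq and seq[-1] == eos for seq, _ in beams):
            break
    return beams


def toy_model(prefix: Sequence) -> Dict[str, float]:
    """Assumed toy distribution: the likeliest first token leads to a worse total."""
    if len(prefix) == 0:
        return {"a": 0.6, "b": 0.4}
    if len(prefix) == 1:
        return {"x": 0.55, "y": 0.45} if prefix[0] == "a" else {"z": 0.9, "w": 0.1}
    return {"<eos>": 1.0}
```

With this toy distribution, a beam width of 1 (greedy search) commits to `a` and finishes with a sequence of probability 0.33, while a beam width of 2 keeps `b` alive and recovers the sequence `b z` with probability 0.36. The cost is that each step scores `beam_width` times as many continuations.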
Infrastructure
Speculative Decoding
Speculative decoding is an inference acceleration technique in which a small draft model proposes several candidate tokens that a larger target model then verifies in parallel, with typical reported throughput gains of 2-4x and no change in output quality.
5 min read · Updated June 2026
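The Speculative Decoding summary above can be sketched for the simplest case, greedy decoding. This is an illustrative assumption rather than the linked article's method: `speculative_decode`, `greedy_decode`, and the two toy "models" are hypothetical stand-ins, and the verification step is written as a loop where a real system would use one batched forward pass of the target model.

```python
from typing import Callable, List

# Hypothetical model interface: returns the greedy next token for a context.
NextToken = Callable[[List[int]], int]


def greedy_decode(target_next: NextToken, prompt: List[int], num_new: int) -> List[int]:
    """Baseline: one target-model call per generated token."""
    out = list(prompt)
    for _ in range(num_new):
        out.append(target_next(out))
    return out


def speculative_decode(target_next: NextToken, draft_next: NextToken,
                       prompt: List[int], num_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: draft k tokens, keep the prefix the target agrees with."""
    out = list(prompt)
    goal = len(prompt) + num_new
    while len(out) < goal:
        # 1. The draft model proposes k tokens autoregressively (cheap).
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft_next(out + proposal))
        # 2. The target model gives its own choice at each of the k + 1 positions.
        #    (Looped here; in practice this is a single parallel pass.)
        verified = [target_next(out + proposal[:i]) for i in range(k + 1)]
        # 3. Accept the longest prefix on which draft and target agree.
        n = 0
        while n < k and proposal[n] == verified[n]:
            n += 1
        out.extend(proposal[:n])
        # 4. Always add one target token: a correction, or a bonus if all k matched.
        out.append(verified[n])
    return out[:goal]


def toy_target(ctx: List[int]) -> int:
    """Assumed toy target model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    """Assumed toy draft model: agrees with the target except after token 4."""
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10
```

Because every emitted token is either confirmed or supplied by the target model, this greedy variant produces exactly the sequence that `greedy_decode` would, which is the sense in which output quality is unchanged. Any speedup depends on how often the draft agrees with the target; sampling-based variants use a probabilistic accept/reject rule that this sketch does not cover.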