JetSpec: Breaking the Scaling Ceiling of Speculative Decoding with Parallel Tree Drafting

An analysis of the constraints that currently limit progress in speculative decoding architectures.

Speculative Decoding, AI, Machine Learning, Scaling Limitations