DeepSeek Releases DSpark, a Speculative Decoding Framework That Accelerates DeepSeek-V4 Per-User Generation 60–85% Over MTP-1

Article automatically generated from technical news.

Most speculative decoding schemes make you pick one: a fast parallel drafter or an accurate sequential one. DeepSeek's DSpark suggests that this is a false choice. DSpark is a speculative decoding framework, not a new model, that attaches a draft module to existing DeepSeek-V4 weights. It pairs a heavy parallel draft backbone with a tiny Markov head that nudges each token's logits using only the previous token (t-1), then schedules how many tokens get verified based on real-time G…
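To make the backbone-plus-Markov-head idea concrete, here is a minimal sketch, not DSpark's actual implementation. It assumes the parallel backbone has already produced one logit vector per draft position, and it models the Markov head as a plain lookup table of biases indexed by the previous token. The names `draft_with_markov_head`, `backbone_logits`, and `markov_bias`, the table form of the head, and the greedy token choice are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def draft_with_markov_head(
    backbone_logits: List[List[float]],  # [K][V], assumed computed in one parallel pass
    markov_bias: List[List[float]],      # [V][V], hypothetical table: bias[prev][next]
    last_token: int,                     # last accepted token before the draft window
) -> List[int]:
    """Greedy draft: adjust each position's logits using only the token at t-1."""
    drafted: List[int] = []
    prev = last_token
    for position_logits in backbone_logits:
        # The only sequential work per position is a row lookup and an add.
        adjusted = [l + b for l, b in zip(position_logits, markov_bias[prev])]
        prev = argmax(adjusted)
        drafted.append(prev)
    return drafted


# Toy vocabulary of 3 tokens, draft window of 2 positions.
backbone_logits = [[2.0, 1.0, 0.0], [0.0, 1.0, 0.9]]
markov_bias = [
    [-3.0, 0.0, 0.0],  # after token 0, discourage repeating token 0
    [0.0, -1.0, 0.5],  # after token 1, favor token 2
    [0.0, 0.0, 0.0],
]
no_bias = [[0.0] * 3 for _ in range(3)]

with_head = draft_with_markov_head(backbone_logits, markov_bias, last_token=0)
without_head = draft_with_markov_head(backbone_logits, no_bias, last_token=0)
print(with_head, without_head)
```

In this toy setup the expensive part (the backbone logits) is position-parallel, while the dependence on the previous token costs one table row per position, which is the trade-off the paragraph describes. How DSpark parameterizes its Markov head, and how it schedules verification length, is not specified in the text above.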
