Speculative Ensemble: Fast Large Language Model Ensemble via Speculation
Jiale Fu*, Yuchu Jiang*, Junkai Chen, Jiaming Fan, Xin Geng, Xu Yang
Published in arXiv, 2025
Speculative Ensemble is a novel framework that accelerates the ensemble of any number of LLMs without sacrificing performance. It achieves a 1.11x-2.23x speedup over standard ensemble techniques on two-model or three-model ensembles.
Recommended citation: Fu J, Jiang Y, Chen J, et al. Speculative Ensemble: Fast Large Language Model Ensemble via Speculation[J]. arXiv preprint arXiv:2502.01662, 2025.