Balancing Fast Responses and Fair Queuing
A 10,000-token request takes 20 seconds. Behind it, a hundred 50-token requests wait. Is that fair? What even is fair in LLM serving?