Calculating If Quantization Pays Off
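Quantization saves memory. But does it lower cost per token? The ROI depends on whether your workload is memory-bound or compute-bound. Memory-bound serving gets cheaper because each token moves fewer bytes; compute-bound serving mostly does not.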
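One way to make the memory-bound versus compute-bound question concrete is a back-of-the-envelope roofline estimate. The sketch below is illustrative only: the `Hardware` numbers, the 7B parameter count, and the helper names are hypothetical, and the model assumes weights are read once per decode step, roughly 2 FLOPs per parameter per token, and no KV-cache traffic or dequantization overhead. Real kernels will deviate from this.

```python
from dataclasses import dataclass


@dataclass
class Hardware:
    peak_flops: float     # FLOP/s (hypothetical accelerator)
    mem_bandwidth: float  # bytes/s
    cost_per_hour: float  # USD


def step_time(params, bytes_per_param, batch_size, hw):
    """Roofline estimate for one decode step.

    Returns (seconds, memory_bound). Assumes all weights are read once per
    step and each token costs about 2 FLOPs per parameter.
    """
    mem_time = params * bytes_per_param / hw.mem_bandwidth
    compute_time = 2 * params * batch_size / hw.peak_flops
    return max(mem_time, compute_time), mem_time >= compute_time


def cost_per_million_tokens(params, bytes_per_param, batch_size, hw):
    """Return (USD per 1M generated tokens, memory_bound) under the roofline model."""
    seconds, memory_bound = step_time(params, bytes_per_param, batch_size, hw)
    tokens_per_second = batch_size / seconds
    cost = hw.cost_per_hour / 3600 / tokens_per_second * 1e6
    return cost, memory_bound


# Illustrative numbers only, not measurements of any real device.
hw = Hardware(peak_flops=1e14, mem_bandwidth=1e12, cost_per_hour=2.0)
params = 7e9

for batch_size in (1, 512):
    fp16_cost, fp16_mem = cost_per_million_tokens(params, 2.0, batch_size, hw)
    int4_cost, int4_mem = cost_per_million_tokens(params, 0.5, batch_size, hw)
    regime = "memory-bound" if fp16_mem else "compute-bound"
    print(f"batch={batch_size} ({regime} at FP16): "
          f"FP16 ${fp16_cost:.4f}/M tok, INT4 ${int4_cost:.4f}/M tok, "
          f"saving {fp16_cost / int4_cost:.2f}x")
```

Under these assumed numbers, batch size 1 is memory-bound and 4-bit weights cut the estimated cost per token by about 4x, while batch size 512 is compute-bound and the estimate does not change at all. The batch size at which the regime flips depends on the ratio of peak FLOPs to memory bandwidth, so substitute your own hardware figures before drawing conclusions.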