When to Use AWQ vs GPTQ
Both quantize model weights to INT4. AWQ is faster to quantize. GPTQ sometimes retains more accuracy. When does each win?
2 posts tagged with "comparison"
H100 spot at $0.15 per 1M tokens. A100 on-demand at $0.40 per 1M tokens. API at $1.00 per 1M tokens. Here's the full comparison.
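To put those per-token prices in context, here is a minimal sketch that turns them into a monthly bill. The three prices come from the summary above; the 500M tokens/month volume and the helper names (`monthly_cost`, `compare`) are hypothetical and only there for illustration. It also assumes the quoted prices hold flat at that volume, which may not be true once GPU utilization or spot interruptions come into play.

```python
# Per-1M-token prices quoted in the summary above (USD).
PRICE_PER_1M = {
    "H100 spot": 0.15,
    "A100 on-demand": 0.40,
    "API": 1.00,
}


def monthly_cost(tokens_per_month: int, price_per_1m: float) -> float:
    """Cost in USD for a given monthly token volume at a flat per-1M price."""
    return tokens_per_month / 1_000_000 * price_per_1m


def compare(tokens_per_month: int) -> dict[str, float]:
    """Monthly cost for each option, rounded to cents."""
    return {
        name: round(monthly_cost(tokens_per_month, price), 2)
        for name, price in PRICE_PER_1M.items()
    }


if __name__ == "__main__":
    volume = 500_000_000  # hypothetical workload: 500M tokens per month
    for name, cost in compare(volume).items():
        print(f"{name:>15}: ${cost:,.2f}/month")
```

At these prices the gap scales linearly with volume, so the ratio between options stays the same and only the absolute savings grow.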