A Year of LLM Inference: Lessons Learned
Looking back at what we learned deploying LLM inference in production. What worked, what didn't, and what we'd do differently.
6 posts tagged with "production"
Fine-tuning a model is the easy part. Running it in production with checkpoints, evals, rollback, and serving is the hard part. Here's the full picture.
Evals at Anthropic, OpenAI, and Google aren't afterthoughts. They're gating functions that block releases. Every prompt change triggers the full suite.
The gap between 'works on my laptop' and 'survives production' is filled with timeouts, retries, fallbacks, and rate limits. Here's the checklist.
Serving an LLM with raw PyTorch is 3-5x slower than an optimized inference stack. Here's where the gap comes from and how to close it.
12 things to check before your LLM goes to production. Most teams skip at least half. That's how incidents happen.