Understand Your LLM Inference Topology
Reconstruct the complete inference map from your codebase. See models, vendors, runtimes, hardware, costs, and performance—all without telemetry or vendor lock-in.
One Command. Complete Visibility.
No configuration. No telemetry. No vendor lock-in. Just run one command and get your complete inference StackMap.
Everything You Need to Optimize Inference
PeakInfer gives you the intelligence layer that sits above all vendors, runtimes, and hardware.
StackMap Knowledge Graph
Reconstruct complete inference topology from code. Maps models, vendors, runtimes, hardware, and dataflows into a canonical knowledge graph.
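For illustration only, a fragment of such a graph might look like the sketch below; the field names, node kinds, and example values are assumptions, not PeakInfer's actual StackMap schema.

```typescript
// Illustrative sketch only: shapes and identifiers are assumptions,
// not PeakInfer's actual StackMap schema.
interface StackMapNode {
  id: string;                                // e.g. "model:gpt-4o"
  kind: "callsite" | "model" | "vendor" | "runtime" | "hardware";
  label: string;                             // human-readable name
  metadata: Record<string, string | number>; // pricing, file path, region, ...
}

interface StackMapEdge {
  from: string; // node id
  to: string;   // node id
  relation: "calls" | "hosted-by" | "runs-on" | "routes-to";
}

// A tiny fragment: one hypothetical callsite routed to gpt-4o on OpenAI.
const nodes: StackMapNode[] = [
  { id: "callsite:src/agent.ts:42", kind: "callsite", label: "summarize()", metadata: { file: "src/agent.ts" } },
  { id: "model:gpt-4o", kind: "model", label: "gpt-4o", metadata: { inputPerMTok: 2.5, outputPerMTok: 10 } },
  { id: "vendor:openai", kind: "vendor", label: "OpenAI", metadata: {} },
];

const edges: StackMapEdge[] = [
  { from: "callsite:src/agent.ts:42", to: "model:gpt-4o", relation: "calls" },
  { from: "model:gpt-4o", to: "vendor:openai", relation: "hosted-by" },
];
```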
Pricing Delta Engine
Real-time pricing intelligence across all providers. Track cost deltas, spot pricing, and find the best deals for your architecture.
Static Code Analysis
Detects LLM calls, routing logic, retry patterns, batching, caching, and more across Python, TypeScript, Go, and Java.
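For example, a TypeScript callsite like the one below, written against the official openai npm package, is the kind of pattern this sort of analysis surfaces, along with the retry loop wrapped around it. The function, model choice, and retry logic are illustrative, not output from PeakInfer.

```typescript
// Example TypeScript callsite of the kind a static analyzer would flag.
// Uses the official "openai" npm package; everything else is illustrative.
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

export async function summarize(text: string): Promise<string> {
  // Simple retry loop: a pattern commonly detected alongside the call itself.
  for (let attempt = 0; attempt < 3; attempt++) {
    try {
      const response = await client.chat.completions.create({
        model: "gpt-4o",
        messages: [{ role: "user", content: `Summarize:\n${text}` }],
      });
      return response.choices[0].message.content ?? "";
    } catch (err) {
      if (attempt === 2) throw err; // give up after the third failure
    }
  }
  return "";
}
```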
Cost Optimization
Identifies hotspots, suggests alternatives, and calculates cost savings. Compare vendors, runtimes, and hardware options.
Multi-Provider Comparison
Compare OpenAI, Anthropic, Together, Fireworks, and 20+ providers side-by-side. See pricing, latency, and performance deltas.
Privacy First
All analysis happens locally. No telemetry. No cloud accounts. Your code never leaves your machine (except for Claude Code SDK analysis).
Visualize Your Inference Stack
See how your codebase connects to models, vendors, runtimes, and hardware in an interactive StackMap.
The Bloomberg Terminal of Inference
Real-time pricing intelligence across all providers. Track deltas, find savings, and optimize costs.
Real-Time Pricing Intelligence
Updated weekly from public sources and community contributions
| Vendor | Model | Input / 1M tokens | Output / 1M tokens | Monthly Cost | Price Delta |
|---|---|---|---|---|---|
| OpenAI | gpt-4o | $2.50 | $10.00 | $890 - $1,290 | 12% |
| Anthropic | claude-sonnet-4 | $3.00 | $15.00 | $210 - $380 | — |
| Together | llama-3-70b | $0.20 | $0.20 | $50 - $70 | 8% |
| Fireworks | llama-3-70b | $0.15 | $0.15 | $38 - $52 | 24% |
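The monthly cost column is an estimate driven by token volume and per-1M-token pricing. A back-of-the-envelope version of that arithmetic, with made-up token volumes for illustration:

```typescript
// Rough monthly cost estimate from per-1M-token pricing.
// Token volumes below are illustrative, not measured values.
function monthlyCost(
  inputTokensPerMonth: number,
  outputTokensPerMonth: number,
  inputPricePerMTok: number,
  outputPricePerMTok: number,
): number {
  return (
    (inputTokensPerMonth / 1_000_000) * inputPricePerMTok +
    (outputTokensPerMonth / 1_000_000) * outputPricePerMTok
  );
}

// e.g. 200M input + 40M output tokens per month on gpt-4o ($2.50 / $10.00 per 1M):
// 200 * 2.50 + 40 * 10.00 = $900/month, within the $890 - $1,290 range above.
console.log(monthlyCost(200_000_000, 40_000_000, 2.5, 10.0)); // 900
```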
Run peakinfer pricing for detailed comparisons.
Built for AI Engineering Teams
Codebase Audits
Understand exactly how LLMs are used across your entire codebase. Find all inference callsites, routing logic, and optimization opportunities.
- Complete inference inventory
- Hotspot identification
- Pattern detection
Cost Optimization
Reduce inference costs by 20-40% through intelligent model selection, vendor comparison, and optimization suggestions.
- Monthly cost estimates
- Alternative provider suggestions
- Batching and caching recommendations
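As a sketch of what a caching recommendation can translate to in practice, here is a deliberately minimal in-memory response cache; a production version would need TTLs, eviction, and usually a shared store.

```typescript
// Minimal illustrative response cache keyed on model + prompt.
// Real deployments would want TTLs, size limits, and a shared store (e.g. Redis).
const cache = new Map<string, string>();

export async function cachedCompletion(
  model: string,
  prompt: string,
  complete: (model: string, prompt: string) => Promise<string>,
): Promise<string> {
  const key = `${model}\u0000${prompt}`;
  const hit = cache.get(key);
  if (hit !== undefined) return hit; // skip the paid call on repeated prompts

  const result = await complete(model, prompt);
  cache.set(key, result);
  return result;
}
```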
PR Reviews
GitHub Actions integration shows StackMap changes, pricing deltas, and optimization opportunities in every PR.
- Automatic PR comments
- Cost regression detection
- Team-wide visibility
Architecture Planning
Make informed decisions about vendors, runtimes, and hardware. Compare options side-by-side with real pricing data.
- Vendor-agnostic comparisons
- Hardware cost modeling
- Runtime efficiency analysis
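For the hardware cost modeling above, the core arithmetic is converting a GPU rental price and sustained throughput into an effective cost per 1M tokens. The GPU price and throughput in this sketch are placeholders, not benchmark results.

```typescript
// Effective $ per 1M tokens for self-hosted inference.
// hourlyGpuCost and tokensPerSecond are placeholder inputs, not benchmarks.
function costPerMillionTokens(hourlyGpuCost: number, tokensPerSecond: number): number {
  const tokensPerHour = tokensPerSecond * 3600;
  return (hourlyGpuCost / tokensPerHour) * 1_000_000;
}

// e.g. a $2.50/hr GPU sustaining 1,000 tokens/s:
// 2.50 / 3,600,000 * 1,000,000 ≈ $0.69 per 1M tokens
console.log(costPerMillionTokens(2.5, 1000).toFixed(2)); // "0.69"
```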
Ready to Optimize Your Inference?
Get started with PeakInfer in seconds. No signup required. No cloud accounts. Just one command.
$ npm install -g peakinfer
$ peakinfer analyze .