Knowing Which Feature Burns Money

Your invoice says $47,000. Your product has chat, search, summarization, and code generation. Which feature is responsible for $30,000 of that?

Most teams can't answer this question. They optimize blindly, shaving tokens from prompts that don't matter while ignoring the feature that's hemorrhaging money.

The Attribution Gap

LLM costs look simple: input tokens times the input rate, plus output tokens times the output rate. But tracing those tokens back to product decisions requires infrastructure that most teams skip.

# What most teams do
response = client.chat.completions.create(
    model="gpt-4",
    messages=messages
)

# What you need
response = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    user=f"feature:{feature_id}|user:{user_id}|session:{session_id}"
)

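That user field can serve as your attribution hook: every request gets tagged with the feature that triggered it. The field is meant as an end-user identifier, so do not rely on the provider to break the invoice down by it. Log the same tag on your side, next to the token counts returned with each response.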
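Because the tag is a delimited string, it helps to build and parse it in one place so the logging side can recover the fields. A minimal sketch, assuming the `feature:<id>|user:<id>|session:<id>` layout shown above; the helper names `build_tag` and `parse_tag` are hypothetical, and IDs are assumed not to contain `|`.

```python
def build_tag(feature_id: str, user_id: str, session_id: str) -> str:
    """Build the attribution tag passed as the `user` argument."""
    return f"feature:{feature_id}|user:{user_id}|session:{session_id}"


def parse_tag(tag: str) -> dict[str, str]:
    """Split a tag back into its fields; assumes IDs contain no '|'."""
    fields = {}
    for part in tag.split("|"):
        # partition splits on the first ':' only, so values may contain ':'
        key, _, value = part.partition(":")
        fields[key] = value
    return fields


tag = build_tag("search", "u_123", "s_456")
fields = parse_tag(tag)
print(fields["feature"])  # prints "search"
```

On the logging side, the token counts for a chat completion are available on the response's `usage` object (`prompt_tokens` and `completion_tokens` in the OpenAI Python client) and can be stored alongside the parsed fields.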

Building the Attribution Pipeline

from dataclasses import dataclass

@dataclass
class CostEvent:
    feature_id: str
    user_id: str
    input_tokens: int
    output_tokens: int
    model: str
    timestamp: float  # Unix epoch seconds

def calculate_cost(event: CostEvent) -> float:
    """Return the cost of one request in USD."""
    # Model-specific pricing in USD per 1,000 tokens (as of writing)
    pricing = {
        "gpt-4": {"input": 0.03, "output": 0.06},
        "gpt-4-turbo": {"input": 0.01, "output": 0.03},
        "gpt-3.5-turbo": {"input": 0.0005, "output": 0.0015},
        "claude-3-opus": {"input": 0.015, "output": 0.075},
        "claude-3-sonnet": {"input": 0.003, "output": 0.015},
    }

    # Unknown models fall back to the gpt-4-turbo rates; update the table
    # when you add a model so this default does not hide a pricing gap.
    rates = pricing.get(event.model, {"input": 0.01, "output": 0.03})
    input_cost = (event.input_tokens / 1000) * rates["input"]
    output_cost = (event.output_tokens / 1000) * rates["output"]

    return input_cost + output_cost
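
As a quick sanity check on the arithmetic, here is a worked example using the `gpt-4` rates from the pricing table, which are assumed to be USD per 1,000 tokens. The request size (2,000 input tokens, 500 output tokens) is made up for illustration.

```python
# Rates copied from the pricing table (assumed USD per 1,000 tokens).
GPT4_RATES = {"input": 0.03, "output": 0.06}

input_tokens = 2_000   # hypothetical request size
output_tokens = 500

input_cost = (input_tokens / 1000) * GPT4_RATES["input"]
output_cost = (output_tokens / 1000) * GPT4_RATES["output"]
total = round(input_cost + output_cost, 4)

# prints: input=$0.06 output=$0.03 total=$0.09
print(f"input=${input_cost:.2f} output=${output_cost:.2f} total=${total:.2f}")
```

Nine cents per call looks harmless until it is multiplied by request volume, which is why the per-feature aggregation matters.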

Log every request along with its computed cost. Aggregate by feature. The queries write themselves:

SELECT
    feature_id,
    SUM(input_tokens) as total_input,
    SUM(output_tokens) as total_output,
    SUM(cost) as total_cost,
    COUNT(*) as request_count
FROM cost_events
WHERE timestamp > NOW() - INTERVAL '7 days'
GROUP BY feature_id
ORDER BY total_cost DESC;
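
To make the log-then-aggregate step concrete, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The `cost_events` schema, the `log_event` helper, and the sample rows are assumptions for illustration. The seven-day filter is left out because the query above uses PostgreSQL-style date arithmetic, which SQLite spells differently.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE cost_events (
        feature_id TEXT NOT NULL,
        user_id TEXT NOT NULL,
        model TEXT NOT NULL,
        input_tokens INTEGER NOT NULL,
        output_tokens INTEGER NOT NULL,
        cost REAL NOT NULL,
        timestamp REAL NOT NULL
    )
    """
)


def log_event(feature_id, user_id, model, input_tokens, output_tokens, cost):
    """Insert one request's usage; cost is computed by the caller."""
    conn.execute(
        "INSERT INTO cost_events VALUES (?, ?, ?, ?, ?, ?, ?)",
        (feature_id, user_id, model, input_tokens, output_tokens, cost, time.time()),
    )


# Hypothetical sample events.
log_event("chat", "u1", "gpt-4", 2000, 500, 0.09)
log_event("chat", "u2", "gpt-4", 1000, 1000, 0.09)
log_event("search", "u1", "gpt-3.5-turbo", 4000, 200, 0.0023)

rows = conn.execute(
    """
    SELECT feature_id, SUM(cost) AS total_cost, COUNT(*) AS request_count
    FROM cost_events
    GROUP BY feature_id
    ORDER BY total_cost DESC
    """
).fetchall()

for feature_id, total_cost, request_count in rows:
    print(feature_id, round(total_cost, 4), request_count)
```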

What You'll Find

Teams that implement attribution typically discover:

  • One feature consumes 60-80% of total spend
  • That feature often isn't the most used, just the most expensive per call
  • "Small" features with long system prompts can be surprisingly costly

A real example: a customer support product found that its "smart routing" feature (deciding which agent handles a ticket) was using GPT-4 for a job a fine-tuned small model could do. Cost: $12K/month for a feature that could run on $200/month.
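
One way to see the gap between "most used" and "most expensive" is to compute each feature's share of spend next to its cost per call. A small sketch over already-aggregated rows; the numbers are invented for illustration, the function name `summarize` is hypothetical, and the input is assumed to be non-empty with non-zero request counts.

```python
def summarize(rows):
    """rows: list of (feature_id, total_cost, request_count) tuples."""
    grand_total = sum(cost for _, cost, _ in rows)
    summary = []
    for feature_id, cost, count in rows:
        summary.append({
            "feature": feature_id,
            "share": cost / grand_total,
            "cost_per_call": cost / count,
        })
    # Most expensive per call first.
    return sorted(summary, key=lambda r: r["cost_per_call"], reverse=True)


# Invented weekly totals: (feature, total cost in USD, request count).
rows = [
    ("chat", 700.0, 100_000),
    ("routing", 2800.0, 20_000),
    ("search", 500.0, 50_000),
]

result = summarize(rows)
for r in result:
    print(f'{r["feature"]:8} share={r["share"]:.0%} per_call=${r["cost_per_call"]:.4f}')
```

In this made-up data, routing handles the fewest requests yet accounts for 70% of spend, which is the pattern the list above describes.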

Per-Feature Budgets

Once you have attribution, you can set budgets:

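import logging

# Daily limits in USD; alert_at is the fraction of the limit that triggers a warning.
FEATURE_BUDGETS = {
    "chat": {"daily_limit": 1000.00, "alert_at": 0.8},
    "search": {"daily_limit": 200.00, "alert_at": 0.8},
    "summarization": {"daily_limit": 500.00, "alert_at": 0.8},
}

class FeatureBudgetExceeded(Exception):
    """Raised when a feature's accumulated daily cost passes its limit."""

def alert_approaching_limit(feature_id: str, accumulated_cost: float) -> None:
    # Placeholder: swap in your own paging or chat alert.
    logging.warning("%s at $%.2f, approaching daily limit", feature_id, accumulated_cost)

def check_budget(feature_id: str, accumulated_cost: float) -> None:
    budget = FEATURE_BUDGETS.get(feature_id)
    if not budget:
        return  # No budget set

    if accumulated_cost > budget["daily_limit"]:
        raise FeatureBudgetExceeded(feature_id)

    if accumulated_cost > budget["daily_limit"] * budget["alert_at"]:
        alert_approaching_limit(feature_id, accumulated_cost)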

This prevents any single feature from running away with your budget. More importantly, it forces conversations about whether a feature is worth its cost.
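
The budget check needs a running daily total per feature. A minimal in-memory sketch of that bookkeeping, assuming a single process; the class name `DailySpendTracker` is hypothetical, and a real deployment would likely keep the totals in a shared store instead.

```python
from collections import defaultdict
from datetime import date
from typing import Optional


class DailySpendTracker:
    """Accumulates cost per (day, feature) in memory."""

    def __init__(self):
        self._totals = defaultdict(float)

    def add(self, feature_id: str, cost: float, day: Optional[date] = None) -> float:
        """Record a request's cost and return the feature's total for that day."""
        day = day or date.today()
        self._totals[(day, feature_id)] += cost
        return self._totals[(day, feature_id)]


tracker = DailySpendTracker()
day_one = date(2024, 1, 1)
day_two = date(2024, 1, 2)

tracker.add("chat", 400.0, day_one)
total = tracker.add("chat", 450.0, day_one)   # second request on the same day
fresh = tracker.add("chat", 10.0, day_two)    # new day, so the total starts over
print(total, fresh)  # prints: 850.0 10.0
```

The value returned by `add` is what you would pass as `accumulated_cost` to `check_budget` after each request.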

The User Dimension

Attribution by feature is level one. Level two adds user segmentation, which means recording each user's tier on the event or joining against your users table:

SELECT
    feature_id,
    user_tier,  -- 'free', 'pro', 'enterprise'
    AVG(cost) as avg_cost_per_request,
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY cost) as p95_cost
FROM cost_events
WHERE timestamp > NOW() - INTERVAL '7 days'
GROUP BY feature_id, user_tier;

You might find that free users generate 40% of your costs but 5% of revenue. That's not a technical problem. That's a business problem that technical data surfaced.
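
That comparison is simple to compute once costs are grouped by tier. A sketch with invented figures chosen to reproduce the 40% / 5% split mentioned above; the revenue numbers are assumed to come from your billing system, and the helper name `shares` is hypothetical.

```python
def shares(amounts: dict[str, float]) -> dict[str, float]:
    """Convert absolute amounts into fractions of the total."""
    total = sum(amounts.values())
    return {k: v / total for k, v in amounts.items()}


# Invented monthly figures in USD.
cost_by_tier = {"free": 18_800, "pro": 14_100, "enterprise": 14_100}
revenue_by_tier = {"free": 10_000, "pro": 90_000, "enterprise": 100_000}

cost_share = shares(cost_by_tier)
revenue_share = shares(revenue_by_tier)

for tier in cost_by_tier:
    print(f"{tier:10} cost={cost_share[tier]:.0%} revenue={revenue_share[tier]:.0%}")
```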

What Attribution Enables

Without attribution, cost optimization is guesswork. With it:

  • Engineers know which prompts to optimize
  • Product knows which features are profitable
  • Finance can forecast based on feature usage
  • Support can identify cost anomalies per customer

The $47K invoice becomes: chat ($28K), search ($8K), summarization ($7K), code gen ($4K). Now you know where to look.

Start tagging requests today. The data you collect now is the data you'll need when the bill doubles.