For FinOps & cost owners
Hard-code the biggest model and the biggest box for everything, and you overpay on every routine task — with no idea who spent what. cerver makes every session queryable, and matches the resource to the need, not the worst case.
Without cerver: your AI bill, this month
10 charges · by team? · by verdict? — no idea.
With cerver: same spend, queryable
01 — The trap
Most AI setups hard-code the biggest model and the biggest machine, then run every task through it — the one-line rename and the overnight migration, billed the same. Worse, the spend lands as a single monthly total: you can't see which user, app, or agent ran it up, so the only lever left is a blunt cap.
02 — Query the sessions
Every piece of work is a session, and every session is queryable — cost, tokens, model, and compute bound together. Group by user, by app, by agent, by day. The runaway that's been looping all night stops hiding in the aggregate.
Rank spend by person and team. See who's driving the bill — and what they actually got for it.
Attribute every dollar to the app, agent, or task that spent it, not a monthly lump nobody can decompose.
Query running sessions over a threshold and kill the one that's looping — before it bills, not after.
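A minimal sketch of the two queries above, run client-side over session records that have already been fetched. The field names (`user`, `status`, `cost_cents`) and the sample data are hypothetical, chosen for illustration; they are not cerver's documented schema.

```python
from collections import defaultdict

# Hypothetical session records; every field name here is an assumption.
SESSIONS = [
    {"id": "s1", "user": "ana", "status": "done", "cost_cents": 120},
    {"id": "s2", "user": "ben", "status": "running", "cost_cents": 740},
    {"id": "s3", "user": "ana", "status": "running", "cost_cents": 35},
    {"id": "s4", "user": "ben", "status": "done", "cost_cents": 60},
]

def rank_spend(sessions, key="user"):
    """Sum cost per group and return (group, cents) pairs, highest spend first."""
    totals = defaultdict(int)
    for s in sessions:
        totals[s[key]] += s["cost_cents"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

def runaways(sessions, over_cents):
    """Return ids of sessions that are still running and already over the threshold."""
    return [
        s["id"]
        for s in sessions
        if s["status"] == "running" and s["cost_cents"] > over_cents
    ]

print(rank_spend(SESSIONS))                # [('ben', 800), ('ana', 155)]
print(runaways(SESSIONS, over_cents=500))  # ['s2']
```

Passing `key="app"` or `key="agent"` would group the same records along a different axis, provided the records carry that field.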
03 — Match the resource
Because the session is model- and compute-agnostic, you route each task to the resource it actually needs: a cheap model and a small box for routine work, the frontier model only for the tail that earns it. You stop paying flagship rates for everything by reflex.
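One way to picture "cheap unless it needs more" is a default-low routing rule that escalates only on explicit signals. The sketch below is an illustration, not cerver's routing logic: the `Task` fields and the thresholds are hypothetical, and the tier names are borrowed from the illustrative pricing on this page.

```python
from dataclasses import dataclass

@dataclass
class Task:
    description: str
    files_touched: int     # hypothetical size signal supplied by the caller
    needs_reasoning: bool  # hypothetical flag supplied by the caller

def pick_tier(task: Task) -> str:
    """Start from the cheapest tier and escalate only when the task signals need."""
    if task.needs_reasoning and task.files_touched > 20:
        return "frontier"
    if task.needs_reasoning or task.files_touched > 5:
        return "mid"
    return "open-source"

print(pick_tier(Task("one-line rename", 1, False)))        # open-source
print(pick_tier(Task("overnight migration", 200, True)))   # frontier
```

The thresholds are placeholders; in practice they would be tuned against the per-model session costs you can already query.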
GET /v2/sessions?group=user&since=30d → cost per user, ranked · who spent what
GET /v2/sessions?status=running&over=5 → the runaways, live · gate before it bills
POST /v2/sessions/:id/run-llm model:auto → right-size per task · cheap unless it needs more
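As a rough sketch, the three requests above could be assembled like this. The paths and query parameters are the ones shown; the host `cerver.example` is a placeholder, and sending `model` as a JSON body field is an assumption. No request is actually sent.

```python
from urllib.parse import urlencode

BASE_URL = "https://cerver.example"  # placeholder host, not a real endpoint

def sessions_url(**params):
    """Build a GET /v2/sessions URL from query parameters."""
    return f"{BASE_URL}/v2/sessions?{urlencode(params)}"

def run_llm_request(session_id, model="auto"):
    """Describe the POST as (method, url, body); the body shape is an assumption."""
    url = f"{BASE_URL}/v2/sessions/{session_id}/run-llm"
    return ("POST", url, {"model": model})

print(sessions_url(group="user", since="30d"))
print(sessions_url(status="running", over=5))
print(run_llm_request("abc123"))
```

The unit of the `over` threshold is not stated on this page, so the value is passed through unchanged.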
Today: 10,000 sessions · all on the frontier model
Right-sized: same 10,000 sessions · matched to the task
Each square is 50 sessions. Illustrative per-session averages: frontier $0.40, mid $0.12, open-source $0.03. Your real mix comes from your real sessions, queryable by model.
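To make the comparison concrete, here is the arithmetic for 10,000 sessions at the illustrative per-session averages. The right-sized split (50% frontier, 30% mid, 20% open-source) is a hypothetical mix picked for the example, not a measured one.

```python
# Illustrative per-session averages from this page, in cents to keep the sums exact.
COST_CENTS = {"frontier": 40, "mid": 12, "open-source": 3}
TOTAL_SESSIONS = 10_000

def total_cents(mix):
    """mix maps tier name -> number of sessions routed to that tier."""
    return sum(COST_CENTS[tier] * count for tier, count in mix.items())

today = {"frontier": TOTAL_SESSIONS}
# Hypothetical right-sized mix; substitute the split from your own sessions.
right_sized = {"frontier": 5_000, "mid": 3_000, "open-source": 2_000}

before = total_cents(today)
after = total_cents(right_sized)

print(f"today:       ${before / 100:,.2f}")                        # $4,000.00
print(f"right-sized: ${after / 100:,.2f}")                         # $2,420.00
print(f"saved:       {100 * (before - after) / before:.1f}%")      # 39.5%
```

Under this assumed mix, half the sessions still run on the frontier model and the total still drops from $4,000 to $2,420.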
04 — Own the record
A token bill is a black box: you get a number and a "trust us." When every session is a transcript you own, that changes. You hold the exact record of what was sent and what came back — token for token — so a charge that looks wrong becomes a charge you can actually dispute, with receipts instead of a shrug. Owning your transcripts isn't just good hygiene; it's the only leverage you have when the meter and the invoice disagree.
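A sketch of what disputing with receipts might look like: recount tokens from the transcripts you hold and compare them with what was invoiced. The transcript shape (`tokens_in`, `tokens_out` per turn) and the invoice format are hypothetical, as is the sample data.

```python
def transcript_tokens(transcript):
    """Total tokens sent and received across every turn of one session."""
    return sum(turn["tokens_in"] + turn["tokens_out"] for turn in transcript)

def find_discrepancies(transcripts, invoiced_tokens):
    """Return sessions whose invoiced token count differs from the transcript count."""
    flagged = {}
    for session_id, billed in invoiced_tokens.items():
        counted = transcript_tokens(transcripts.get(session_id, []))
        if counted != billed:
            flagged[session_id] = {
                "counted": counted,
                "billed": billed,
                "delta": billed - counted,
            }
    return flagged

# Hypothetical data: transcripts you own versus the provider's invoice lines.
transcripts = {
    "s1": [{"tokens_in": 1200, "tokens_out": 300}, {"tokens_in": 1500, "tokens_out": 450}],
    "s2": [{"tokens_in": 800, "tokens_out": 200}],
}
invoiced_tokens = {"s1": 3450, "s2": 1400}

print(find_discrepancies(transcripts, invoiced_tokens))
# {'s2': {'counted': 1000, 'billed': 1400, 'delta': 400}}
```

A session that appears on the invoice with no transcript on file shows up with `counted` equal to 0, which is itself worth a question.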
05 — Save where you can
Across the teams we benchmark, a third to nearly half of AI-coding spend is avoidable — oversized routing, looping agents, work that never needed the frontier model. Query it, match it, gate it. Permission to run lives in one place; the cost of every run lives right next to it.
06 — Install
curl -fsSL https://cerver.ai/install.sh | bash
Bring your own compute. cerver's price is a flat $2 / 1M tokens, visible per session — so the bill is something you forecast, not absorb.