The session is the unit.

Ask a team what they spend on AI coding and you'll get a monthly number. Ask what that number bought and the room goes quiet. The monthly total is real, but it's the wrong altitude — it's the sum of thousands of separate pieces of work, flattened into one figure you can't act on. You can cap it, but you can't steer it.

The fix isn't a better dashboard on top of the total. It's a smaller unit underneath it.

Watch one session

A session is one task moving through a model and a machine. A prompt enters, a harness reasons over it, compute does the work, a result comes back — and every token along the way is counted. Here's one, live:

[Animation: a live session counter showing tokens and cost at $2 / 1M as the task moves through three stages. Prompt: the task enters. Harness + model: Claude / Codex reasons. Compute → result: work runs, cost lands.]

One task, one transcript, its cost attached. Illustrative — token flow and price are modeled at cerver's flat $2 / 1M.

That's the whole idea: the cost isn't a line on a monthly invoice anymore — it's a property of the work itself. The $1,400 migration and the $40 overnight loop stop looking the same, because each one carries its own number.
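To make "cost as a property of the work" concrete, here is a minimal Python sketch. The `Session` record and the token counts are assumptions made for illustration (the counts are simply back-computed from the $1,400 and $40 figures above at the flat $2 / 1M rate); this is not cerver's actual data model.

```python
from dataclasses import dataclass

PRICE_PER_MILLION_TOKENS = 2.00  # flat rate quoted in the article, in USD


@dataclass
class Session:
    """One task and its token count. Hypothetical record for illustration."""
    task: str
    tokens: int

    @property
    def cost_usd(self) -> float:
        # Cost is derived from the session itself, not read off an invoice.
        return self.tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS


# Assumed token counts, chosen so the costs match the article's two examples.
migration = Session(task="migration", tokens=700_000_000)
overnight_loop = Session(task="overnight loop", tokens=20_000_000)

for s in (migration, overnight_loop):
    print(f"{s.task}: {s.tokens:,} tokens -> ${s.cost_usd:,.2f}")
```

Each session carries its own number, so the two pieces of work can be compared directly instead of disappearing into one total.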

Now look at ten thousand of them

Here's why the unit matters. When you actually price work session by session, the cost isn't a single figure. It's a distribution with a long tail: most sessions are nearly free, and a thin minority cost real money. The monthly total is just every session on this curve added together, and the per-seat price is one blurry point on it:

[Chart: long-tail distribution of cost per session, marking the typical sessions, the expensive tail, the median, and the mean.]

Illustrative distribution across ~10,000 agentic sessions. The mean sits well to the right of the median: the tail is where the money actually hides.

This is the picture a monthly total erases. Mean and median are far apart, which means a handful of sessions dominate the bill. You don't fix that with a ceiling — a cap punishes the cheap sessions and the one migration alike. You fix it by being able to see the tail and route around it.
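A rough way to see the same gap in numbers is to simulate it. The Python sketch below draws 10,000 hypothetical session costs from a log-normal distribution (the distribution and its parameters are assumptions, not measurements of any account) and compares the mean, the median, and the share of spend that sits in the top 1% of sessions.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical per-session costs in USD. Log-normal is an assumed shape:
# most draws are tiny and a few are very large.
costs = [rng.lognormvariate(-3.0, 2.0) for _ in range(10_000)]


def tail_share(values, top_fraction):
    """Fraction of total spend that comes from the most expensive sessions."""
    ranked = sorted(values, reverse=True)
    k = max(1, int(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)


mean_cost = statistics.mean(costs)
median_cost = statistics.median(costs)
top_1pct = tail_share(costs, 0.01)

print(f"median session: ${median_cost:.4f}")
print(f"mean session:   ${mean_cost:.4f}")
print(f"top 1% of sessions account for {top_1pct:.0%} of spend")
```

A cap only sees the sum of `costs`. The `tail_share` figure is what tells you whether a handful of sessions are driving that sum.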

Why the unit changes everything

Once the session is the unit, the moves you couldn't make before become obvious:

Attribution. Cost per task, not per month — so you know which spend earned its keep and which was a runaway loop.
Routing. The session is model-agnostic, so cheap, high-volume work goes to a cheap-good-enough model and the frontier model is reserved for the tail that needs it.
Comparison. Run the same prompt across Claude and Codex on one session, keep the winner, and see what each cost to get there.
Predictability. Priced per session at a flat $2 / 1M tokens, the bill stops being a monthly surprise and becomes a number you can forecast.
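As a sketch of the first two moves, the Python below attributes cost per task and applies a simple routing rule. The session log, the task labels, the 1M-token threshold, and the `cheap-model` / `frontier-model` names are all hypothetical placeholders; a real router would likely use richer signals than an estimated token count.

```python
from collections import defaultdict

PRICE_PER_MILLION_TOKENS = 2.00  # flat rate quoted in the article, in USD

# Hypothetical session log: (session id, task label, tokens used).
SESSIONS = [
    ("s1", "lint-fix", 40_000),
    ("s2", "lint-fix", 55_000),
    ("s3", "schema-migration", 700_000_000),
    ("s4", "test-scaffold", 120_000),
]


def cost_usd(tokens):
    return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS


def cost_by_task(sessions):
    """Attribution: total spend per task label rather than per month."""
    totals = defaultdict(float)
    for _session_id, task, tokens in sessions:
        totals[task] += cost_usd(tokens)
    return dict(totals)


def route(estimated_tokens, threshold=1_000_000):
    """Routing: send small jobs to a cheap model, keep the frontier for the tail.

    The threshold and the model names are placeholders, not recommendations.
    """
    return "frontier-model" if estimated_tokens >= threshold else "cheap-model"


for task, total in sorted(cost_by_task(SESSIONS).items(), key=lambda kv: -kv[1]):
    print(f"{task}: ${total:,.2f}")

for session_id, task, tokens in SESSIONS:
    print(f"{session_id} ({task}) -> {route(tokens)}")
```

Because both functions work on individual sessions, neither needs to know which model or harness produced the tokens.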

The takeaway

Every governance problem in AI spend right now — Uber's cap, Copilot's token-bill shock, Microsoft throttling Claude Code — is the same problem wearing different clothes: the unit was too big to see. Drop to the session and the question flips from "how do we spend less?" to "which spend is working?" — the only question worth asking when the work is this valuable.

A note on the visuals: the session animation and the cost distribution are illustrative models, not measurements of a specific account. Token flow and pricing are shown at cerver's flat $2 / 1M tokens; the distribution is a representative long tail across a modeled ~10k agentic sessions.

See your own sessions — and your own tail.