Infrastructure for AI agents.

cerver is the thin layer under your AI: every session carries its transcript, model, compute, and cost as one object you control. Start online in seconds — attach your own machines when the work needs them.

See the report

no card · $5 free tier · local runs on the subscriptions you already pay

01 — The mix

Stop running everything on the most expensive model.

Most fleets run 100% frontier by reflex. Your work isn't 100% frontier-shaped: about a quarter earns the best model, the rest doesn't notice the difference. A routing policy — rules you write, or one line of auto — makes the mix deliberate.

Today · all frontier

10,000 sessions$4,000 / mo

Right-sized · 25 / 35 / 40

same 10,000 sessions$1,540 / mo · −62%
Browse routing policies → What it saves you

02 — Swap underneath

Keep the transcript. Swap the agent and the machine under it.

Same stage, same actor. The harness and the compute are props you can change mid-run — the transcript, the tools, and the identity stay bound.

transcript · sess_42live · 14 turns
harnessclaude
computevercel

03 — Compare

Run the same task on two agents. Stay on whichever wins.

The comparison runs itself, in one session. A real diff — not vibes.

$ cerver run --compare claude codex "add idempotency to the webhook"

  claude / sonnet   passes · retries on 5xx
  codex  / gpt-5   passes · cleaner backoff   ← better

→ you're now on codex for this task.

04 — Anywhere

Start online. Add the machine when the work needs one.

Hosted model sessions work instantly. When the session needs your repo, terminal, tools, or CLI agents, attach the relay and keep the same session boundary.

Models
Claude Opus 4.8
GPT-5
Grok 4
Gemma 4
Runtimes
Claude Code
Codex CLI
OpenAI SDK
xAI
Compute
Vercel
E2B
Cloudflare
Modal
your machine

05 — At scale

Run thousands of agents in parallel. See every one.

Fan out to thousands of sessions at once. Click any agent and watch its session — the chat, the model, the compute, the cost.

Spending limits, on by default

Every account ships with a monthly cap. A runaway agent stops at your number — never at a $17k surprise.

Metered, capped, predictable

$2 per 1M tokens, billed monthly for what you actually used. No usage, $0 invoice — and your cap bounds the bill.

Your subscriptions do the work

On your own machines, sessions ride the Claude Max / ChatGPT plans you already pay for. Marginal cost: ~zero.

06 — Start

Start online first. Then connect your own computer — it runs on the subscriptions you already pay for.

$ curl -fsSL https://cerver.ai/install.sh | bash

Online sessions work instantly, no setup. This one command then connects your machine: sessions run there through your existing Claude Max or ChatGPT plan — no per-token bills — with your repos, tools, Claude Code, and Codex right where they live.

under a minute · $5 free tier