Cerver Agent Protocol
This page describes the current execution gateway surface in api/cerver-api. The field names and envelopes below match the live contract.
Three calls are enough.
Open a session, stream the run, then read the metrics. The gateway owns provider choice and execution visibility.
POST /gateway/sessions
POST /gateway/sessions/:id/run/stream
GET /gateway/sessions/:id/metrics
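The three calls can be sketched as plain request builders. This is a minimal sketch, not an official client: the base URL is borrowed from the /v2 curl examples on this page and may differ for the /gateway routes, bearer-token auth is assumed to apply here as well, and build_request and session_paths are hypothetical helper names.

```python
import json
import urllib.request
from typing import Optional

# Assumption: host borrowed from the /v2 curl examples on this page;
# the /gateway routes may be served from a different base URL.
BASE_URL = "https://gateway.cerver.ai"


def build_request(method: str, path: str, token: str,
                  body: Optional[dict] = None) -> urllib.request.Request:
    """Build, but do not send, an authenticated JSON request."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {token}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


def session_paths(session_id: str) -> dict:
    """Paths for the second and third call, once the first call returned an id."""
    return {
        "run_stream": f"/gateway/sessions/{session_id}/run/stream",
        "metrics": f"/gateway/sessions/{session_id}/metrics",
    }
```

Passing a built request to urllib.request.urlopen would send it. The body of the run call is not specified on this page, so the sketch leaves it to the caller.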
Create a session
{
"task": "boot a preview and run a smoke check",
"workload": "preview",
"repo": {
"name": "branch-monkey",
"framework": "nextjs",
"languages": ["typescript"]
},
"requirements": {
"runtime": "node",
"package_install": true,
"public_preview": true,
"persistence_level": "medium",
"timeout_minutes": 15
},
"policy": {
"mode": "balanced",
"allowed_providers": ["vercel", "cloudflare", "e2b"]
}
}
Session response
{
"session_id": "ses_123",
"session_name": "branch-monkey",
"status": "ready",
"provider": "vercel",
"sandbox_id": "sbx_123",
"metrics": {
"provision_time_ms": 820,
"predicted_startup_ms": 820,
"cost_estimate_usd": 0.123
},
"routing": {
"recommended_provider": "vercel",
"confidence": "high",
"fallback_order": ["cloudflare", "e2b"],
"canary_run": false
}
}
Sessions need a compute. Attach one first.
A new account has no compute attached, so POST /v2/sessions returns 409 with a recommendation report. Pick one path:
Option A — Local relay
One command on the machine you want to expose as compute (laptop, mac mini, server):
curl -fsSL https://kompany.dev/install-cerver.sh | bash
The script installs uv if it is missing, runs the relay, opens a browser to log you in, and registers the host as a private compute on your account. The relay self-updates by polling GitHub, so leaving it running on an always-on machine keeps it current.
Option B — BYO cloud (Vercel, e2b)
Enable a provider with your own credentials in the dashboard at cerver.ai/dashboard/providers, or via API:
curl -X POST https://gateway.cerver.ai/v2/account/providers \
-H "Authorization: Bearer $CERVER_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"provider": "vercel", "credentials": {"vercel_token": "..."}}'
Verify
curl https://gateway.cerver.ai/v2/computes \
-H "Authorization: Bearer $CERVER_API_TOKEN"
# should list at least one compute
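Because a session request on an account without compute returns 409 rather than a generic failure, a client may want to branch on that status explicitly. The sketch below assumes the response body is JSON. The shape of the recommendation report is not documented here, so it is passed through untouched. interpret_create_response is a hypothetical helper name.

```python
import json


def interpret_create_response(status: int, body: str) -> dict:
    """Classify the result of a session-create call by HTTP status."""
    payload = json.loads(body) if body else {}
    if status == 409:
        # No compute attached: surface the recommendation report as-is.
        return {"ok": False, "needs_compute": True, "report": payload}
    if 200 <= status < 300:
        return {"ok": True, "needs_compute": False, "session": payload}
    # Any other status is treated as an ordinary error.
    return {"ok": False, "needs_compute": False, "error": payload}
```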
Bring your own secrets store.
Cerver intentionally does not store user app-secrets — those belong in a tool built for it. The cerver-mcp package ships a secret_fetch(name) tool that gives the agent one uniform interface regardless of backend.
Default: env backend
export BUFFER_API_KEY=...
uvx cerver-mcp
# agent: secret_fetch("BUFFER_API_KEY") → reads from process env
Production: Infisical backend
{
"mcpServers": {
"cerver": {
"command": "uvx",
"args": ["cerver-mcp"],
"env": {
"CERVER_API_TOKEN": "ck_...",
"CERVER_SECRETS_BACKEND": "infisical",
"INFISICAL_TOKEN": "st...",
"INFISICAL_PROJECT_ID": "...",
"INFISICAL_ENVIRONMENT": "prod"
}
}
}
}
With this backend, secrets are audited, rotatable, and never stored on the relay disk. Future backends (1Password, AWS, GCP) plug in the same way.
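The "one uniform interface regardless of backend" idea can be illustrated with a small backend protocol. This is a sketch of the pattern only, not the cerver-mcp implementation: SecretsBackend and EnvBackend are hypothetical names, and this secret_fetch takes an explicit backend argument for clarity, whereas the real tool takes only a name.

```python
import os
from typing import Mapping, Optional, Protocol


class SecretsBackend(Protocol):
    """Anything that can resolve a secret name to its value."""

    def fetch(self, name: str) -> str: ...


class EnvBackend:
    """Reads secrets from a mapping, the process environment by default."""

    def __init__(self, environ: Optional[Mapping[str, str]] = None) -> None:
        self._environ = os.environ if environ is None else environ

    def fetch(self, name: str) -> str:
        try:
            return self._environ[name]
        except KeyError:
            raise KeyError(f"secret {name!r} is not set") from None


def secret_fetch(name: str, backend: SecretsBackend) -> str:
    """Single entry point: the caller never needs to know which backend is behind it."""
    return backend.fetch(name)
```

Another backend would only need to provide the same fetch method, and the calling agent code would stay unchanged.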
Sessions are also transcripts.
Every Cerver session keeps the full turn-by-turn transcript in its transcript[] field. That makes the same primitive a memory layer any agent on the account can read: your cron from yesterday, the code reviewer from last week, or an agent in a sibling app on the same key.
No separate retrieval service to wire up. No vector DB. Just GET /v2/sessions.
From plain HTTP
# list up to 20 sessions (the URL is quoted so the shell does not expand "?")
curl "https://gateway.cerver.ai/v2/sessions?limit=20" \
-H "Authorization: Bearer $CERVER_API_TOKEN"
curl https://gateway.cerver.ai/v2/sessions/SESSION_ID \
-H "Authorization: Bearer $CERVER_API_TOKEN"
# → returns the full session record including transcript[]
From an MCP-aware agent
Drop the API key into the agent's MCP config once. The cerver-mcp package surfaces three tools.
{
"mcpServers": {
"cerver": {
"command": "uvx",
"args": ["cerver-mcp"],
"env": { "CERVER_API_TOKEN": "ck_..." }
}
}
}
First-time vs N-th-time
- First-time agent on a fresh account: cerver_session_list returns an empty list, so start a new session and do the work.
- Agent on an account with prior runs: cerver_session_list returns sibling agents' history. Peek into the relevant sessions, inherit their context, then act.
The session you create afterwards joins the same memory pool that the next agent reads.
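The first-time versus N-th-time decision can be sketched as a pure function over the session list. The field names are partly assumed: session_id appears on this page, but the presence of a task field on listed sessions is an assumption, and plan_next_step with its keyword match is only a hypothetical stand-in for however an agent judges relevance.

```python
def plan_next_step(sessions: list, keyword: str) -> dict:
    """Decide whether to start fresh or inherit context from earlier sessions."""
    if not sessions:
        # Fresh account: nothing to inherit.
        return {"action": "start_fresh", "inherit_from": []}
    # Assumption: each listed session carries "session_id" and "task".
    relevant = [
        s["session_id"]
        for s in sessions
        if keyword.lower() in s.get("task", "").lower()
    ]
    action = "inherit_then_act" if relevant else "start_fresh"
    return {"action": action, "inherit_from": relevant}
```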
What the agent sends
Agents should describe the work, requirements, and policy. Vendor choice remains a gateway concern.
{
"task": "boot a preview and run a smoke check",
"workload": "preview",
"repo": {
"name": "branch-monkey",
"framework": "nextjs",
"languages": ["typescript"],
"signals": ["public preview expected", "package install likely"]
},
"requirements": {
"runtime": "node",
"package_install": true,
"public_preview": true,
"persistence_level": "medium",
"timeout_minutes": 15
},
"policy": {
"mode": "balanced",
"allowed_providers": ["vercel", "cloudflare", "e2b"],
"max_cost_usd": 0.2,
"max_startup_ms": 2500
},
"metadata": {
"importance": "high"
}
}
What Cerver can route today
This section reflects the current implementation, not the long-term roadmap.
How Cerver decides
- Filter out providers that cannot satisfy the hard requirements.
- Score the remaining providers for cost, speed, reliability, and fit.
- Run a canary only when confidence is low or the run matters enough to compare.
- Return one winner, one fallback path, and the reasons behind the choice.
Hard filters
Cerver filters a provider out before scoring if any of these are true:
- The provider is blocked by allowed_providers.
- The request needs live execution and the provider is not executable in Cerver yet.
- The runtime, browser, desktop, preview, timeout, or cost ceilings cannot be met.
- The provider startup estimate exceeds max_startup_ms.
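The hard-filter pass above might look roughly like the following. This is an illustrative sketch, not Cerver's routing code: the Candidate record and its fields are hypothetical, only allowed_providers, max_startup_ms, and max_cost_usd come from the request shape on this page, and the runtime, browser, desktop, preview, and timeout checks are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical per-provider estimate used only for this sketch."""
    name: str
    executable: bool   # can Cerver run live work on this provider today?
    startup_ms: int    # estimated startup time
    cost_usd: float    # estimated cost for the run


def hard_filter(candidates: list, policy: dict,
                needs_live_execution: bool = True) -> list:
    """Drop providers that fail any hard requirement before scoring."""
    allowed = policy.get("allowed_providers")
    max_startup = policy.get("max_startup_ms")
    max_cost = policy.get("max_cost_usd")
    kept = []
    for c in candidates:
        if allowed is not None and c.name not in allowed:
            continue  # blocked by allowed_providers
        if needs_live_execution and not c.executable:
            continue  # not executable in the gateway yet
        if max_startup is not None and c.startup_ms > max_startup:
            continue  # startup estimate exceeds max_startup_ms
        if max_cost is not None and c.cost_usd > max_cost:
            continue  # cost ceiling cannot be met
        kept.append(c)
    return kept
```

Only the providers that survive this pass would move on to scoring.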
Routing modes
Confidence and canary rules
- Confidence is high when the top executable score is strong and the gap to second place is large.
- Confidence drops to medium or low as the score or the gap shrinks.
- Cerver sets canary_run when the top two executable providers are close, or when the workload is long_running.
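The confidence and canary rules can be sketched with explicit thresholds. The numeric cutoffs below (strong, wide_gap, close_gap) are invented for illustration, since this page does not state the real values. Only the shape of the rules follows the text.

```python
def confidence(top_score: float, second_score: float,
               strong: float = 0.8, wide_gap: float = 0.1) -> str:
    """High when the top score is strong and well ahead; lower as either shrinks."""
    gap = top_score - second_score
    if top_score >= strong and gap >= wide_gap:
        return "high"
    if top_score >= strong or gap >= wide_gap:
        return "medium"
    return "low"


def should_canary(top_score: float, second_score: float, workload: str,
                  close_gap: float = 0.05) -> bool:
    """Canary when the top two are close, or when the workload is long_running."""
    return (top_score - second_score) < close_gap or workload == "long_running"
```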
Session lifecycle and metrics
A gateway session is the canonical record for routing, transcript, events, and metrics, even though execution happens inside a provider sandbox.
Metrics tracked today
What comes back to the agent
The response envelope should be executable by an agent and readable by a human without translation.
{
"request_id": "req_123",
"task": "boot a preview and run a smoke check",
"workload": "preview",
"providers": [
{
"name": "vercel",
"status": "viable",
"score": 0.862,
"estimated_cost_usd": 0.123
}
],
"decision": {
"recommended_provider": "vercel",
"confidence": "high",
"primary_reason": "Vercel Sandbox is the best fit for short preview-style execution right now.",
"secondary_reasons": [
"Supports preview-style execution cleanly",
"Available for live routing through Cerver today"
],
"fallback_order": ["cloudflare"],
"canary_run": false
},
"human_summary": {
"headline": "Route this task through Vercel Sandbox.",
"next_action": "Start a session on Vercel Sandbox and keep Cloudflare available."
}
}
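An agent consuming this envelope mostly needs the order in which to try providers. A minimal sketch, using only the decision fields shown in the example above (provider_order is a hypothetical helper name):

```python
def provider_order(envelope: dict) -> list:
    """Return the recommended provider followed by its fallbacks, without duplicates."""
    decision = envelope["decision"]
    order = [decision["recommended_provider"]]
    for name in decision.get("fallback_order", []):
        if name not in order:
            order.append(name)
    return order
```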
Session metrics response
{
"session_id": "ses_123",
"provider": "vercel",
"status": "idle",
"routing": {
"recommended_provider": "vercel",
"confidence": "high"
},
"metrics": {
"provision_time_ms": 820,
"time_to_first_exec_ms": 1110,
"last_exec_latency_ms": 920,
"average_exec_latency_ms": 980,
"average_stream_open_latency_ms": 350,
"total_exec_count": 3,
"total_stream_count": 1,
"interaction_count": 4,
"session_length_ms": 440000,
"cost_estimate_usd": 0.071,
"uptime_percent": 99.3,
"predicted_startup_ms": 820,
"engagement_score": 0.61,
"engagement_label": "engaged"
}
}
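One way to use these metrics is to compare them against the policy that was sent with the request. The sketch below reads only fields that appear in the example above. review_metrics and the wording of its findings are hypothetical, and what counts as a reason to change policy is left to the caller.

```python
def review_metrics(metrics: dict, policy: dict) -> list:
    """Return human-readable findings where observed metrics break the policy."""
    findings = []
    max_startup = policy.get("max_startup_ms")
    if max_startup is not None and metrics["provision_time_ms"] > max_startup:
        findings.append("provisioning exceeded max_startup_ms")
    max_cost = policy.get("max_cost_usd")
    if max_cost is not None and metrics["cost_estimate_usd"] > max_cost:
        findings.append("cost estimate exceeded max_cost_usd")
    # Compare the observed provisioning time with the gateway's own prediction.
    drift = metrics["provision_time_ms"] - metrics["predicted_startup_ms"]
    if drift > 0:
        findings.append(f"startup ran {drift} ms slower than predicted")
    return findings
```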
Gateway endpoints
POST /gateway/recommend
Ask Cerver to pick a provider
Use when the agent wants a recommendation before opening a session.
POST /gateway/sessions
Open one logical execution session
Use when the agent is ready to start work through the gateway.
POST /gateway/sessions/:id/input
Store chat or task input
Use to append user, assistant, or system messages into the gateway transcript.
POST /gateway/sessions/:id/run
Execute without streaming
Use when the agent wants a JSON result wrapper with duration and metrics.
POST /gateway/sessions/:id/run/stream
Execute and stream output
Use when the agent wants one stream regardless of which provider wins.
GET /gateway/sessions/:id/metrics
Read provider and latency data
Use to explain what happened and decide whether policy should change.
GET /gateway/sessions/:id
Read the session record
Use to inspect routing, transcript, events, and fresh metrics together.
DELETE /gateway/sessions/:id
Terminate the logical session
Use to clean up both the gateway record and the backing sandbox.
POST /gateway/stress-tests
Compare providers under the same workload
Use to benchmark cold starts, streaming, long jobs, and recovery behavior.
When to benchmark instead of guessing
- When a repo is new and Cerver has little history for it.
- When the run is expensive or important enough to justify a canary.
- When providers have changed in speed, health, or cost.
- When the team wants a repeatable comparison report across vendors.
Kinds supported now
The current gateway understands cold_start, stream_response, package_install, preview_launch, and long_session.
Stress-test request
{
"kind": "cold_start",
"workload": "preview",
"providers": ["vercel", "cloudflare", "e2b"],
"sample_size": 12,
"requirements": {
"runtime": "node",
"public_preview": true,
"timeout_minutes": 15
},
"policy": {
"mode": "balanced"
}
}
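A client can reject unsupported kinds before calling the gateway. The sketch below builds the core of the request shown above and validates kind against the five values listed on this page. build_stress_test is a hypothetical helper, and the sample-size and provider checks are assumptions rather than documented gateway rules.

```python
SUPPORTED_KINDS = {
    "cold_start", "stream_response", "package_install",
    "preview_launch", "long_session",
}


def build_stress_test(kind: str, workload: str, providers: list,
                      sample_size: int) -> dict:
    """Build a stress-test request body, failing early on obviously bad input."""
    if kind not in SUPPORTED_KINDS:
        raise ValueError(f"unsupported kind: {kind!r}")
    if not providers:
        raise ValueError("at least one provider is required")
    if sample_size < 1:
        raise ValueError("sample_size must be positive")
    return {
        "kind": kind,
        "workload": workload,
        "providers": list(providers),
        "sample_size": sample_size,
    }
```

The requirements and policy objects from the example can be added to the returned dict before sending.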
Stress-test result
{
"testId": "test_123",
"kind": "cold_start",
"workload": "preview",
"sampleSize": 12,
"winner": "vercel",
"summary": "Vercel Sandbox is the strongest cold start candidate.",
"report": {
"headline": "Vercel Sandbox is the strongest cold start candidate.",
"guidance": "Across 12 simulated runs, Vercel Sandbox shows the best mix of startup time, cost, and availability.",
"next_policy": "Route preview runs to vercel first, then fall back to cloudflare, e2b."
}
}