LLM routing on your terms.

Combine multiple LLMs into one OpenAI-compatible endpoint. You call one model name, and the router sends each request to the LLM that best matches your rules.

Router decision trace
Your app sends one request
POST /api/v1/chat/completions
model: "myapp-pro"
"Refactor this Python class to…"
Compound model: myapp-pro
"Route complex refactors to anthropic/claude-opus-4.6. Use minimax/minimax-m2.7 for everyday implementation and quick edits. Send UI/screenshot tasks to moonshotai/kimi-k2.5. Default to anthropic/claude-sonnet-4.6 for everything else."
Selected
Chosen model
MiniMax M2.5
↳ code task · confidence 0.94
fallback → Kimi K2.5 · then DeepSeek V3.2
OpenAI SDK compatible
Self-hostable on Cloudflare
10 battle-tested presets
AGPL-3.0 open source

Make your own compound model.

You select one model name for your product. Connect providers, write routing rules in plain English, and serve a single OpenAI-compatible endpoint — clients always pass that one model.

1. Connect your providers

Add OpenAI, Anthropic, OpenRouter, or any compatible API. These are the models your compound model selects from.

OpenAI · Anthropic · OpenRouter · Any compatible API

2. Define routing logic in plain English

Describe when to use which model. Strong models for hard tasks, cheap ones for simple work, vision models for images. No DSL — just write what you mean.

“Route complex refactors to the strongest code model. Use fast, cheap models for simple edits and boilerplate. Prefer low latency when confidence is similar.”

3. Expose it as that one model

Clients call the model name you chose. Behind it, the router applies your rules on each request, pins threads when you want consistency, falls back on failure, and records every decision.

const res = await openai.chat.completions.create({
  model: "myapp-pro", // your compound model
  messages,
});
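The snippet above assumes an `openai` client that already points at the router. Here is a minimal sketch of the underlying HTTP request, using the `/api/v1/chat/completions` path shown in the trace. The base URL `https://router.example.com`, the bearer-token header, and the `buildChatRequest` helper are assumptions for illustration, not the product's documented setup.

```typescript
// Hypothetical helper: builds the request a client would send to the router.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

interface RouterRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildChatRequest(
  baseUrl: string, // assumption: wherever your router is hosted
  apiKey: string, // assumption: bearer-token auth, as OpenAI-compatible APIs commonly use
  model: string, // your compound model name, e.g. "myapp-pro"
  messages: ChatMessage[],
): RouterRequest {
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/api/v1/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, messages }),
  };
}

const req = buildChatRequest("https://router.example.com/", "sk-placeholder", "myapp-pro", [
  { role: "user", content: "Refactor this Python class" },
]);
```

You could then send it with `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. The model name is the only thing that identifies your compound model to the router.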

Your rules for auto-routing, with the plumbing included.

You define how models are chosen in plain English and which providers are in play. Fallbacks, thread pinning, circuit breakers, and routing traces are part of the product so automation stays transparent and under your control.

Any Provider, Your Catalog

Connect OpenAI, Anthropic, OpenRouter, or any compatible API. Your compound model selects from exactly the providers you choose.

Plain-English Routing

Describe routing logic in natural language: no DSL, no YAML. Tell the router which models to use and when, and it applies those rules to each request.

10 Presets to Start From

Start with presets like Fast Coding, Speed-First, or Customer Support — then customize the model mix, instructions, and fallbacks to your product.

Smart Thread Pinning

Three modes — re-evaluate every turn, pin for 1–6 turns then re-route, or pin on first message. Use $$route to force re-evaluation mid-thread.
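One way to picture the three pin modes is as a single decision: should this turn be re-routed? The sketch below is a hypothetical model, not the router's actual implementation. The type names, the `shouldReroute` function, and the choice to detect `$$route` anywhere in the user message are all assumptions.

```typescript
// Hypothetical model of the three pin modes; names are illustrative only.
type PinMode =
  | { kind: "every-turn" } // re-evaluate on every message
  | { kind: "pin-for-turns"; turns: number } // pin for 1–6 turns, then re-route
  | { kind: "pin-on-first" }; // pin on the first message for the whole thread

interface ThreadState {
  pinnedModel: string | null; // model currently pinned to this thread, if any
  turnsSincePin: number; // turns already served by the pinned model
}

const FORCE_TOKEN = "$$route";

function shouldReroute(mode: PinMode, thread: ThreadState, userMessage: string): boolean {
  // Assumption: the token may appear anywhere in the message.
  if (userMessage.includes(FORCE_TOKEN)) return true;
  // Nothing pinned yet, so the first message always gets routed.
  if (thread.pinnedModel === null) return true;
  switch (mode.kind) {
    case "every-turn":
      return true;
    case "pin-for-turns":
      return thread.turnsSincePin >= mode.turns;
    case "pin-on-first":
      return false;
  }
}
```

With a thread pinned two turns ago, `pin-for-turns` with `turns: 3` keeps the pinned model, while a message containing `$$route` forces a fresh decision in any mode.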

Failover + Circuit Breaker

If a provider fails, the router falls back to the next model in your chain. The circuit breaker tracks error rate, fallback rate, and latency spikes — unhealthy models are auto-disabled for 30 minutes.
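To make the failover behavior concrete, here is a minimal sketch of a fallback chain guarded by a circuit breaker. It only tracks error rate, not fallback rate or latency spikes. The 50% threshold, the minimum sample count, and the `CircuitBreaker` and `pickModel` names are assumptions; only the 30-minute cooldown comes from the description above.

```typescript
// Hypothetical sketch: fallback chain plus a cooldown-based circuit breaker.
const COOLDOWN_MS = 30 * 60 * 1000; // 30 minutes, as described above
const ERROR_RATE_THRESHOLD = 0.5; // assumption: the real threshold is not stated
const MIN_SAMPLES = 4; // assumption: avoid tripping on a single error

interface ModelHealth {
  successes: number;
  failures: number;
  disabledUntil: number; // epoch ms; 0 means never disabled
}

class CircuitBreaker {
  private health = new Map<string, ModelHealth>();

  private entry(model: string): ModelHealth {
    let h = this.health.get(model);
    if (!h) {
      h = { successes: 0, failures: 0, disabledUntil: 0 };
      this.health.set(model, h);
    }
    return h;
  }

  isAvailable(model: string, now: number): boolean {
    return now >= this.entry(model).disabledUntil;
  }

  record(model: string, ok: boolean, now: number): void {
    const h = this.entry(model);
    if (ok) h.successes++;
    else h.failures++;
    const total = h.successes + h.failures;
    if (total >= MIN_SAMPLES && h.failures / total >= ERROR_RATE_THRESHOLD) {
      h.disabledUntil = now + COOLDOWN_MS; // auto-disable the unhealthy model
      h.successes = 0; // start a fresh window once the cooldown ends
      h.failures = 0;
    }
  }
}

// Pick the first model in the chain that the breaker still considers healthy.
function pickModel(chain: string[], breaker: CircuitBreaker, now: number): string | null {
  return chain.find((m) => breaker.isAvailable(m, now)) ?? null;
}
```

Passing `now` explicitly keeps the sketch deterministic; a real deployment would likely use `Date.now()` and a sliding time window rather than simple counters.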

Full Observability

Every response includes x-router-model-selected, x-router-request-id, and x-router-degraded headers. See which model handled each request and why.
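A short sketch of reading those headers on the client side. The header names come from the text above; the `readRoutingInfo` helper, the `HeaderReader` shape, and the assumption that `x-router-degraded` carries the string `"true"` are illustrative guesses rather than documented behavior.

```typescript
// Minimal shape shared by fetch's Headers and similar objects.
interface HeaderReader {
  get(name: string): string | null;
}

interface RoutingInfo {
  modelSelected: string | null; // which model handled the request
  requestId: string | null; // id to correlate with routing traces
  degraded: boolean; // whether the router reported a degraded response
}

function readRoutingInfo(headers: HeaderReader): RoutingInfo {
  return {
    modelSelected: headers.get("x-router-model-selected"),
    requestId: headers.get("x-router-request-id"),
    // Assumption: the header value is "true" when the response was degraded.
    degraded: headers.get("x-router-degraded")?.toLowerCase() === "true",
  };
}
```

Because `HeaderReader` only requires `get`, you can pass `response.headers` from a `fetch` call directly.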

Start free, scale only when the volume is real.

Every hosted tier includes the full router: auto model routing, thread pinning, circuit breakers, and routing explanations. You can also self-host the same open-source router on your own Cloudflare account.

Free

$0

Evaluate the hosted router with no commitment.

500 requests / day
  • Auto model routing
  • Thread pinning
  • Routing explanations
Most popular

Developer

$7.99 / month

$60 billed yearly

Limited-time promotional pricing.

For individual builders shipping real products with multi-model routing.

5,000 requests / day
  • Everything in Free
  • Monthly or yearly billing
  • Full routing explanations
  • All pin modes
Or choose paid billing now

$7.99 billed monthly

$60 billed yearly

Promo redemptions start the Developer plan immediately and downgrade to Free when the promo ends, unless you upgrade separately.

Save about 37% with yearly billing.

If you need to sign in first, we'll take you straight to Billing with your promo code ready.

Self-hosting

Enterprise

Custom

Custom volume, self-hosted deployment help, and SLA.

Custom request volume
  • Custom daily request limits
  • Assisted self-hosting on your Cloudflare account
  • Priority support with SLA
  • Architecture help for production rollout

All plans use the same router engine. The only differences are daily request volume and support level.

Want full control of your infrastructure? Self-host on your own Cloudflare account →

Prefer to inspect the code first? Browse the open-source repo →