Prove and govern your AI spend — do more per dollar, attested.
They estimate; we settle.
Change one line and route every provider through one fail-open gateway. See your spend in real time, set hard budget caps and a kill-switch, attribute every dollar by key, team, and feature, and keep an auditable, basis-labeled cost ledger. Verified savings — net of quality, the 25% share — stay off until the proof gate ships; we never bill a saving we can’t prove. OpenAI-compatible, your provider keys pass through, fail-open.
Start free in observe-only mode — no card. Control levels up on the paid tiers. Your data in your tenant; fail-open.
OpenAI-compatible · one line
Every gateway caches and routes for free. We found none that prove what you actually saved.
Caching, batching, and routing ship for $0–500/mo across LiteLLM, Portkey, Helicone, Cloudflare AI Gateway, OpenRouter. We use some of them ourselves. What we haven't found anywhere else is proof that the bill dropped without the product getting worse. That proof is the product. And we measure every method (our levers, your own prompt edits, and any token-reduction tool you already run) through one eval gate and one ledger.
- Caching (exact + semantic)
- Model routing / cascading
- Batch API dispatch
- Spend dashboards
- Source-side reducers (Graphify, h5i)
- The cost ledger: every dollar by route and lever, basis-labeled — verified savings off · proof pending.
- The cross-family eval gate: non-inferiority proven per route, on your real traffic, before anything counts.
- Designed to bill on cost-per-successful-output, so quality is inside the unit you pay on.
Change one line. See and govern every dollar; optimize only what we can prove.
Point your base URL at Recovea
We record your baseline
Safe levers ship; risky ones wait
The ledger shows it all
gate must hold at each rung · one-config rollback any time
An auditable, basis-labeled cost ledger — spend you can attribute and defend.
Every dollar attributed by key, team, feature, and lever, with the basis labeled. Verified savings — baseline minus realized, net of quality, adjustments we didn’t cause stripped — are off · proof pending: they turn on only once the eval gate ships at volume, and the 25% share stays off until then. The method spine is IPMVP performance contracting.
gate must hold at each rung · one-config rollback any time
We never gamble your output quality to save money.
Risky levers run in shadow mode on your real traffic, with zero user exposure. They’re promoted only after a per-route non-inferiority test passes against a layered oracle: deterministic checks → golden-dataset gate → cross-family calibrated judge. One-config rollback at any step.
One line today. Optionally go deeper at the source, soon. Every layer proven.
from openai import OpenAIclient = OpenAI(- base_url="https://api.openai.com/v1",+ base_url="https://api.recovea.ai/v1", # your keys pass through · fail-open api_key=OPENAI_API_KEY,)Tier 1: Gateway live today
Tier 2: Recovea Intake coming v1.1
Stated plainly: a gateway sees your request after it’s assembled, so it can compress and trim what’s already in the prompt, but it can’t decide what your agent retrieves. The deepest reductions need a little app-level cooperation: the optional Intake SDK, which we’re building next. Either way, every reduction passes the same shadow-mode non-inferiority gate and lands in the same ledger; the ledger measures the incremental savings from your traffic.
Realistic, not aspirational.
We lead with what's live today — cost visibility and control — and label everything else honestly. Cache/dedup savings are concierge-verified, never a self-serve billed dollar; verified savings and the 25% share are off · proof pending until the eval gate ships at volume.
| Live today (cost visibility + control) | spend, caps, attribution — from day one |
| Cache / dedup savings | concierge-verified, hands-on · not a self-serve billed dollar |
| Verified savings + the 25% share | off · proof pending — on only once the eval gate ships |
| What we don’t claim | headline routing maxima · never a single giant promise |
Other tools’ routing headlines are conversational benchmarks on the routable fraction, not a blended number. We settle yours on your own traffic — you can verify it offline.
Inference dominates your COGS. Get it under control first.
You just raised; your inference bill grows faster than revenue; every optimization feels like a quality gamble you can’t prove. We change one base URL, give you caps, a kill-switch, and per-team attribution this week, and only prove savings once the eval gate can settle them.
Start free. Pay a flat platform fee for control. Verified savings stay off until proven.
The gateway in observe-only mode meters every request against a frozen price list and gives you read-only spend visibility across providers. Nothing about your traffic changes.
One base_url change, BYO-key, fail-open: cross-provider cost visibility, concierge-verified cache/dedup savings, and an auditable, basis-labeled cost ledger. Verified savings and the 25% share: off · proof pending.
For higher volume and teams that need a security review and the white-glove inference teardown. Published as their own tiers as they ship — talk to us in the meantime.
Numbers, not testimonials.
Every figure carries its conditions, and we show what we deliberately don’t count. Named ledger excerpts land here as our first design-partner cases publish — until then, the proof we can offer is your own scan.
Control on day one; cache/dedup savings concierge-verified; verified savings off · proof pending
You get cost visibility and control immediately. Cache and dedup savings (byte-identical, zero quality risk) are concierge- verified hands-on — never a self-serve billed dollar. Eval-gated savings and the 25% share stay off until the proof gate ships at volume; we never bill a saving we can’t show in the ledger.
Paul Graham publicly described a YC startup doing exactly this (cutting customers’ inference costs and splitting the savings) and sized the market at “a quarter of the model companies’ corporate revenue.”
Built for two buyers.
The technical budget-holder (CTO / CIO / platform lead)
Need to show your board? (a second lane)
Spending under ~$15K/mo? We’ll tell you straight: at that scale you want control and visibility, not a savings pitch. Start free in observe-only mode — we’ll tell you the day a paid tier pays for itself.
We sit inline on your traffic. We treat that seriously.
No prompt bodies logged by default. Scoped access + audit log. Fail-open. Self-host in your own VPC and salted per-tenant cache: planned. SOC 2 on the roadmap.
See and govern your AI spend, before the next surprise bill.
Start free in observe-only mode — no card. Control levels up on the paid tiers; verified savings turn on once we can prove them.