- /llm: OpenAI-compatible chat completions across 300+ models through a single Naïve key
- Provider routing & fallbacks: reach Anthropic, OpenAI, Google, Meta, and more with automatic failover, no per-provider accounts
- Streaming built in: SSE chunks as they arrive, with the final chunk carrying usage and exact cost
- Billed in credits: every call is charged at the precise upstream cost, on the same balance as every other primitive
- Drop-in proxy: point the OpenAI SDK at /v1/proxy/openrouter and your existing code works unchanged
- Composes with everything: the same key powers research, orchestration, and the agents that call all of it
Today we're launching /llm — chat completions across 300+ models behind a single Naïve key. Provider routing and automatic fallbacks, streaming over SSE, exact-cost billing in credits, and a drop-in proxy that makes your existing OpenAI-compatible code work by changing one URL. The same key that sends email, issues cards, and deploys apps now reaches every model worth calling.
The problem: model access is a key-management tax
Every agent needs inference. But getting it usually means collecting accounts and keys like trading cards:
- A key per provider. OpenAI, Anthropic, Google, Mistral — each with its own dashboard, billing relationship, and rotation schedule. Multiply that across a fleet of agents and it's a security and ops problem before it's a product.
- No unified accounting. Spend is scattered across provider invoices. Answering "what did this agent cost to run" means reconciling five bills.
- Routing and fallbacks are DIY. When a provider is down or rate-limited, you write the retry-and-failover logic yourself, in every agent.
For a platform where an agent already has email, payments, and a workforce through one key, making inference the one capability that needs five separate accounts is backwards. Until now.
How /llm works
/llm is a full wrapper over an OpenAI-compatible chat completions API with provider routing built in. You pick a model by slug; Naïve routes the call, applies fallbacks, and bills the exact upstream cost against your credit balance. There are no provider keys in your code — only your nv_sk_* key.
const res = await naive.llm.chat({
  model: "anthropic/claude-sonnet-4.6",
  messages: [{ role: "user", content: "Summarize this support thread in 3 bullets." }],
});
console.log(res.choices[0].message.content);
console.log(res.usage); // tokens + exact cost in credits
Discover models from the live catalog:
const { models, count } = await naive.llm.models("claude");
Streaming, with cost on the final chunk
For anything user-facing, stream. naive.llm.stream(...) yields OpenAI-compatible chunks as they arrive, and the final chunk carries the usage object — including the precise cost of the call.
const messages = [{ role: "user", content: "Write a launch tweet for our new API." }];
for await (const chunk of naive.llm.stream({ model: "openai/gpt-5.2", messages })) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
Because every call reports its own cost, you can attribute spend per request (per agent, per Employee, per end-user) instead of reconciling a provider invoice at the end of the month.
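To act on that cost you have to hold on to the final chunk instead of only printing deltas. The sketch below is one way to do that, not the SDK's own helper: it drains any async iterable of OpenAI-compatible chunks, concatenates the text, and keeps the last usage object it sees. The `StreamChunk` type, the `collectStream` function, the `fakeStream` stand-in, and the `usage.cost` field name are all assumptions made for illustration; check the /llm docs for the real usage shape.

```typescript
// Minimal chunk shape, assuming OpenAI-compatible streaming fields.
// `usage.cost` is an ASSUMED field name, not confirmed by the post.
interface StreamChunk {
  choices: { delta?: { content?: string } }[];
  usage?: { prompt_tokens?: number; completion_tokens?: number; cost?: number };
}

interface StreamResult {
  text: string;
  usage?: StreamChunk["usage"];
}

// Drain a stream: concatenate delta text and remember the last usage
// object seen, which the post says arrives on the final chunk.
async function collectStream(chunks: AsyncIterable<StreamChunk>): Promise<StreamResult> {
  let text = "";
  let usage: StreamChunk["usage"];
  for await (const chunk of chunks) {
    text += chunk.choices[0]?.delta?.content ?? "";
    if (chunk.usage) usage = chunk.usage;
  }
  return { text, usage };
}

// Hypothetical stand-in for naive.llm.stream(...) so the sketch runs offline.
async function* fakeStream(): AsyncGenerator<StreamChunk> {
  yield { choices: [{ delta: { content: "Hello" } }] };
  yield { choices: [{ delta: { content: ", world" } }] };
  yield { choices: [], usage: { prompt_tokens: 5, completion_tokens: 3, cost: 0.002 } };
}
```

In real code you would pass `naive.llm.stream({ model, messages })` to `collectStream` in place of `fakeStream()`, then log or store `result.usage` next to whatever identifier you attribute spend to.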
The drop-in OpenRouter proxy
Already have code written against the OpenAI SDK? Change the base URL and your Naïve key becomes your model key. Nothing else moves.
import OpenAI from "openai";
const openai = new OpenAI({
  baseURL: "https://api.usenaive.ai/v1/proxy/openrouter",
  apiKey: process.env.NAIVE_API_KEY,
});
const res = await openai.chat.completions.create({
  model: "anthropic/claude-sonnet-4.6",
  messages: [{ role: "user", content: "Hello" }],
});
The proxy speaks the same wire format, so existing OpenAI/OpenRouter clients, frameworks, and tools work unchanged, now on unified Naïve billing and governance.
Call it from the CLI or raw REST
curl -X POST https://api.usenaive.ai/v1/llm/chat/completions \
  -H "Authorization: Bearer $NAIVE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.6",
    "messages": [{ "role": "user", "content": "Give me three product names." }]
  }'
The endpoint is OpenAI-compatible, so any client that can hit /chat/completions can hit /llm.
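For a TypeScript codebase that does not use the SDK, the same request can be assembled for `fetch`. This is a minimal sketch mirroring the curl call above: the URL, headers, and body fields come from that example, while the `buildChatRequest` helper and its types are hypothetical names introduced here. Building the request separately from sending it keeps the shape easy to inspect and test without touching the network.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Endpoint taken from the curl example above.
const CHAT_URL = "https://api.usenaive.ai/v1/llm/chat/completions";

// Hypothetical helper: returns the URL and fetch options without sending anything.
function buildChatRequest(apiKey: string, model: string, messages: ChatMessage[]): ChatRequest {
  return {
    url: CHAT_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Usage (makes a real, billed network call, so it is left commented out):
// const { url, init } = buildChatRequest(process.env.NAIVE_API_KEY!, "anthropic/claude-sonnet-4.6", [
//   { role: "user", content: "Give me three product names." },
// ]);
// const res = await fetch(url, init);
// const data = await res.json();
```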
What you can build with /llm
Power your agents on the same key as everything else — The Employees that send email, run research, and deploy apps already authenticate with a Naïve key. /llm means their inference runs through it too — one balance, one audit trail, one thing to rotate.
Route by task, fail over automatically — Use a fast cheap model for triage and a frontier model for the hard step, all by changing one slug. Provider routing handles outages and rate limits so you don't write failover logic in every loop.
Attribute model spend per customer — In a multi-tenant build, scope /llm to a tenant user and read exact cost per call to meter and bill inference downstream alongside /billing.
Migrate existing code in one line — Point your current OpenAI-based app at the proxy and inherit unified billing, logging, and 300+ models without a rewrite.
Compose with research and generation — Pair /llm with /research for grounded answers and /image for assets — text, web, and media generation all behind one key.
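As a rough illustration of the per-customer attribution use case above, the sketch below keeps a running cost total per tenant in memory. It assumes you have already read a numeric cost off each call's usage object; the exact field name and unit are not specified in this post, and `SpendLedger` is a hypothetical helper, not part of the Naïve SDK. A production build would persist these totals instead of holding them in a Map.

```typescript
// Hypothetical in-memory ledger: tenant id -> accumulated cost in credits.
class SpendLedger {
  private totals = new Map<string, number>();

  // Add the cost of one call to a tenant's running total.
  record(tenantId: string, cost: number): void {
    if (!Number.isFinite(cost) || cost < 0) {
      throw new RangeError(`invalid cost for ${tenantId}: ${cost}`);
    }
    this.totals.set(tenantId, (this.totals.get(tenantId) ?? 0) + cost);
  }

  // Total recorded for a tenant, or 0 if nothing has been recorded.
  total(tenantId: string): number {
    return this.totals.get(tenantId) ?? 0;
  }

  // Snapshot of all totals, e.g. to hand to a billing job.
  entries(): [string, number][] {
    return [...this.totals.entries()];
  }
}

// Usage: after each call, record the reported cost under the calling tenant.
const ledger = new SpendLedger();
ledger.record("tenant-a", 0.25);
ledger.record("tenant-a", 0.5);
ledger.record("tenant-b", 1);
```

Keying the ledger by agent or end-user id instead of tenant id gives the same breakdown at a finer grain.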
Get started
Drop this starter prompt into any coding agent to wire up Naïve:
Read https://usenaive.ai/skill.md and use it to set up Naïve in my project.
- Read the docs: usenaive.ai/docs/getting-started/llm
- Proxy reference: usenaive.ai/docs/api-reference/overview
- Quickstart: usenaive.ai/docs/getting-started/quickstart
- Join the community on Discord