The llm primitive is a full wrapper over OpenRouter. It gives your agent a single, OpenAI-compatible chat-completions endpoint that routes to 300+ models across Anthropic, OpenAI, Google, Meta, Mistral, and more — with provider routing, fallbacks, and streaming. You don’t manage an OpenRouter account or key: Naive holds the key and bills each call in Naive credits based on the exact cost OpenRouter returns.
There are two ways to use it:
- The typed primitive: naive.llm.chat() / naive.llm.stream() / naive.llm.models() in the SDK (plus CLI, MCP, and the agent toolset).
- The drop-in proxy: point any OpenAI or OpenRouter client’s baseURL at Naive and keep your existing code.
CLI First
# Run a completion
naive llm chat -m anthropic/claude-sonnet-4.6 "Write a haiku about Paris"
# With a system prompt and a fallback model
naive llm chat -m openai/gpt-5.2 --system "You are terse." "Summarize REST in one line" --fallback anthropic/claude-sonnet-4.6
# Browse models (free)
naive llm models claude
Endpoints
| Endpoint | Type | Description | Cost |
|---|---|---|---|
| POST /v1/llm/chat/completions | Sync or streaming | OpenAI/OpenRouter-compatible chat completion | Per-token (see Credits) |
| GET /v1/llm/models | Sync | List routable models (optionally filtered) | Free |
| GET /v1/llm/generation?id= | Sync | Usage/cost stats for a prior completion | Free |
The request and response bodies follow OpenRouter’s schema (which is in turn OpenAI-compatible). Naive forwards them through and adds credits_used and credits_remaining to the response. See OpenRouter’s API reference for the full schema.
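As a minimal sketch, the routes in the table can be described with small request builders. The base URL and Bearer header come from the curl example on this page; the helper names (`chatCompletionRequest`, `generationRequest`) and the `RequestSpec` type are hypothetical, and the sketch assumes GET routes accept the same Authorization header.

```typescript
// Base URL and Bearer auth are taken from the curl example on this page.
const NAIVE_BASE_URL = "https://api.usenaive.ai";

// Hypothetical shape: a plain description of an HTTP call, not an SDK type.
interface RequestSpec {
  method: "GET" | "POST";
  url: string;
  headers: Record<string, string>;
  body?: string;
}

// Hypothetical helper: describes a POST /v1/llm/chat/completions call.
function chatCompletionRequest(apiKey: string, body: object): RequestSpec {
  return {
    method: "POST",
    url: `${NAIVE_BASE_URL}/v1/llm/chat/completions`,
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  };
}

// Hypothetical helper: describes a GET /v1/llm/generation?id=... call.
// Assumption: the same Bearer header is accepted on GET routes.
function generationRequest(apiKey: string, generationId: string): RequestSpec {
  return {
    method: "GET",
    url: `${NAIVE_BASE_URL}/v1/llm/generation?id=${encodeURIComponent(generationId)}`,
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}
```

A spec built this way can be passed to any HTTP client, for example `fetch(spec.url, { method: spec.method, headers: spec.headers, body: spec.body })`.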
Chat completions
curl -X POST https://api.usenaive.ai/v1/llm/chat/completions \
  -H "Authorization: Bearer nv_sk_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.6",
    "messages": [{ "role": "user", "content": "Summarize our Q3 strategy in 3 bullets." }]
  }'
Response (OpenAI-shaped, plus credits_used and credits_remaining):
{
  "id": "gen-xxxxxxxx",
  "model": "anthropic/claude-sonnet-4.6",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": { "role": "assistant", "content": "1. ...\n2. ...\n3. ..." }
    }
  ],
  "usage": { "prompt_tokens": 18, "completion_tokens": 42, "total_tokens": 60, "cost": 0.00021 },
  "credits_used": 0.0005,
  "credits_remaining": 99.9
}
Key parameters
| Param | Type | Description |
|---|---|---|
| model | string | Model id with provider prefix, e.g. anthropic/claude-sonnet-4.6, openai/gpt-5.2. |
| messages | array | OpenAI-style chat messages. (Either messages or prompt is required.) |
| models | string[] | Optional fallback chain; OpenRouter tries them in order if earlier ones are unavailable. |
| provider | object | OpenRouter provider routing preferences (order, only, ignore, sort, allow_fallbacks, data_collection, …). |
| stream | boolean | Stream the response as SSE. |
| temperature, top_p, max_tokens, tools, response_format, … | varies | Forwarded as-is to OpenRouter. |
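The parameters above can be captured in a request type with a client-side pre-check, sketched below. The names `ChatRequest`, `ChatMessage`, and `checkChatRequest` are hypothetical and not part of the SDK. The check only mirrors the "messages or prompt is required" rule from the table; the server's actual validation may be stricter.

```typescript
// Hypothetical types mirroring the parameter table; not the SDK's own types.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

interface ChatRequest {
  model?: string;
  messages?: ChatMessage[];
  prompt?: string;
  models?: string[]; // optional fallback chain
  provider?: Record<string, unknown>; // OpenRouter routing preferences
  stream?: boolean;
  [other: string]: unknown; // temperature, top_p, max_tokens, ... are forwarded as-is
}

// Returns an error string if neither messages nor prompt is present, else null.
// Assumption: an empty messages array or empty prompt counts as missing.
function checkChatRequest(req: ChatRequest): string | null {
  const hasMessages = Array.isArray(req.messages) && req.messages.length > 0;
  const hasPrompt = typeof req.prompt === "string" && req.prompt.length > 0;
  if (!hasMessages && !hasPrompt) {
    return "invalid_input: provide messages or prompt";
  }
  return null;
}
```

Running the check before sending avoids a round trip for the invalid_input case.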
Provider routing & fallbacks
Because the body is OpenRouter’s, you get its routing controls for free:
await naive.llm.chat({
  model: "anthropic/claude-sonnet-4.6",
  models: ["anthropic/claude-sonnet-4.6", "openai/gpt-5.2"], // fallback chain
  provider: { sort: "throughput", data_collection: "deny" },
  messages: [{ role: "user", content: "Hello" }],
});
Streaming
for await (const chunk of naive.llm.stream({
  model: "openai/gpt-5.2",
  messages: [{ role: "user", content: "Write a haiku about Paris." }],
})) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
Streams are delivered as Server-Sent Events (SSE). The final chunk carries the usage object (including cost), and Naive bills the call after the stream closes.
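If you read the raw endpoint without the SDK, the SSE body has to be parsed by hand. The sketch below assumes the usual OpenAI-style framing (`data: {json}` lines, a `data: [DONE]` sentinel, and `:` comment lines) and works on a fully buffered string for simplicity; a real client would also handle lines split across network chunks. `collectSse` and its types are hypothetical names.

```typescript
// Hypothetical types covering only the fields this sketch reads.
interface StreamUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  cost?: number;
}

interface StreamChunk {
  choices?: { delta?: { content?: string | null } }[];
  usage?: StreamUsage;
}

// Concatenates delta content and keeps the last usage object seen.
function collectSse(raw: string): { text: string; usage: StreamUsage | null } {
  let text = "";
  let usage: StreamUsage | null = null;
  for (const line of raw.split(/\r?\n/)) {
    // Skip blank event separators and ":" comment lines.
    if (!line.startsWith("data:")) continue;
    const payload = line.slice("data:".length).trim();
    // "[DONE]" marks the end of the stream and is not JSON.
    if (payload === "" || payload === "[DONE]") continue;
    const chunk = JSON.parse(payload) as StreamChunk;
    text += chunk.choices?.[0]?.delta?.content ?? "";
    if (chunk.usage) usage = chunk.usage;
  }
  return { text, usage };
}
```

The returned usage, when present, is the object that carries cost for the completed stream.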
Use Naive instead of OpenRouter (drop-in proxy)
If you already use the OpenAI or OpenRouter SDK, you don’t need to change your code — just change the baseURL and key. Naive injects the OpenRouter key server-side and bills your credits.
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.NAIVE_API_KEY, // nv_sk_...
  baseURL: "https://api.usenaive.ai/v1/proxy/openrouter", // was https://openrouter.ai/api/v1
});

const r = await client.chat.completions.create({
  model: "anthropic/claude-sonnet-4.6",
  messages: [{ role: "user", content: "Hello" }],
});
The proxy is a transparent passthrough: every path under /v1/proxy/openrouter/* maps to https://openrouter.ai/api/v1/* (so chat/completions, models, generation, etc. all work). It is authenticated by your Naive API key and is not gated by Account Kits; use the typed /v1/llm routes when you want per-tenant Account Kit enforcement.
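The path mapping can be illustrated with a small pure function. This is only a sketch of the rule described above, not Naive's server code, and `upstreamUrl` is a hypothetical name.

```typescript
const PROXY_PREFIX = "/v1/proxy/openrouter";
const OPENROUTER_BASE = "https://openrouter.ai/api/v1";

// Maps a Naive proxy path (with optional query string) to the OpenRouter URL
// it would be forwarded to. Returns null for paths outside the proxy prefix.
function upstreamUrl(naivePath: string): string | null {
  const underPrefix =
    naivePath === PROXY_PREFIX || naivePath.startsWith(PROXY_PREFIX + "/");
  if (!underPrefix) return null;
  return OPENROUTER_BASE + naivePath.slice(PROXY_PREFIX.length);
}
```

For example, `/v1/proxy/openrouter/chat/completions` maps to `https://openrouter.ai/api/v1/chat/completions`, while a typed route such as `/v1/llm/models` is not part of the proxy and yields `null`.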
Multi-tenant
Like other primitives, the typed routes are gated by Account Kits and scoped per user:
const alice = await naive.users.create({ external_id: "alice" });
const res = await naive.forUser(alice.id).llm.chat({
  model: "openai/gpt-5.2",
  messages: [{ role: "user", content: "Draft a welcome email." }],
});
Toggle the llm primitive on/off per Account Kit in the dashboard (Account Kits → Primitives → Generation), or via primitives_config.llm.enabled. See Account Kits.
Billing
Naive takes the exact cost OpenRouter reports for each request (usage.cost, in USD), applies a small markup, and converts the result to credits ($0.50 = 1 credit). There’s no per-model rate table to keep in sync; token-heavy models simply cost more. Costs are charged after the response completes (after the final chunk, for streams). Listing models is free. See Credits.
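The conversion can be sketched as a one-line formula. The markup value is not stated on this page, so it is a required parameter here; `estimateCredits` is a hypothetical helper for rough budgeting, and the credits_used field in the response remains the authoritative figure.

```typescript
// From this page: $0.50 = 1 credit.
const USD_PER_CREDIT = 0.5;

// Estimates credits for one call: (cost in USD * markup) / USD per credit.
// `markup` is a multiplier such as 1.0 (no markup); the real value is an
// assumption the caller must supply.
function estimateCredits(costUsd: number, markup: number): number {
  if (costUsd < 0 || markup <= 0) {
    throw new RangeError("costUsd must be >= 0 and markup must be > 0");
  }
  return (costUsd * markup) / USD_PER_CREDIT;
}

// With no markup, a usage.cost of 0.00021 USD is about 0.00042 credits.
const baseline = estimateCredits(0.00021, 1.0);
```

Because billing happens after the response completes, an estimate like this is only useful before a call if you already have a rough idea of the expected usage.cost.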
The llm primitive is part of agentTools(): the model can route its own sub-calls with naive_run_primitive(primitive: "llm", method: "chat", arguments: { model, messages }), or list models with method: "models".
Error Handling
| Error | Cause | Recovery |
|---|---|---|
| insufficient_credits | Not enough credits | Top up (see Credits) |
| not_configured | OpenRouter key not set on the deployment | Operator must set OPENROUTER_API_KEY |
| provider_error | OpenRouter/upstream model error | Inspect the message; retry or try another model |
| invalid_input | Missing messages/prompt | Provide one |
| forbidden | llm disabled by the Account Kit | Enable it in the kit |
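One way to act on these codes is a lookup from error code to recovery step, sketched below. How the code is surfaced (for example, a field on a thrown SDK error or in the JSON error body) is not shown on this page, so the sketch takes the code as a plain string; `recoveryFor`, `isRetryable`, and the action names are hypothetical.

```typescript
type LlmErrorCode =
  | "insufficient_credits"
  | "not_configured"
  | "provider_error"
  | "invalid_input"
  | "forbidden";

type Recovery =
  | "top_up_credits"
  | "set_openrouter_key"
  | "retry_or_switch_model"
  | "fix_request"
  | "enable_in_account_kit"
  | "unknown";

// Mirrors the Recovery column of the table above.
const RECOVERY: Record<LlmErrorCode, Recovery> = {
  insufficient_credits: "top_up_credits",
  not_configured: "set_openrouter_key",
  provider_error: "retry_or_switch_model",
  invalid_input: "fix_request",
  forbidden: "enable_in_account_kit",
};

// Returns the suggested recovery, or "unknown" for codes not in the table.
function recoveryFor(code: string): Recovery {
  return Object.prototype.hasOwnProperty.call(RECOVERY, code)
    ? RECOVERY[code as LlmErrorCode]
    : "unknown";
}

// Only upstream errors are worth retrying automatically; the other codes
// need a configuration, billing, or request change first.
function isRetryable(code: string): boolean {
  return code === "provider_error";
}
```

A caller might retry once on `provider_error` and then fall back to a different model, while surfacing every other code to the operator or end user.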