Introducing /browser: Let your agent operate any website like a human

Spin up a managed Chrome session with stored credentials, watch the browser as it runs, and let your AI agent operate any website — including the legacy SaaS your APIs can't reach.

Dennis Zax · Published April 18, 2026 · 4 min read · /browser
TL;DR
  • /browser: managed Chrome DevTools Protocol sessions for any AI Employee in one call
  • Watch the browser as it runs: live view streamed back to the operator dashboard for debugging
  • Persistent credentials: pair with /credentials so logins persist across sessions, with no re-auth on every run
  • 2FA routing built in: pair with /phone so SMS codes route only to the requesting Employee
  • Playwright-compatible: bring existing Playwright scripts and they run unchanged
  • Composes with /credentials, /integrations, /phone: the answer when an API doesn't exist

Today we're launching /browser — the primitive that lets an AI Employee operate any website. Managed Chrome DevTools Protocol sessions, persistent credentials, 2FA routing, and a live-view stream so you can watch the agent work. The answer for every workflow that doesn't have an API.

The problem: most real work happens in browsers without APIs

The dirty secret of agent infrastructure: APIs cover maybe 30% of the SaaS surface an autonomous business actually needs. The remaining 70% — vendor portals, government filing sites, legacy admin panels, internal SaaS tools, captcha-gated workflows — only work in a browser.

The status quo:

  • Raw Playwright or Puppeteer in your own infrastructure — works for hobby projects; doesn't scale, doesn't handle 2FA, breaks under any anti-bot pressure.
  • Browserbase or Steel.dev for managed sessions — solves the infra problem; you still build the credential layer, the 2FA routing, the audit log, and the live view.
  • Anthropic Computer Use — strong for visual workflows; doesn't compose with the rest of the runtime, no per-Employee scoping, no credential vault.

Until now. With /browser, sessions bind to an Employee, credentials persist via /credentials, 2FA codes route via /phone, and the live view is the operator's debug surface.

How /browser works

The workflow is three steps:

  1. Start: ensure_browser_session returns a CDP WebSocket URL. The session is bound to the calling Employee.
  2. Drive: connect Playwright, Puppeteer, or any CDP client. Run your automation. Persistent credentials and 2FA routing happen at the runtime layer.
  3. Stop: stop_browser_session terminates the session explicitly, or the session times out per its TTL.
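
The three steps above can be sketched as a small wrapper that always runs the stop step, even when the automation throws. This is a minimal sketch, not the official client: the endpoint paths and the `id` and `cdpUrl` response fields are taken from the code sample later in this post, while the helper names (`startRequest`, `stopRequest`, `withBrowserSession`) are hypothetical.

```typescript
// Sketch only: endpoint paths mirror the code sample in this post; the
// helper names below are hypothetical and not part of any published SDK.
const API_BASE = "https://api.usenaive.ai/v1";

interface HttpRequest {
  url: string;
  method: "POST" | "DELETE";
  headers: Record<string, string>;
  body?: string;
}

interface BrowserSession {
  id: string;
  cdpUrl: string;
}

// Pure builders, so the request shapes can be inspected without a network call.
function startRequest(apiKey: string, employeeId: string): HttpRequest {
  return {
    url: `${API_BASE}/browser/sessions`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ employeeId }),
  };
}

function stopRequest(apiKey: string, sessionId: string): HttpRequest {
  return {
    url: `${API_BASE}/browser/sessions/${encodeURIComponent(sessionId)}`,
    method: "DELETE",
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}

// Send a request and fail loudly on a non-2xx status.
async function send(req: HttpRequest) {
  const res = await fetch(req.url, {
    method: req.method,
    headers: req.headers,
    body: req.body,
  });
  if (!res.ok) throw new Error(`${req.method} ${req.url} failed: ${res.status}`);
  return res;
}

// Start, drive, stop: the finally block stops the session even if `drive` throws.
async function withBrowserSession<T>(
  apiKey: string,
  employeeId: string,
  drive: (session: BrowserSession) => Promise<T>,
): Promise<T> {
  const res = await send(startRequest(apiKey, employeeId));
  const session = (await res.json()) as BrowserSession;
  try {
    return await drive(session);
  } finally {
    await send(stopRequest(apiKey, session.id));
  }
}
```

Relying on the TTL alone would also clean up eventually, but an explicit stop in `finally` keeps a failed run from holding a session open until the timeout.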

Sessions are real Chrome — full DevTools Protocol surface, accurate fingerprints, residential-quality network paths.

Two ways to drive a session: prompt or code

1. Natural language

naive browser run \
  --employee emp_01HXY... \
  --prompt "Log into our Mercury account and download last month's bank statement"

The CLI starts a session, uses /credentials to pull the stored Mercury login, navigates to the statements page, downloads the statement, and stops the session. The result is a Document object on the Company.

2. Code

import { chromium } from "playwright";

const session = await fetch("https://api.usenaive.ai/v1/browser/sessions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.NAIVE_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ employeeId: "emp_01HXY..." }),
}).then((r) => r.json());

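const browser = await chromium.connectOverCDP(session.cdpUrl);
// Reuse the session's default context and tab; open a new tab if none exists.
const context = browser.contexts()[0];
const page = context.pages()[0] ?? (await context.newPage());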

await page.goto("https://mercury.com/login");

const creds = await fetch(
  "https://api.usenaive.ai/v1/credentials/mercury_login",
  { headers: { Authorization: `Bearer ${process.env.NAIVE_API_KEY}` } },
).then((r) => r.json());

await page.fill("#email", creds.email);
await page.fill("#password", creds.password);
await page.click("button[type=submit]");

// 2FA — poll for the code routed via /phone
const { code } = await fetch(
  "https://api.usenaive.ai/v1/phone/sms-code?employeeId=emp_01HXY...",
  { headers: { Authorization: `Bearer ${process.env.NAIVE_API_KEY}` } },
).then((r) => r.json());
await page.fill("#otp", code);

// ...do the work...

await fetch(`https://api.usenaive.ai/v1/browser/sessions/${session.id}`, {
  method: "DELETE",
  headers: { Authorization: `Bearer ${process.env.NAIVE_API_KEY}` },
});
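
The 2FA step in the sample above makes a single request, but an SMS can take a few seconds to arrive. A minimal polling sketch with capped exponential backoff follows. `fetchCode` is a hypothetical caller-supplied function that wraps the `/v1/phone/sms-code` request and resolves to `null` until a code is available. How that endpoint signals "not yet" is not documented here, so that mapping is an assumption, as are the default intervals.

```typescript
// Sketch only: `fetchCode` is a caller-supplied (hypothetical) function that
// resolves to the SMS code, or null while no code has arrived yet.
type FetchCode = () => Promise<string | null>;

// Exponential backoff capped at maxMs: 500, 1000, 2000, 4000, 5000, 5000, ...
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 5000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll until a code arrives, or give up once the next wait would pass the deadline.
async function waitForSmsCode(
  fetchCode: FetchCode,
  timeoutMs = 60_000,
): Promise<string> {
  const deadline = Date.now() + timeoutMs;
  for (let attempt = 0; ; attempt++) {
    const code = await fetchCode();
    if (code) return code;
    const delay = backoffDelayMs(attempt);
    if (Date.now() + delay > deadline) {
      throw new Error(`No 2FA code received within ${timeoutMs} ms`);
    }
    await sleep(delay);
  }
}
```

With this helper, the OTP field would be filled with the value returned by `waitForSmsCode` instead of the result of a one-shot request.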

Watch the browser as it runs

Every session streams a live view back to the operator dashboard. This is the difference between agent automation that works and agent automation you can debug. When the agent gets stuck on a captcha, when a UI changes shape, when a flow stalls — you see it happen in real time.

# Start a session with live view (enabled by default)
naive browser session start --employee emp_01HXY...

# Output includes a liveViewUrl — paste into a browser, watch the agent work

The live view link is signed and time-limited; only the operator who owns the Company can view it. For teams, invite-only links are scoped per session.
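
To make "signed and time-limited" concrete, here is a generic sketch of how such a link can be built and checked with an HMAC. It is an illustration of the general technique only: the post does not describe Naïve's actual signing scheme, so the query parameter names (`session`, `expires`, `sig`), the function names, and the example host are all assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sketch only: parameter names and the signing scheme are assumptions,
// not Naïve's documented link format.
function sign(secret: string, sessionId: string, expires: number): string {
  return createHmac("sha256", secret)
    .update(`${sessionId}.${expires}`)
    .digest("hex");
}

// Build a link that stops verifying once `ttlSeconds` have passed.
function signLiveViewUrl(
  base: string,
  sessionId: string,
  secret: string,
  nowSeconds: number,
  ttlSeconds: number,
): string {
  const expires = nowSeconds + ttlSeconds;
  const url = new URL(base);
  url.searchParams.set("session", sessionId);
  url.searchParams.set("expires", String(expires));
  url.searchParams.set("sig", sign(secret, sessionId, expires));
  return url.toString();
}

// Reject links that are expired, malformed, or whose signature does not match.
function verifyLiveViewUrl(
  link: string,
  secret: string,
  nowSeconds: number,
): boolean {
  const url = new URL(link);
  const sessionId = url.searchParams.get("session");
  const expires = Number(url.searchParams.get("expires"));
  const sig = url.searchParams.get("sig");
  if (!sessionId || !sig || !Number.isFinite(expires)) return false;
  if (nowSeconds > expires) return false;
  const expected = Buffer.from(sign(secret, sessionId, expires), "hex");
  const given = Buffer.from(sig, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

Because the session ID and expiry are both covered by the signature, changing either one in the URL invalidates the link.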

Persistent credentials, scoped per Employee

Logins are the slowest part of any browser workflow. Re-authing on every run wastes time and ages the account fingerprint. /browser composes with /credentials so cookies, localStorage, and session tokens persist between runs:

# First run: sign in, then snapshot the session
naive browser session start --employee emp_01HXY...
# ...sign in via Playwright or prompt...
naive credentials snapshot --session sess_... --ref stripe_session

# Subsequent runs: restore from the snapshot
naive browser session start --employee emp_01HXY... --restore-from stripe_session
# Already signed in.

Persistent credentials are encrypted at rest and decrypted only at session start. Audit logs record every restore.
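
A restore only skips the login if the snapshot still holds live cookies for the target site. The sketch below checks that before a run. It assumes the snapshot has the same shape as the output of Playwright's `storageState()` (cookies plus per-origin localStorage); whether a /credentials snapshot uses that layout is an assumption, and the function names are hypothetical.

```typescript
// Sketch only: this shape follows Playwright's storageState() output; whether
// a /credentials snapshot uses the same layout is an assumption.
interface StoredCookie {
  name: string;
  value: string;
  domain: string;
  path: string;
  expires: number; // Unix seconds; -1 marks a session cookie
}

interface StorageSnapshot {
  cookies: StoredCookie[];
  origins: {
    origin: string;
    localStorage: { name: string; value: string }[];
  }[];
}

// Simplified domain match: treats every cookie as a domain cookie, so
// ".stripe.com" and "stripe.com" both match "dashboard.stripe.com".
function cookieMatchesHost(cookieDomain: string, host: string): boolean {
  const bare = cookieDomain.replace(/^\./, "");
  return host === bare || host.endsWith(`.${bare}`);
}

// Cookies for `host` that have not expired at `nowSeconds`.
function liveCookies(
  snapshot: StorageSnapshot,
  host: string,
  nowSeconds: number,
): StoredCookie[] {
  return snapshot.cookies.filter(
    (c) =>
      cookieMatchesHost(c.domain, host) &&
      (c.expires === -1 || c.expires > nowSeconds),
  );
}

// A restore is only worth attempting if at least one cookie for the host survives.
function worthRestoring(
  snapshot: StorageSnapshot,
  host: string,
  nowSeconds: number,
): boolean {
  return liveCookies(snapshot, host, nowSeconds).length > 0;
}
```

When the check fails, the run would fall back to a full sign-in and take a new snapshot afterwards.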

What you can build with /browser

Operate vendor portals that don't have APIs — Cloud provider billing pages, registrar admin panels, Mercury / Brex statements, government filing sites. Compose /browser with /credentials and /phone for full account-opening flows.

Run end-to-end onboarding for new Employees — Sign up for Stripe, Mercury, Google Workspace, Slack — every flow that requires SMS verification, captcha solving, or a few minutes of clicking. The onboarding Employee handles it.

Build agent-native scrapers for sites that block bots — Cloudflare-fronted sites, JS-heavy SPAs, captcha-gated forums. The session pool routes around generic anti-bot fingerprints; the runtime tells you when a target is detecting automation.

Audit your own SaaS surface — Have an Employee log into every tool you use, snapshot pricing, billing, and seat counts, and report any drift. Compose with /email so the snapshot lands in your inbox.

Sandbox high-risk experiments visually — When a new agent loop is unproven, run it in a /browser session with the live view open. Watch what it does before letting it run unattended.

Frequently Asked Questions
What is /browser?
/browser is the Naïve primitive that gives an AI Employee a real, fully-featured Chrome browser session. It supports Playwright, Puppeteer, and CDP-compatible tooling, persists credentials across sessions via /credentials, routes 2FA codes correctly via /phone, and streams a live view back to the operator dashboard for debugging.
How does /browser work?
Call ensure_browser_session and Naïve provisions a Chrome instance bound to the Employee. Connect via the returned WebSocket URL with Playwright or any CDP client. The session persists for the configured TTL; stop_browser_session terminates it explicitly. Every action is logged.
When should I use /browser vs /integrations?
Use /integrations when an API exists for what you need (Slack, GitHub, Webflow, Shopify). Use /browser when no API exists — vendor portals, government filings, legacy SaaS, captcha-gated workflows. /browser is the universal fallback; /integrations is the safer, cheaper, faster path when both exist.
How does /browser handle anti-bot detection?
Sessions use real residential-quality fingerprints, real-Chrome user-agents, and human-like cadences. For high-friction targets (Cloudflare-protected sites, captcha walls), Naïve uses Browserbase as the underlying session provider, which maintains fingerprint pools that avoid most generic anti-bot signals. Targets that detect any automation will still detect /browser — there's no magic.
How much does /browser cost?
Sessions are billed per-hour of active time. Live-view streaming is included. See the pricing page for current rates.
What's the difference between /browser and Anthropic Computer Use, Browserbase, or Playwright direct?
Anthropic Computer Use is a model-side interface for visual browser control. Browserbase is a managed CDP session provider. Playwright is a client library. /browser composes a managed CDP session (via Browserbase under the hood) with credential persistence (/credentials), 2FA routing (/phone), and Employee identity binding — so the agent can complete real workflows, not just navigate to a URL.
How do I get started with /browser?
Run naive browser session start --employee emp_01HXY... and Naïve returns a WebSocket URL you can connect to with Playwright. The full quickstart is at usenaive.ai/docs/getting-started/quickstart.
Dennis Zax, CTO

CTO of Naïve. Building the open-source agent runtime.

@denniszax