Introducing /browser: Let your agent operate any website like a human

TL;DR

›/browser — managed Chrome DevTools Protocol sessions for any AI Employee in one call
›Watch the browser as it runs — live view streamed back to the operator dashboard for debugging
›Persistent credentials — pair with /credentials so logins persist across sessions, no re-auth on every run
›2FA routing built in — pair with /phone so SMS codes route only to the requesting Employee
›Playwright-compatible: point existing Playwright scripts at the session with connectOverCDP and the page-level automation runs unchanged
›Composes with /credentials, /integrations, /phone — the answer when an API doesn't exist

🚧 Capabilities evolving — roadmap preview. /browser is not yet generally available in the public Naïve API. The surfaces, names, and parameters described below are illustrative of the direction the platform is heading and may change before launch. For what's live today, see usenaive.ai/docs.

Today we're introducing /browser, the primitive that lets an AI Employee operate any website. It combines managed Chrome DevTools Protocol sessions, persistent credentials, 2FA routing, and a live-view stream so you can watch the agent work. It is the answer for every workflow that doesn't have an API.

The problem: most real work happens in browsers without APIs

The dirty secret of agent infrastructure: APIs cover maybe 30% of the SaaS surface an autonomous business actually needs. The remaining 70% — vendor portals, government filing sites, legacy admin panels, internal SaaS tools, captcha-gated workflows — only work in a browser.

The status quo:

Raw Playwright or Puppeteer in your own infrastructure — works for hobby projects; doesn't scale, doesn't handle 2FA, breaks under any anti-bot pressure.
Managed CDP providers like Steel.dev — solve the infra problem; you still build the credential layer, the 2FA routing, the audit log, and the live view.
Anthropic Computer Use — strong for visual workflows; doesn't compose with the rest of the runtime, no per-Employee scoping, no credential vault.

Until now. With /browser, sessions bind to an Employee, credentials persist via /credentials, 2FA codes route via /phone, and the live view is the operator's debug surface.

How /browser works

The workflow is three steps:

Start — ensure_browser_session returns a CDP WebSocket URL. The session is bound to the calling Employee.
Drive — connect Playwright, Puppeteer, or any CDP client. Run your automation. Persistent credentials and 2FA routing happen at the runtime layer.
Stop — stop_browser_session terminates explicitly, or the session times out per its TTL.
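
The three steps above can be sketched as one small wrapper that guarantees the Stop step runs even when the Drive step throws. This is a minimal sketch, not a shipped SDK: `withBrowserSession` and `FetchLike` are hypothetical names, and the endpoint and the `id` / `cdpUrl` fields are assumed from the code sample later in this post, so they may change before launch.

```typescript
// Hypothetical helper. Endpoint paths and response fields are assumptions
// taken from the roadmap-preview sample in this post.
type BrowserSession = { id: string; cdpUrl: string };

type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const API_BASE = "https://api.usenaive.ai/v1";

async function withBrowserSession<T>(
  opts: { apiKey: string; employeeId: string; fetchImpl: FetchLike },
  drive: (session: BrowserSession) => Promise<T>,
): Promise<T> {
  const headers = {
    Authorization: `Bearer ${opts.apiKey}`,
    "Content-Type": "application/json",
  };

  // Start: create a session bound to the calling Employee.
  const res = await opts.fetchImpl(`${API_BASE}/browser/sessions`, {
    method: "POST",
    headers,
    body: JSON.stringify({ employeeId: opts.employeeId }),
  });
  if (!res.ok) throw new Error(`Failed to start session: HTTP ${res.status}`);
  const session = (await res.json()) as BrowserSession;

  try {
    // Drive: hand the session to the caller's automation.
    return await drive(session);
  } finally {
    // Stop: terminate explicitly instead of waiting for the TTL to expire.
    await opts.fetchImpl(`${API_BASE}/browser/sessions/${session.id}`, {
      method: "DELETE",
      headers,
    });
  }
}
```

In a real script you would pass the global `fetch` as `fetchImpl` and call `chromium.connectOverCDP(session.cdpUrl)` inside the `drive` callback. Injecting the fetch function also makes the lifecycle easy to unit-test without touching the network.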

Sessions are real Chrome — full DevTools Protocol surface, accurate fingerprints, residential-quality network paths.

Two ways to drive a session: prompt or code

1. Natural language

naive browser run \
  --employee emp_01HXY... \
  --prompt "Log into our Mercury account and download last month's bank statement"

The CLI starts a session, pulls the stored Mercury login via /credentials, navigates to the statement and downloads it, then stops the session. The result is a Document object on the Company.

2. Code

import { chromium } from "playwright";

const API = "https://api.usenaive.ai/v1";
const headers = {
  Authorization: `Bearer ${process.env.NAIVE_API_KEY}`,
  "Content-Type": "application/json",
};

// Parse a JSON response, failing loudly on non-2xx status codes.
async function json(res: Response) {
  if (!res.ok) throw new Error(`Request failed: HTTP ${res.status}`);
  return res.json();
}

// Start: create a session bound to the Employee.
const session = await fetch(`${API}/browser/sessions`, {
  method: "POST",
  headers,
  body: JSON.stringify({ employeeId: "emp_01HXY..." }),
}).then(json);

try {
  // Drive: attach Playwright to the managed Chrome instance over CDP.
  const browser = await chromium.connectOverCDP(session.cdpUrl);
  const context = browser.contexts()[0];
  // Reuse the default page if one exists; otherwise open a new one.
  const page = context.pages()[0] ?? (await context.newPage());

  await page.goto("https://mercury.com/login");

  const creds = await fetch(`${API}/credentials/mercury_login`, { headers }).then(json);

  await page.fill("#email", creds.email);
  await page.fill("#password", creds.password);
  await page.click("button[type=submit]");

  // 2FA: fetch the code routed via /phone (a single request; retry if the SMS has not arrived yet)
  const { code } = await fetch(
    `${API}/phone/sms-code?employeeId=emp_01HXY...`,
    { headers },
  ).then(json);
  await page.fill("#otp", code);

  // ...do the work...

  // Disconnect the Playwright client before releasing the session.
  await browser.close();
} finally {
  // Stop: always release the session, even if the automation throws.
  await fetch(`${API}/browser/sessions/${session.id}`, { method: "DELETE", headers });
}
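
The sample above makes a single request for the 2FA code, but an SMS can take a few seconds to arrive. A small polling loop with a deadline is one way to handle that. This is a sketch under assumptions: `pollForCode` is a hypothetical helper, and how the /phone endpoint responds when no code has arrived yet is not specified in this post, so the lookup is passed in as a callback that returns `null` until a code is available.

```typescript
// Hypothetical helper: retries a caller-supplied lookup until it yields a code
// or the deadline passes. The lookup returning null for "not yet" is an assumption.
async function pollForCode(
  fetchCode: () => Promise<string | null>,
  opts: { timeoutMs?: number; intervalMs?: number } = {},
): Promise<string> {
  const timeoutMs = opts.timeoutMs ?? 60_000;
  const intervalMs = opts.intervalMs ?? 2_000;
  const deadline = Date.now() + timeoutMs;

  while (true) {
    const code = await fetchCode();
    if (code) return code;

    // Give up if another wait would carry us past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`No 2FA code received within ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

To use it, wrap the /phone request in the callback and return `null` whenever the response carries no code, then pass the result to `page.fill("#otp", code)`.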

Watch the browser as it runs

Every session streams a live view back to the operator dashboard. This is the difference between agent automation that works and agent automation you can debug. When the agent gets stuck on a captcha, when a UI changes shape, when a flow stalls — you see it happen in real time.

# Start a session with live view (enabled by default)
naive browser session start --employee emp_01HXY...
 
# Output includes a liveViewUrl — paste into a browser, watch the agent work

The live-view URL is signed and time-limited; only the operator who owns the Company can view it. For teams, invite-only links are scoped per session.
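
For readers unfamiliar with signed, time-limited links, here is a generic sketch of how such URLs commonly work: the server signs the session ID and an expiry timestamp with an HMAC, and rejects any link that is expired or tampered with. This is not Naïve's actual scheme, which this post does not describe. The function names, the query parameters (`session`, `exp`, `sig`), and the example host are all made up for illustration.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Generic illustration only. Parameter names and URL layout are assumptions.
function signLiveViewUrl(
  baseUrl: string,
  sessionId: string,
  secret: string,
  expiresAtSec: number,
): string {
  const sig = createHmac("sha256", secret)
    .update(`${sessionId}.${expiresAtSec}`)
    .digest("hex");
  const url = new URL(baseUrl);
  url.searchParams.set("session", sessionId);
  url.searchParams.set("exp", String(expiresAtSec));
  url.searchParams.set("sig", sig);
  return url.toString();
}

function verifyLiveViewUrl(signedUrl: string, secret: string, nowSec: number): boolean {
  const url = new URL(signedUrl);
  const sessionId = url.searchParams.get("session");
  const exp = Number(url.searchParams.get("exp"));
  const sig = url.searchParams.get("sig");
  if (!sessionId || !sig || !Number.isFinite(exp)) return false;
  if (nowSec > exp) return false; // the link has expired

  const expected = createHmac("sha256", secret)
    .update(`${sessionId}.${exp}`)
    .digest("hex");
  const a = Buffer.from(sig, "hex");
  const b = Buffer.from(expected, "hex");
  // Constant-time comparison avoids leaking how many characters matched.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Because the expiry is part of the signed payload, a viewer cannot extend a link's lifetime by editing the `exp` parameter.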

Persistent credentials, scoped per Employee

Logins are the slowest part of any browser workflow. Re-authing on every run wastes time and ages the account fingerprint. /browser composes with /credentials so cookies, localStorage, and session tokens persist between runs:

# First run: sign in, then snapshot the session
naive browser session start --employee emp_01HXY...
# ...sign in via Playwright or prompt...
naive credentials snapshot --session sess_... --ref stripe_session
 
# Subsequent runs: restore from the snapshot
naive browser session start --employee emp_01HXY... --restore-from stripe_session
# Already signed in.

Persistent credentials are encrypted at rest and decrypted only at session start. Audit logs record every restore.
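
To make "encrypted at rest, decrypted only at session start" concrete, here is a generic sketch of sealing a session snapshot with AES-256-GCM using Node's built-in crypto module. It is an illustration of the general technique, not Naïve's storage implementation. The `SessionSnapshot` shape and both function names are assumptions.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical snapshot shape: the cookies and localStorage a restore would need.
type SessionSnapshot = {
  cookies: { name: string; value: string; domain: string }[];
  localStorage: Record<string, string>;
};

// Encrypt before writing to storage. Layout: 12-byte IV | 16-byte auth tag | ciphertext.
function encryptSnapshot(snapshot: SessionSnapshot, key: Buffer): string {
  const iv = randomBytes(12); // fresh nonce for every encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(snapshot), "utf8"),
    cipher.final(),
  ]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([iv, tag, ciphertext]).toString("base64");
}

// Decrypt only when a session is about to be restored.
// Throws if the key is wrong or the blob was modified.
function decryptSnapshot(blob: string, key: Buffer): SessionSnapshot {
  const raw = Buffer.from(blob, "base64");
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28);
  const ciphertext = raw.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
  return JSON.parse(plaintext.toString("utf8")) as SessionSnapshot;
}
```

GCM is an authenticated mode, so a tampered snapshot fails to decrypt rather than silently restoring corrupted cookies.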

What you can build with /browser

Operate vendor portals that don't have APIs — Cloud provider billing pages, registrar admin panels, Mercury / Brex statements, government filing sites. Compose /browser with /credentials and /phone for full account-opening flows.

Run end-to-end onboarding for new Employees — Sign up for Mercury, Google Workspace, Slack — every flow that requires SMS verification, captcha solving, or a few minutes of clicking. The onboarding Employee handles it.

Build agent-native scrapers for sites that block bots — Cloudflare-fronted sites, JS-heavy SPAs, captcha-gated forums. The session pool routes around generic anti-bot fingerprints; the runtime tells you when a target is detecting automation.

Audit your own SaaS surface — Have an Employee log into every tool you use, snapshot pricing, billing, and seat counts, and report any drift. Compose with /email so the snapshot lands in your inbox.

Sandbox high-risk experiments visually — When a new agent loop is unproven, run it in a /browser session with the live view open. Watch what it does before letting it run unattended.
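
For the SaaS audit use case above, the "report any drift" step comes down to comparing the latest snapshot with the previous one. A minimal sketch, assuming a hypothetical `ToolSnapshot` shape with plan, price, and seat count (the tool names in the usage example are placeholders):

```typescript
// Hypothetical shape of what an Employee records per tool on each audit run.
type ToolSnapshot = { plan: string; monthlyPriceUsd: number; seats: number };

type Drift = {
  tool: string;
  field: keyof ToolSnapshot;
  before: string | number;
  after: string | number;
};

// Compare two audit runs and list every field that changed on a tool present in both.
function diffSnapshots(
  prev: Record<string, ToolSnapshot>,
  curr: Record<string, ToolSnapshot>,
): Drift[] {
  const drifts: Drift[] = [];
  for (const [tool, now] of Object.entries(curr)) {
    const before = prev[tool];
    if (!before) continue; // newly added tool, nothing to compare against
    for (const field of ["plan", "monthlyPriceUsd", "seats"] as const) {
      if (before[field] !== now[field]) {
        drifts.push({ tool, field, before: before[field], after: now[field] });
      }
    }
  }
  return drifts;
}

// Usage with placeholder data:
const lastMonth = { "tool-a": { plan: "team", monthlyPriceUsd: 100, seats: 5 } };
const thisMonth = { "tool-a": { plan: "team", monthlyPriceUsd: 120, seats: 6 } };
console.log(diffSnapshots(lastMonth, thisMonth));
```

The resulting list is what you would hand to /email so that only the changes, not the full snapshot, land in your inbox.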

Get started

Drop this starter prompt into any coding agent to wire up Naïve:

Read https://usenaive.ai/skill.md and use it to set up Naïve in my project.

Read the docs: usenaive.ai/docs/getting-started/browser
Quickstart: usenaive.ai/docs/getting-started/quickstart
Background reading: Chrome DevTools Protocol and Playwright docs.
Join the community on Discord

Frequently Asked Questions

What is /browser?

/browser is the Naïve primitive that gives an AI Employee a real, fully-featured Chrome browser session. It supports Playwright, Puppeteer, and CDP-compatible tooling, persists credentials across sessions via /credentials, routes 2FA codes correctly via /phone, and streams a live view back to the operator dashboard for debugging.

How does /browser work?

Call ensure_browser_session and Naïve provisions a Chrome instance bound to the Employee. Connect via the returned WebSocket URL with Playwright or any CDP client. The session persists for the configured TTL; stop_browser_session terminates it explicitly. Every action is logged.

When should I use /browser vs /integrations?

Use /integrations when an API exists for what you need (Slack, GitHub, Webflow, Shopify). Use /browser when no API exists — vendor portals, government filings, legacy SaaS, captcha-gated workflows. /browser is the universal fallback; /integrations is the safer, cheaper, faster path when both exist.

How does /browser handle anti-bot detection?

Sessions use real residential-quality fingerprints, real-Chrome user-agents, and human-like cadences. For high-friction targets (Cloudflare-protected sites, captcha walls), Naïve maintains fingerprint pools that avoid most generic anti-bot signals. Targets that detect any automation will still detect /browser — there's no magic.

How much does /browser cost?

Sessions are billed per hour of active time. Live-view streaming is included. See the pricing page for current rates.

What's the difference between /browser and Anthropic Computer Use or Playwright direct?

Anthropic Computer Use is a model-side interface for visual browser control. Playwright is a client library. /browser composes a managed CDP session with credential persistence (/credentials), 2FA routing (/phone), and Employee identity binding — so the agent can complete real workflows, not just navigate to a URL.

How do I get started with /browser?

Run naive browser session start --employee emp_01HXY... and Naïve returns a WebSocket URL you can connect to with Playwright. The full quickstart is at usenaive.ai/docs/getting-started/quickstart.

Dennis Zax, CTO

CTO of Naïve. Building the open-source agent runtime.

@denniszax