Cursor SDK vs Browserbase vs OpenAI Apps SDK: 3 Substrates
Three harness substrates are landing in CI/CD pipelines this month. We compare use case, distribution, lock-in, and pricing across all three.
In the last five days, the agent-runtime question went from “which model do I use?” to “which substrate do I deploy on?” That is a structural shift, and it is being driven by three near-simultaneous launches that companies will pick between for the next six months of agent embedding work:
- Cursor SDK went into public beta on April 29, exposing the Cursor harness — codebase indexing, MCP servers, skills, hooks, subagents — as a TypeScript runtime that “you can run from CI/CD pipelines, embed in customer-facing products, or wire into kanban-driven ticket-to-PR flows.”
- Browserbase Skills crossed 1.6k stars on GitHub trending with a nine-skill bundle that turns Claude Code into a remote-browser agent — anti-bot stealth, CAPTCHA solving, residential proxies, all addressable as bb CLI commands the model invokes inline.
- OpenAI’s Apps SDK — launched at DevDay 2025 and reinforced by the April 15 Agents SDK update — is the MCP-native runtime where 800M weekly ChatGPT users meet your tool. Pilot partners include Booking.com, Canva, Coursera, Figma, Expedia, Spotify, and Zillow.
The convergence is not coincidence. Y Combinator’s “Software for Agents” pitch and the Cursor, Browserbase, and OpenAI launches all crystallized in the same week, and the read from today’s convergence cluster is that the harness layer has finished commoditizing as a feature and is now monetizing as a runtime. Pure orchestration is no longer a category. Substrate is.
This piece is the positioning matrix for the three substrates, the lock-in profile each one creates, and the migration path from skills-directory experiments (which we covered in the Cursor 12K-to-200-LoC case study) to production runtime selection. We’re explicitly not recommending one — the right answer depends on where your distribution lives.
What “Harness Substrate” Actually Means in 2026
The terminology has shifted fast. A year ago, “agent harness” meant the while True: agent.step() loop wrapping an API call. Today, that loop is a CLI primitive — Codex /goal, Claude Code’s session loop, Cursor’s runtime — and it ships free with every major coding-agent platform. So if the loop is commodity, what’s the new boundary?
The answer is the runtime substrate — the bundle of (a) the loop, (b) sandboxed compute, (c) authenticated tool access, (d) skill/MCP discovery, (e) distribution surface. A substrate is what you deploy on, not what you build with. And substrates have very different shapes:
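The five-part bundle can be made concrete as a type. This is a mental model only — none of these names come from any vendor’s actual SDK:

```typescript
// Hypothetical model of a "runtime substrate" as defined above. The field
// names label the five parts (a)-(e); they are not a real API.
interface Substrate {
  loop: (goal: string) => Promise<string>; // (a) the agent loop
  sandbox: "cloud-vm" | "remote-browser" | "partner-runtime"; // (b) sandboxed compute
  tools: string[]; // (c) authenticated tool access
  discovery: "skills-dir" | "plugin-marketplace" | "app-directory"; // (d) skill/MCP discovery
  distribution: string; // (e) the surface where users meet the agent
}

// One of the three launches, expressed in this (invented) model:
const cursorSdk: Substrate = {
  loop: async (goal) => `draft PR for: ${goal}`,
  sandbox: "cloud-vm",
  tools: ["codebase-index", "mcp", "hooks", "subagents"],
  discovery: "skills-dir",
  distribution: "engineers already on Cursor",
};
```

Filling in the same five fields for Browserbase (remote-browser sandbox, plugin-marketplace discovery) and the Apps SDK (partner-runtime sandbox, app-directory discovery) is a quick way to see that the substrates differ on every axis except the loop.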
- IDE-driven: Cursor SDK runs your agents on the same cloud VMs that power Cursor’s Cloud Agents. Distribution is “engineers already have Cursor open.” Lock-in is the Composer model line and the .cursor/skills directory.
- Browser-native: Browserbase Skills runs your agents inside remote, fingerprint-resistant browser sessions. Distribution is “Claude Code users who installed the plugin.” Lock-in is the Browserbase platform — sessions, contexts, and the Functions runtime.
- ChatGPT-distributed: OpenAI’s Apps SDK runs your MCP server alongside ChatGPT itself. Distribution is “every paying ChatGPT user.” Lock-in is the App Directory and the partnership/review process for inclusion.
Three completely different distribution graphs. Three completely different lock-in profiles. The same underlying primitive — a model that loops, calls tools, and produces output.
Cursor SDK — Engineers as the Distribution
The Cursor SDK launch is the most operationally specific of the three. Available as npm install @cursor/sdk, the SDK exposes agents that get the full Cursor harness: codebase indexing, MCP servers, skills auto-discovery from .cursor/skills/, hooks via .cursor/hooks.json, and subagents that delegate work via named prompts.
The named launch customers — Faire, Rippling, Notion, C3 AI — are not pilot decorations. They are the primary use case in compressed form. Per the launch post, teams at Rippling and Notion are “running agents that pick up Linear or Jira tickets, understand the requirement, generate the implementation, write tests, and open a draft PR for engineer review.” That’s the canonical Cursor SDK shape: an asynchronous CI/CD-adjacent worker that lives in your engineering pipeline and produces draft PRs for human review.
The pricing surface is token-based, not seat-based. Composer 2 Standard runs $0.50 per million input tokens and $2.50 per million output tokens, billed at the API level — not at the seat level. That changes the deployment math materially: you can spin up a thousand parallel agents in CI without a thousand Cursor seats, and you only pay for the work that actually executes. This is the structural advantage Cursor is pressing against the seat-priced incumbents. MarkTechPost’s coverage frames it cleanly: “sandboxed cloud VMs, subagents, hooks, and token-based pricing.”
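The per-run math is easy to sanity-check. The $0.50/$2.50 rates are the Composer 2 Standard figures quoted above; the token counts below are made-up examples:

```typescript
// Composer 2 Standard launch pricing: $0.50 per 1M input tokens,
// $2.50 per 1M output tokens. Token counts are illustrative only.
const INPUT_USD_PER_M = 0.5;
const OUTPUT_USD_PER_M = 2.5;

function runCostUSD(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1e6) * INPUT_USD_PER_M +
    (outputTokens / 1e6) * OUTPUT_USD_PER_M
  );
}

// A hypothetical ticket-to-PR run that reads 2M tokens of repo context
// and emits 400k tokens of diff + tests:
runCostUSD(2_000_000, 400_000); // → 2.00 (dollars)
```

At $2 per run, a thousand parallel CI agents cost $2,000 in tokens and zero seats — which is exactly the deployment math the seat-priced incumbents cannot match.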
💡 Lock-in profile: Cursor SDK locks you into the Composer model line and the .cursor/skills directory. Skills written for Cursor SDK are largely portable to Claude Code (the SKILL.md spec is shared), but the subagent and hook primitives are Cursor-specific. The exit ramp is the cross-runtime layer — see the gstack open-source harness builder and the cc-switch pattern — but you pay an integration cost on the way out.
The signal Cursor is sending with named launch customers and CI/CD examples is that they want to own the engineering-pipeline runtime slot. Not the IDE — they already have that. The pipeline. That’s the new battleground.
Browserbase Skills — The Browser as Tool
Browserbase Skills is the substrate that picked the smallest fight and is winning it cleanly. The repo description is “a set of skills for enabling Claude Code to work with Browserbase through browser automation and the official bb CLI.” Nine skills, all installable via /plugin install browse@browserbase from Claude Code’s marketplace:
| Skill | What it does |
|---|---|
| browser | Web interaction via remote Browserbase sessions (anti-bot stealth, residential proxies, CAPTCHA solving) |
| browserbase-cli | Direct platform API — sessions, projects, contexts, functions |
| functions | Serverless browser deployment to Browserbase cloud |
| site-debugger | Diagnostics for failing automations — bot detection, selectors, timing, auth, CAPTCHAs |
| browser-trace | DevTools protocol capture with searchable per-page buckets |
| bb-usage | Terminal dashboard for usage stats and cost forecasts |
| cookie-sync | Chrome-to-context cookie sync for authenticated access |
| fetch | Static page retrieval without browser sessions |
| search | Web search with structured results |
The strategic move here is subtle: Browserbase did not try to be a coding-agent runtime. It built tools for the agents that already exist. By deploying as a Claude Code plugin instead of a competing harness, Browserbase made every Claude Code user a potential customer without having to win them away from Anthropic’s distribution. The substrate they’re competing on is browser automation as a service — anti-bot fingerprints, session replay, captcha resolution, agent identity. None of those are things Anthropic is going to ship, and few are things Cursor is going to ship.
The companion product, Stagehand, is the open-source SDK for browser agents — the layer that lets you write page.act("click the cart button") and have the model handle the selector logic. The Browserbase business model is the cloud runtime underneath: scale-out browser sessions, residential proxies, function-style deployments. Open source on top, paid runtime underneath — the Browserbase internal-agents post walks through the same pattern as the production case study.
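The page.act("...") pattern deserves a closer look: the caller states intent in natural language and the runtime resolves it to selectors. A minimal mock of that interface shape — this illustrates the pattern only, not Stagehand’s actual implementation:

```typescript
// Illustrative mock of the intent-based `act()` pattern described above.
// A real runtime would hand the instruction to a model, resolve it to a
// DOM selector, and perform the action in a remote browser session.
// Here we just record the intent, to show the interface shape.
class MockPage {
  actions: string[] = [];

  async act(instruction: string): Promise<void> {
    // Real implementation: instruction -> selector -> click/type in session.
    this.actions.push(instruction);
  }
}

const page = new MockPage();
await page.act("click the cart button");
await page.act("type 'SAVE10' into the promo code field");
// page.actions now holds the intent log — the selectors never leak into
// your code, which is why the calling agent stays portable across sites.
```

The design point is that intent, not selectors, becomes the contract — so the same agent code survives site redesigns that would break a hand-written Playwright script.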
💡 Lock-in profile: Browserbase Skills is the lowest-lock-in of the three. Stagehand is open source; you can self-host browser sessions; you can swap the Claude harness underneath. The lock-in lives in Browserbase’s paid runtime — anti-bot evasion, fingerprint randomization, residential proxy pools — which is genuinely hard to replicate in-house. If your agents need to log into customer SaaS to scrape or operate, Browserbase’s runtime is the default. If they only need to read public pages, you can leave any time.
The shape this takes in production: any agent that needs to operate inside a browser session — booking, e-commerce, internal-tool automation, support-ticket resolution — uses Browserbase as the tool layer and Cursor SDK / Claude Code / OpenAI Agents SDK as the orchestration layer. It is not an either-or with the other two substrates. It is a complement.
OpenAI Apps SDK — Distribution as the Lock-In
OpenAI’s Apps SDK is the substrate with the largest distribution graph and the heaviest gravity. Launched at DevDay 2025 (October 6) and reinforced by the April 15 Agents SDK update, the architecture is simple in principle:
- You ship an MCP server that defines your tools and capabilities.
- You optionally ship a web component rendered as an iframe inside ChatGPT.
- The Apps SDK handles discovery, authentication, billing surfaces, and rendering.
- Your app runs alongside ChatGPT itself, available to every ChatGPT user as a /your-app mention.
The April 15 update added Sandbox Agents as a runtime layer — a SandboxAgent / Manifest / SandboxRunConfig stack with backends for local development (UnixLocalSandboxClient, DockerSandboxClient) and hosted partner integrations for Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel. That partner list is the tell. OpenAI is not building the compute layer — they are integrating with everyone who already does. The Apps SDK is the distribution layer; the partner runtimes are the substrate underneath.
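A hedged sketch of what backend selection in that stack might look like. Only the backend class names (UnixLocalSandboxClient, DockerSandboxClient) come from the launch material; every signature and behavior below is invented for illustration:

```typescript
// Invented stand-ins for the sandbox backends named in the April 15
// update. The names are from the launch material; the interface, method
// signatures, and behavior here are purely illustrative.
interface SandboxClient {
  kind: string;
  run(cmd: string): Promise<string>;
}

class UnixLocalSandboxClient implements SandboxClient {
  kind = "unix-local";
  async run(cmd: string) { return `local: ${cmd}`; }
}

class DockerSandboxClient implements SandboxClient {
  kind = "docker";
  async run(cmd: string) { return `docker: ${cmd}`; }
}

// Hypothetical config switch: local process for dev loops, container
// isolation for CI. Hosted partners (E2B, Modal, etc.) would slot in
// as further SandboxClient implementations behind the same interface.
function pickBackend(env: "dev" | "ci"): SandboxClient {
  return env === "dev" ? new UnixLocalSandboxClient() : new DockerSandboxClient();
}
```

The point of the sketch is the interface seam: if the backends really are interchangeable behind one contract, the partner list stops being lock-in and becomes a menu.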
Pilot apps in the launch wave: Booking.com, Canva, Coursera, Figma, Expedia, Spotify, Zillow. Notice the pattern — these are not developer tools. They are consumer destinations. The shape OpenAI is pushing is that ChatGPT becomes the aggregator and your existing consumer surface becomes a tool inside it. The agent is the chat. Your app is the back-end.
💡 Lock-in profile: Apps SDK is the highest-lock-in of the three. Your distribution is gated by ChatGPT’s directory inclusion and review process. Your runtime is structurally biased toward partners OpenAI has integrated with. Pricing is tied to the App Directory monetization rules (still emerging — see the Render field guide for the current state of in-app billing). The exit ramp is the MCP standard itself: because everything is built on MCP, your server can theoretically be exposed to other harnesses. In practice, the integration surface — auth, billing, UX — is where the lock-in concentrates.
The three substrates are not direct competitors. They are competing for different ends of the same value chain: Cursor for the engineering pipeline runtime, Browserbase for the browser-tool runtime, OpenAI Apps SDK for the consumer-distribution runtime.
The Positioning Matrix
Here’s the comparison most teams need:
| Dimension | Cursor SDK | Browserbase Skills | OpenAI Apps SDK |
|---|---|---|---|
| Primary use case | CI/CD-adjacent code agents (ticket → PR) | Browser-tool layer for agents | ChatGPT-distributed apps |
| Distribution surface | Engineers already on Cursor | Claude Code marketplace | 800M weekly ChatGPT users |
| Runtime model | Cloud VMs sandboxed per agent | Remote browser sessions | MCP server + iframe component |
| Pricing surface | Token-based ($0.50/M in, $2.50/M out for Composer 2 Standard) | Per-session / per-page (Browserbase platform) | Likely revenue share + token (App Directory rules emerging) |
| Lock-in level | Medium (skills portable, subagents/hooks not) | Low (open-source SDK, paid runtime is replaceable in principle) | High (directory inclusion gates distribution) |
| Best when | You ship code and want async agents in CI | Your agent has to operate a logged-in browser | Your value is consumer-side and you want ChatGPT distribution |
| Worst when | You don’t already have engineers on Cursor | Your agent doesn’t touch a browser | You can’t accept the directory/review surface |
| Named customers | Faire, Rippling, Notion, C3 AI | Claude Code marketplace users (1.6k stars) | Booking.com, Canva, Coursera, Figma, Expedia, Spotify, Zillow |
The pattern that emerges when you read the matrix horizontally: most production agent stacks in late 2026 will use two of the three. A Cursor SDK code agent that calls Browserbase Skills for browser-side QA. An OpenAI Apps SDK consumer app whose back-end operates a Browserbase-hosted browser to fulfill orders. A Cursor-built internal agent that surfaces in ChatGPT via the Apps SDK MCP-export path.
This is the same pattern AWS / GCP / Azure formed in 2014–2018: not winner-take-all, but each cloud occupying a defensible niche while being interoperable enough to combine. The agent-runtime layer is hitting the same equilibrium five years faster.
Migration Paths and Exit Ramps
The strategic question every team is asking right now is: how reversible is this choice? The honest answers:
- Cursor SDK → Claude Code / Codex: Your skills (SKILL.md files) port cleanly. Your subagent and hook configs do not. Plan for ~1–2 weeks of adapter work per major flow. The cross-runtime tools at cc-switch help.
- Browserbase Skills → self-hosted Playwright: Drop-in for the technical surface. The hard part is replicating the anti-bot, residential-proxy, and session-replay infrastructure that makes Browserbase’s runtime defensible. Stagehand is open source; the cloud runtime is what you’re buying.
- OpenAI Apps SDK → another MCP host: Theoretically clean (MCP is an open standard). Practically expensive (you lose the ChatGPT distribution graph that was the whole point). The exit is more likely to be “supplement Apps SDK with your own native channel” rather than “leave Apps SDK.”
The convergence-report take from Radar frames the macro reality: “Anyone shipping pure orchestration without distribution should be selling, not raising.” That’s the exit-ramp signal. Pure-orchestration startups will get acquired or absorbed; substrate-and-distribution plays will keep raising. The three SDKs we just compared are all on the latter side of that line.
💡 The one decision that matters most: which distribution graph are you closest to? If your users are engineers, Cursor. If your agent has to run inside an authenticated browser, Browserbase (composed with one of the others). If your value lives on the consumer side, OpenAI Apps SDK. Pick that first. The runtime details follow.
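The decision rule in that tip is mechanical enough to write down. The mapping is this article’s own positioning matrix; the function is just a memo aid, not anyone’s shipping code:

```typescript
// Encodes the substrate-selection heuristic from the positioning matrix:
// pick by distribution graph first; compose Browserbase in whenever the
// agent must drive an authenticated browser.
type Distribution = "engineers" | "consumers" | "claude-code-users";

function pickSubstrate(dist: Distribution, needsBrowser: boolean): string[] {
  const primary =
    dist === "engineers" ? "Cursor SDK" :
    dist === "consumers" ? "OpenAI Apps SDK" :
    "Claude Code + Browserbase Skills";
  // Browserbase is a complement, not a competitor: it stacks onto
  // whichever orchestration substrate the distribution answer picked.
  return needsBrowser && dist !== "claude-code-users"
    ? [primary, "Browserbase Skills"]
    : [primary];
}

pickSubstrate("engineers", true); // → ["Cursor SDK", "Browserbase Skills"]
```

Note that the browser flag never changes the primary pick — it only adds a layer. That is the "hedge by composition" point in one line of code.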
Conclusion: Substrate Selection Is the New Cloud Selection
Six months ago, the question was “Anthropic or OpenAI for the model?” Three months ago, it was “Claude Code or Codex for the harness?” Today, it’s “Cursor SDK, Browserbase Skills, or OpenAI Apps SDK for the runtime substrate?”
That ladder of abstraction is exactly the same one the cloud business climbed in 2010–2014: from “which servers” to “which managed services” to “which platform-of-platforms.” The people who picked AWS in 2012 are not regretting it; the people who built on Heroku because it was the easiest path are. The same shape is forming now in the agent layer, and the substrate choices made in May 2026 will look in retrospect like the platform-of-platforms commitments of the next cycle.
The migration math is real but not destiny. MCP being the open standard underneath all three substrates means the worst-case migration cost is bounded. The named customers (Faire, Rippling, Notion, Booking.com, Figma, Spotify) are voting with their integration budgets. And the YC / Karpathy / Naval thesis convergence we tracked in today’s convergence report confirms that the runtime layer — not the model layer, not the harness layer — is the next 12-month battleground.
Pick by distribution. Hedge by composition. Run the matrix on your own use cases this week, and don’t expect any single substrate to win all three columns.