Harness Wars: Who Owns Your Coding Agent?

[Figure: Interconnected nodes representing different coding agents linked through a central orchestration hub, the harness wars visualized.]

On May 31, 2026, five of the top twenty trending repositories on GitHub are orchestration tools. Not models. Not agents. Not frameworks for building agents. Tools for managing agents you already have — switching between them, sandboxing them, multiplexing them in split terminals, and generating the skill architectures they run on.

CC Switch, a desktop app that lets you swap between Claude Code, Codex, Gemini CLI, OpenCode, and OpenClaw from one interface, crossed 86,800 stars. RevFactory’s Harness, a meta-skill that designs agent teams from six battle-tested architecture patterns, is gaining 318 stars per day. Matt Pocock’s Sandcastle orchestrates sandboxed coding agents in Docker. Herdr multiplexes agents in tiled terminal panes. Hermes WebUI provides a web-based mission control for agent fleets.

Garry Tan — the Y Combinator CEO whose gstack harness collected 82K stars in two months — named the frame: harness wars. His strategic observation is the thesis of this article: someone else’s harness shouldn’t lock you in.

💡 The moat is visibly migrating from the model to the orchestration layer. The agent that runs your code matters less than the system that decides which agent runs your code, how it’s sandboxed, and whether you can swap it out tomorrow.

The new stack: what’s actually shipping

The harness wars aren’t hypothetical. They’re shipping production code this week. Here’s the taxonomy of what’s trending and what each layer does.

Layer 1: Switchers — CC Switch and the portability thesis

CC Switch exists for one reason: developers don’t trust any single provider to own their workflow. The app manages five CLI tools from one interface, handling provider switching, MCP server configurations, skill synchronization, usage tracking, and cost analytics. It added 50+ provider presets with hot-switching and auto-failover, and a Lightweight Mode keeps it running from the system tray.

The 86,800 stars aren’t vanity metrics. They measure a specific anxiety: developers who’ve invested weeks in CLAUDE.md files, MCP configurations, and workflow muscle memory for one agent want insurance that they can move. When Anthropic changes pricing, when Codex ships a new feature, when Gemini CLI gets better at a specific language, the cost of switching should be a button click, not a weekend of reconfiguration.

This is the portability thesis, and it’s the single biggest structural insight in the harness wars: whoever owns the orchestration layer owns the developer. CC Switch’s bet is that nobody should own it.

Layer 2: Sandboxers — Sandcastle and isolation

Matt Pocock’s Sandcastle solves a different problem: what happens when agents need to modify your codebase in parallel without stepping on each other?

The answer is Docker-based sandboxing. Each agent gets its own container, and commits made inside the sandbox get patched back to the host. With a local provider, the system runs 100% offline, with no cloud dependency. You invoke agents with a single sandcastle.run() call, and Sandcastle handles branch strategies and isolation. It ships with built-in providers for Docker, Podman, and Vercel, and you can write your own.
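The container-per-agent idea can be sketched without Sandcastle at all. The example below only builds the commands and does not run them. The branch naming scheme, the `SandboxPlan` record, and the assumption that each agent works in its own clone with its branch already created are mine, not Sandcastle’s. The `docker run` and `git format-patch` flags are standard ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SandboxPlan:
    agent: str
    branch: str
    docker_cmd: list[str]   # starts the agent in its own container
    export_cmd: list[str]   # exports the agent's commits as patches


def plan_sandbox(agent: str, task_id: str, clone_dir: str, image: str,
                 agent_cmd: list[str], base: str = "main") -> SandboxPlan:
    """Build, but do not run, an isolated invocation for one agent."""
    branch = f"agent/{agent}/{task_id}"   # hypothetical naming scheme
    docker_cmd = [
        "docker", "run", "--rm",
        "-v", f"{clone_dir}:/workspace",  # per-agent clone, never the shared checkout
        "-w", "/workspace",
        image, *agent_cmd,
    ]
    # The patch stream this prints can be piped into `git am` on the host.
    export_cmd = ["git", "-C", clone_dir, "format-patch", "--stdout", f"{base}..{branch}"]
    return SandboxPlan(agent, branch, docker_cmd, export_cmd)


if __name__ == "__main__":
    plan = plan_sandbox("agent-a", "t1", "/tmp/clone-a", "agent-image:latest",
                        ["agent-cli", "fix failing tests"])
    print(" ".join(plan.docker_cmd))
    print(" ".join(plan.export_cmd))
```

Because each agent sees only its own clone, two agents cannot overwrite each other’s working tree. Conflicts surface later, when the patches are applied on the host, which is where you want to deal with them.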

This matters because Claude Code’s Dynamic Workflows can now spawn up to 1,000 subagents. A developer recently reported that ultracode spun up 85 agents in parallel for nearly 16 minutes from a single prompt. When you’re running that many agents, isolation isn’t a nice-to-have. It’s the difference between a clean merge and a catastrophic conflict.

Sandcastle is provider-agnostic by design. It doesn’t care whether the agent inside the sandbox is Claude Code, Codex, or OpenCode. The sandbox is the harness.

Layer 3: Multiplexers — Herdr and terminal-native orchestration

Herdr takes a different approach entirely. No GUI. No Electron. No Mac-only wrapper. Herdr is a terminal-native multiplexer written in Rust that lets you supervise multiple agents in tiled panes with workspaces, tabs, and mouse-native interaction.

Every agent is visible at a glance — blocked, working, or done. Sessions can be detached and reattached with agents continuing to run (like tmux for AI agents). Built-in integrations cover Claude Code, Codex, OpenCode, Hermes, and more — each one forwarding semantic state to Herdr over its socket API.
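As a rough illustration of what “forwarding semantic state” could look like on the receiving end, the sketch below keeps a table of pane states from newline-delimited JSON events. The event shape (`pane` and `state` keys) is an assumption made up for this example. Herdr’s actual socket protocol is not described here and may differ.

```python
import json
from collections import Counter
from enum import Enum


class AgentState(str, Enum):
    BLOCKED = "blocked"
    WORKING = "working"
    DONE = "done"


def apply_event(states: dict[str, AgentState], raw: str) -> None:
    """Update the pane table from one JSON event (hypothetical format)."""
    event = json.loads(raw)
    states[event["pane"]] = AgentState(event["state"])


def summary(states: dict[str, AgentState]) -> str:
    """One-line status, always listing blocked, working, done in that order."""
    counts = Counter(state.value for state in states.values())
    return ", ".join(f"{counts.get(s.value, 0)} {s.value}" for s in AgentState)


if __name__ == "__main__":
    panes: dict[str, AgentState] = {}
    for line in ('{"pane": "p1", "state": "working"}',
                 '{"pane": "p2", "state": "blocked"}',
                 '{"pane": "p1", "state": "done"}'):
        apply_event(panes, line)
    print(summary(panes))
```

The multiplexer only needs three states to be useful. Everything else stays in the agent’s own terminal.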

The design philosophy is revealing: you see the agent’s own terminal, not someone’s interpretation of it. This is the anti-lock-in position expressed as architecture. If the multiplexer doesn’t abstract away the agent’s native interface, switching agents is trivial — you just open a new pane.

Layer 4: Meta-skills — RevFactory Harness and team architecture

RevFactory’s Harness operates one level higher: it doesn’t orchestrate agents. It designs the orchestration. You describe your domain, and Harness generates a complete agent team — 4–5 specialist agents, an orchestrator skill, and domain-specific skills — from six pre-defined architecture patterns: Pipeline, Fan-out/Fan-in, Expert Pool, Producer-Reviewer, Supervisor, and Hierarchical Delegation.
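Three of the six pattern names map onto very small control-flow shapes. The sketch below is one plausible reading of Pipeline, Fan-out/Fan-in, and Producer-Reviewer, with an agent modeled as a plain function from task to result. It does not reproduce what Harness generates, and the `Agent` alias and function names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Agent = Callable[[str], str]   # stand-in: an agent maps a task to a result


def pipeline(stages: list[Agent], task: str) -> str:
    """Pipeline: each stage consumes the previous stage's output."""
    for stage in stages:
        task = stage(task)
    return task


def fan_out_fan_in(workers: list[Agent], merge: Callable[[list[str]], str], task: str) -> str:
    """Fan-out/Fan-in: run every worker on the same task in parallel, then merge."""
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        results = list(pool.map(lambda worker: worker(task), workers))  # input order kept
    return merge(results)


def producer_reviewer(producer: Agent, approve: Callable[[str], bool],
                      task: str, max_rounds: int = 3) -> str:
    """Producer-Reviewer: redraft until the reviewer approves or rounds run out."""
    draft = producer(task)
    for _ in range(max_rounds - 1):
        if approve(draft):
            break
        draft = producer(f"{task}\nrevise: {draft}")
    return draft


if __name__ == "__main__":
    print(pipeline([str.strip, str.upper], "  plan the migration  "))
    print(fan_out_fan_in([lambda t: t + " (api)", lambda t: t + " (ui)"], " | ".join, "review"))
```

Seen this way, choosing a pattern is a choice of control flow, which is why it can be generated from a domain description.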

The project reports +60% average quality improvement in author-measured A/B testing. Its companion project, harness-100, ships 100 production-ready agent team harnesses across 10 domains.

This is the most ambitious layer of the stack because it implies that agent orchestration is itself an automatable task. If Harness can generate good team architectures, then the moat isn’t in knowing how to orchestrate agents — it’s in knowing which pattern fits your domain. The meta-skill becomes the competitive advantage.

⚠️ 76–81% of surveyed enterprises express concern over proprietary dependencies in agent memory, model integration, and orchestration tooling. The lock-in risk is real, and the market knows it.

Why this matters: the Latent Space thesis

The Latent Space podcast crystallized what’s happening in a single frame: The Age of Async Agents. The first wave was copilots — AI inside your editor. The second wave was agents — AI running your terminal. The third wave, happening now, is fleets — multiple agents working in parallel on different parts of the same problem, supervised but not micromanaged.

Cursor’s CEO Michael Truell described the shift: “Cursor is no longer primarily about writing code. It is about helping developers build the factory that creates their software. This factory is made up of fleets of agents that they interact with as teammates.”

When agents are teammates, the harness is the org chart. And just like real org charts, the question of who controls the structure matters enormously.

The lock-in vectors

Where exactly does lock-in happen in the harness stack? There are four vectors, each corresponding to a layer where a vendor can trap you:

1. Configuration lock-in. Your CLAUDE.md files, .cursorrules, and agent-specific configuration represent weeks of accumulated knowledge about your codebase. CC Switch addresses this with cross-app skill synchronization, but the formats aren’t standardized. A skill written for Claude Code won’t run natively on Codex without translation.

2. Memory lock-in. Agent memory — conversation history, learned preferences, codebase context — is the most underappreciated lock-in vector. When your agent has 500 sessions of context about your repo, switching means starting from zero. The tokenmaxxing pattern makes this worse: the more tokens you’ve burned building context, the higher the switching cost.

3. Orchestration lock-in. Dynamic Workflows are specific to Claude Code. If you’ve built complex multi-agent pipelines using Claude Code’s workflow system, those pipelines don’t port to Codex or Gemini CLI. Sandcastle addresses this by keeping orchestration in your own TypeScript code, not in the vendor’s platform.

4. Tooling lock-in. MCP servers are theoretically portable — the protocol is open. But in practice, each agent has different MCP support depth, different tool calling patterns, and different permission models. A workflow that works perfectly on Claude Code’s MCP implementation may fail silently on another agent’s partial MCP support.
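One hedge against the first vector, configuration lock-in, is to keep project rules in a neutral structure and render them into each agent’s instruction file. The sketch below assumes the target file names CLAUDE.md, AGENTS.md, and GEMINI.md. Verify those against each tool’s documentation. The `ProjectRules` shape is invented for illustration and is not a standard.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectRules:
    summary: str
    conventions: list[str] = field(default_factory=list)
    commands: dict[str, str] = field(default_factory=dict)


# Assumed instruction-file names per agent; check each tool's docs before relying on them.
TARGETS = {"claude-code": "CLAUDE.md", "codex": "AGENTS.md", "gemini-cli": "GEMINI.md"}


def render(rules: ProjectRules) -> str:
    """Render the neutral rules as a plain Markdown instruction file."""
    lines = ["# Project rules", "", rules.summary, "", "## Conventions"]
    lines += [f"- {item}" for item in rules.conventions]
    lines += ["", "## Commands"]
    lines += [f"- {name}: `{cmd}`" for name, cmd in rules.commands.items()]
    return "\n".join(lines) + "\n"


def render_all(rules: ProjectRules) -> dict[str, str]:
    """Return {filename: content} for every assumed target."""
    body = render(rules)
    return {filename: body for filename in TARGETS.values()}


if __name__ == "__main__":
    rules = ProjectRules("Payments monorepo.", ["Use pnpm, not npm"], {"test": "pnpm test"})
    for filename, content in render_all(rules).items():
        print(filename, len(content))
```

This only covers plain instructions. Agent-specific features such as skills or hooks would still need per-target translation, which is exactly the gap the article describes.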

The playbook: what to bet on

If the moat is migrating from models to harnesses, the builder’s playbook is straightforward:

Use switchers early. Even if you’re all-in on Claude Code today, install CC Switch. The cost is zero and the option value is enormous. When pricing changes or a competitor ships a better feature, you want the ability to move in minutes, not days.

Sandbox by default. If you’re running more than one agent on the same codebase — which Dynamic Workflows and ultracode now make trivially easy — use Sandcastle or equivalent isolation. The alternative is debugging merge conflicts created by 85 agents that all touched the same files.
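Before merging a batch of agent branches, it helps to know which files more than one agent touched. The sketch below assumes you have already collected each agent’s changed paths, for example from `git diff --name-only` against the base branch. The function name and input shape are illustrative.

```python
from collections import defaultdict


def contested_files(changes: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each file touched by more than one agent to the agents that touched it."""
    owners: dict[str, list[str]] = defaultdict(list)
    for agent, files in sorted(changes.items()):   # sorted for stable output
        for path in files:
            owners[path].append(agent)
    return {path: agents for path, agents in owners.items() if len(agents) > 1}


if __name__ == "__main__":
    changes = {
        "agent-a": {"src/api.py", "src/models.py"},
        "agent-b": {"src/models.py"},
        "agent-c": {"docs/README.md"},
    }
    print(contested_files(changes))
```

An empty result does not guarantee a clean merge, since agents can break each other semantically without editing the same file, but a non-empty one tells you where to look first.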

Keep orchestration in code. The most dangerous lock-in is the kind you don’t see coming. If your agent orchestration lives in a vendor’s proprietary system, you’re locked in the moment the system becomes load-bearing. Keep your workflows in TypeScript (Sandcastle), in shell scripts, or in CLAUDE.md skills that can be ported. The multica pattern is instructive here: orchestration that lives in your repo is orchestration you own.
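In code, “orchestration you own” usually means the workflow depends on a small interface and each vendor sits behind an adapter. A minimal sketch, with a hypothetical `CodingAgent` protocol and a stand-in `EchoAgent` where a real adapter would shell out to an agent CLI:

```python
from typing import Protocol


class CodingAgent(Protocol):
    """The only surface the workflow is allowed to depend on (hypothetical)."""
    name: str

    def run(self, task: str, workdir: str) -> str: ...


class EchoAgent:
    """Stand-in adapter; a real one would invoke a specific agent's CLI or API."""

    def __init__(self, name: str) -> None:
        self.name = name

    def run(self, task: str, workdir: str) -> str:
        return f"[{self.name}] {task} in {workdir}"


def run_workflow(agent: CodingAgent, steps: list[str], workdir: str) -> list[str]:
    """The workflow lives in your repo; the agent is an argument."""
    return [agent.run(step, workdir) for step in steps]


if __name__ == "__main__":
    steps = ["write failing test", "make it pass", "refactor"]
    for agent in (EchoAgent("agent-a"), EchoAgent("agent-b")):
        print(run_workflow(agent, steps, "/workspace"))
```

Swapping vendors then means writing one new adapter, not rewriting the pipeline.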

Watch the standards. MCP is now implemented on over 10,000 enterprise servers, and its SDKs have been downloaded 97 million times. It’s becoming the universal agent-to-tool interface. Bet on tools that speak MCP natively and avoid tools that invent proprietary alternatives.
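Because MCP is JSON-RPC 2.0 underneath, you can check portability before you depend on it. The sketch below builds a `tools/list` request and compares the tool names in a response against the tools a workflow needs. Transport (stdio or HTTP) and session initialization are left out, and the helper names and sample tool names are hypothetical.

```python
import json


def tools_list_request(request_id: int) -> str:
    """Serialize an MCP tools/list request as a JSON-RPC 2.0 message."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


def missing_tools(response_json: str, required: set[str]) -> set[str]:
    """Return the required tool names the server did not advertise."""
    result = json.loads(response_json).get("result", {})
    available = {tool["name"] for tool in result.get("tools", [])}
    return required - available


if __name__ == "__main__":
    # Hand-written sample response; a real one comes back from the MCP server.
    response = json.dumps({
        "jsonrpc": "2.0",
        "id": 1,
        "result": {"tools": [{"name": "read_file"}, {"name": "run_tests"}]},
    })
    print(tools_list_request(1))
    print(missing_tools(response, {"read_file", "open_pull_request"}))
```

Running a check like this at startup turns a silent failure into an explicit one: the workflow refuses to start instead of quietly skipping a tool.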

💡 2025 was agents. 2026 is agent harnesses. The competitive advantage comes from infrastructure, not intelligence. The question isn’t “which model is best?” anymore — it’s “which orchestration layer gives me the most portability?”

What’s next

The harness wars are accelerating because the economics demand it. When one Uber engineer can burn through an entire year’s AI budget in four months — as Microsoft’s data revealed — the ability to switch providers on cost alone justifies the entire switcher layer. When ultracode can spawn 85 parallel agents from a single prompt, the isolation and multiplexing layers become structural requirements.

The next frontier is likely harness interoperability — a standard format for agent configurations, memory, and orchestration patterns that works across all providers. CC Switch’s cross-app synchronization is a first step. Sandcastle’s provider-agnostic sandbox is another. But we’re still in the “each tool invents its own format” phase.

Garry Tan’s observation remains the sharpest framing: someone else’s harness shouldn’t lock you in. The tools shipping this week — the switchers, the sandboxers, the multiplexers, the meta-skills — all exist because the market agrees. The question is whether the infrastructure will consolidate around open standards or fragment into competing ecosystems.

If history is any guide, the answer is both — simultaneously and messily. Build accordingly.