AgentConn

context-mode


About context-mode

context-mode (mksglu/context-mode) is a CLI proxy and harness extension that sandboxes the output of agent tool calls — file reads, shell commands, search results — and re-presents only the relevant slice to the LLM, dropping the raw full output from the conversation context. Empirically, this cuts agent token spend 60-98% on multi-step coding loops without changing the agent's decision quality. context-mode trended on GitHub the same week as rtk-ai/rtk, which targets the same problem with similar numbers; the two projects are converging on what looks like a new standard category — context optimization as a separate runtime concern, distinct from the model and the harness. For any team paying for agent-loop tokens at scale, context-mode is the cleanest immediate cost-cutting lever short of swapping the model entirely.

Key Features

  • 60-98% context-spend reduction on multi-step agent loops
  • Tool-output sandboxing — full output stays out of LLM context, only relevant slice surfaces
  • CLI proxy mode works with Claude Code, Cursor, Codex out of the box
  • Compatible with DeepClaude / Anthropic-compatible endpoints — pairs with cost-cutting model swaps
  • Decision quality unchanged in author's benchmarks
  • Open source, configurable redaction policies
  • Lightweight — proxy adds <50ms per tool call
  • Same-week trending peer of rtk-ai/rtk on the same thesis

Overview

context-mode targets the dirty secret of long agent loops: most of the context budget is spent re-presenting the output of tool calls — file reads that produced 800 lines, shell commands that spilled stack traces, search results with 40 hits when the agent only acted on one. context-mode sandboxes that output, surfaces only the relevant slice to the LLM, and lets the rest stay out of context.

The empirical claim is 60-98% context-spend reduction on real coding workflows, with no measurable degradation in agent decision quality. That’s a large enough delta to be worth a serious test on any team paying agent-loop bills at scale.

Why It Matters

By Q2 2026, context optimization has emerged as a distinct runtime category, separate from the model layer and the harness layer. context-mode and rtk-ai/rtk trended in the same week with overlapping pitches, which is the GitHub-trending pattern that usually precedes a category consolidation.

The economic case stacks with the DeepClaude shim: if DeepClaude cuts per-token cost ~17× and context-mode cuts the token count 60-98%, the combined effect is roughly a 40-850× reduction in agent-loop spend versus stock Claude Code on the Anthropic API. That’s a real magnitude shift, not a marginal optimization.
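The stacked-savings arithmetic can be checked directly. This is illustrative arithmetic only, using the figures cited above (~17× per-token cost cut, 60-98% token reduction); the function name is ours, not part of either tool:

```python
# Combined savings from stacking a cheaper endpoint with context reduction.
# Price and reduction figures are the article's claims, not measurements.

def combined_cost_factor(price_cut: float, token_reduction: float) -> float:
    """Overall spend reduction: price cut divided by fraction of tokens kept."""
    return price_cut / (1.0 - token_reduction)

low = combined_cost_factor(17.0, 0.60)   # ~42.5x at the low end
high = combined_cost_factor(17.0, 0.98)  # ~850x at the high end
print(f"{low:.1f}x to {high:.0f}x")
```

The two factors multiply because they act on independent dimensions of the bill: price per token and number of tokens.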

How It Works

  • Tool-call interception. context-mode proxies the agent harness’s tool-call layer. When the agent invokes read_file, bash, grep, or similar, the full output goes to context-mode’s sandbox.
  • Selective surfacing. A second-pass prompt or rule-based selector identifies the relevant slice (lines containing the diff target, the error from the stack trace, the matching grep hits). Only that slice gets returned to the agent’s conversation.
  • Audit trail. The full output is logged to disk under a content-addressable hash. The agent can request the full version on demand if the selective pass missed something.
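The three steps above can be sketched in a few lines. This is a minimal illustration of the pattern, not context-mode's actual API: the names (`sandbox`, `recall`, `SANDBOX_DIR`), the regex-based selector, and the on-disk layout are all assumptions of this sketch.

```python
# Sketch of tool-output sandboxing: store full output content-addressed,
# surface only the relevant slice. Names and selector logic are illustrative.
import hashlib
import pathlib
import re

SANDBOX_DIR = pathlib.Path("/tmp/tool-sandbox")  # assumed location

def sandbox(full_output: str, query: str, window: int = 2) -> tuple[str, str]:
    """Log full output under its SHA-256 hash; return (relevant_slice, digest)
    so the agent can fetch the full version later if needed."""
    digest = hashlib.sha256(full_output.encode()).hexdigest()
    SANDBOX_DIR.mkdir(parents=True, exist_ok=True)
    (SANDBOX_DIR / digest).write_text(full_output)

    # Rule-based selector: keep lines matching the query, plus context lines.
    lines = full_output.splitlines()
    keep: set[int] = set()
    for i, line in enumerate(lines):
        if re.search(query, line):
            keep.update(range(max(0, i - window), min(len(lines), i + window + 1)))
    relevant = "\n".join(lines[i] for i in sorted(keep))
    return relevant, digest

def recall(digest: str) -> str:
    """Audit-trail escape hatch: retrieve the full output by hash."""
    return (SANDBOX_DIR / digest).read_text()
```

In a real deployment the selector would likely be a second-pass LLM call rather than a regex, but the contract is the same: the conversation sees the slice, the disk keeps the whole.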

Use Cases

  • Heavy agent loops on Anthropic API where token spend dominates the bill.
  • Multi-file refactors that read tens of files; usually only a handful of lines per file matter.
  • Shell-driven workflows where command output is verbose (build logs, test output, package-manager noise).
  • Cost-sensitive teams running on DeepClaude or other discount endpoints, stacking the savings.

Pairs With

  • DeepClaude — multiplicative cost reduction.
  • rtk — competing project on the same thesis; benchmark both.
  • jcode, Claude Code, Cursor, Codex — the harnesses context-mode plugs into.

Verdict

If you’re spending more than $200/month on Claude API tokens for agent loops, run context-mode for a week and measure. The author’s claim is large enough that even a fractional improvement justifies the install cost; the rtk peer-project’s overlapping numbers suggest the underlying technique is real. Expect this category to consolidate into one or two winners by Q3 2026 — context-mode is one of the two front-runners.
