context-mode (mksglu/context-mode) is a CLI proxy and harness extension that sandboxes the output of agent tool calls — file reads, shell commands, search results — and re-presents only the relevant slice to the LLM, dropping the raw full output from the conversation context. Empirically, this cuts agent token spend 60-98% on multi-step coding loops without changing the agent's decision quality. context-mode trended on GitHub the same week as rtk-ai/rtk, which targets the same problem with similar numbers; the two projects are converging on what looks like a new standard category — context optimization as a separate runtime concern, distinct from the model and the harness. For any team paying for agent-loop tokens at scale, context-mode is the cleanest immediate cost-cutting lever short of swapping the model entirely.
context-mode targets the dirty secret of long agent loops: most of the context budget is spent re-presenting the output of tool calls — file reads that produced 800 lines, shell commands that spilled stack traces, search results with 40 hits when the agent only acted on one. context-mode sandboxes that output, surfaces only the relevant slice to the LLM, and lets the rest stay out of context.
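The core idea is easy to sketch. A minimal illustration, in Python, of what "surface only the relevant slice" means in practice: filter a tool call's raw output down to the lines that match the agent's current query plus a little surrounding context. All names here are hypothetical illustrations of the technique, not context-mode's actual API.

```python
def slice_tool_output(raw_output: str, query: str, window: int = 2) -> str:
    """Keep only lines containing `query`, plus `window` lines of context.

    Hypothetical sketch of a tool-output sandbox filter; the real project
    presumably uses far more sophisticated relevance scoring.
    """
    lines = raw_output.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if query in line:
            for j in range(max(0, i - window), min(len(lines), i + window + 1)):
                keep.add(j)
    if not keep:
        # Even a "no match" summary is cheaper than the raw dump.
        return f"[{len(lines)} lines, no match for {query!r}]"
    return "\n".join(lines[i] for i in sorted(keep))

# Example: a 1000-line file read where the agent only cares about one region.
raw = "\n".join(f"line {n}" for n in range(1000))
raw = raw.replace("line 500", "line 500  # TODO: fix auth bug")
sliced = slice_tool_output(raw, "TODO")
print(len(raw.splitlines()), "->", len(sliced.splitlines()))  # 1000 -> 5
```

The 1000-to-5 ratio in this toy case is the shape of the claimed savings: the raw output still exists in the sandbox if the agent asks for more, but it never enters the conversation context by default.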
The empirical claim is 60-98% context-spend reduction on real coding workflows, with no measurable degradation in agent decision quality. That’s a large enough delta to be worth a serious test on any team paying agent-loop bills at scale.
As of Q2 2026, context optimization has emerged as a distinct runtime category, separate from both the model layer and the harness layer. context-mode and rtk-ai/rtk trended in the same week with overlapping pitches, which is the GitHub-trending pattern that usually precedes a category consolidation.
The economic case stacks with the DeepClaude shim: if DeepClaude cuts per-token cost ~17× and context-mode cuts token count 60-98%, the multipliers compound to roughly a 40-850× reduction in agent-loop spend versus stock Claude Code on the Anthropic API (a 60% token cut is 2.5× fewer tokens, a 98% cut is 50×, each multiplied by the 17× cost factor). That’s a real magnitude shift, not a marginal optimization.
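The stacking is plain multiplication, worth checking back-of-envelope. The 17× per-token factor and the 60-98% token reduction are the figures quoted above; nothing else here is measured.

```python
def combined_reduction(cost_factor: float, token_cut: float) -> float:
    """Overall spend reduction when a per-token cost factor stacks with a
    fractional token-count cut: cost_factor / (fraction of tokens remaining)."""
    return cost_factor / (1.0 - token_cut)

low = combined_reduction(17, 0.60)   # 60% fewer tokens: ~42.5x overall
high = combined_reduction(17, 0.98)  # 98% fewer tokens: ~850x overall
print(f"{low:.1f}x - {high:.0f}x")   # 42.5x - 850x
```

So the honest range is roughly 40-850×, which is still a magnitude shift even at the low end.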
When the agent calls read_file, bash, grep, or a similar tool, the full output goes to context-mode’s sandbox. If you’re spending more than $200/month on Claude API tokens for agent loops, run context-mode for a week and measure. The author’s claim is large enough that even a fractional improvement justifies the install cost, and the rtk peer-project’s overlapping numbers suggest the underlying technique is real. Expect this category to consolidate into one or two winners by Q3 2026; context-mode is one of the two front-runners.
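"Measure" can be as simple as comparing total billed tokens over comparable week-long workloads with and without the proxy. The token counts below are placeholders, not data from the project.

```python
def pct_reduction(before_tokens: int, after_tokens: int) -> float:
    """Percent fewer tokens billed after introducing the proxy."""
    return 100.0 * (before_tokens - after_tokens) / before_tokens

# Hypothetical week-over-week totals from your API usage dashboard.
baseline_week = 180_000_000  # tokens billed without context-mode
proxied_week = 22_000_000    # tokens billed behind the proxy
print(f"{pct_reduction(baseline_week, proxied_week):.1f}% fewer tokens")
```

If your measured number lands anywhere inside the claimed 60-98% band, the install has paid for itself.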