agentmemory (by Rohit G, @rohitg00) is a persistent memory layer for AI coding agents that frames itself, deliberately, around academic benchmarks rather than around feature lists. The README claims ‘#1 persistent memory for AI coding agents based on real-world benchmarks’ — and unlike most skill-pack READMEs, the benchmark table is actually in the repository, derived from LongMemEval (an ICLR 2025 long-term-memory evaluation suite).
The project sits at the leading edge of what we’re calling the validator wave: the early-May 2026 shift from “ship more skills” to “ship measurable skills.” It hit 3,882 stars at +754/day GitHub trending velocity on 2026-05-10. The relevant comparison is not other memory libraries — it’s the validator-class projects emerging in the same week (react-doctor, claude-doctor) that share the same posture: claim + benchmark + integration.
agentmemory replaces the practical 200-line capacity ceiling of CLAUDE.md and .cursorrules with a persistent memory store that:

- captures every tool use automatically, via 12 auto-capture hooks
- injects relevant context at session start
- requires zero manual memory.add() calls in user code
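The repo's actual hook surface isn't reproduced here, so the following is a minimal sketch of the capture-and-inject pattern those bullets describe, assuming a SQLite-backed store. `MemoryStore`, `capture`, and `session_context` are illustrative names, not agentmemory's API.

```python
import json
import sqlite3
import time
from pathlib import Path

# Hypothetical default location; agentmemory's real layout may differ.
DB_PATH = Path.home() / ".agentmemory" / "memory.db"


class MemoryStore:
    """Illustrative persistent store. Class and method names are
    assumptions, not agentmemory's actual API."""

    def __init__(self, db_path: Path = DB_PATH) -> None:
        db_path.parent.mkdir(parents=True, exist_ok=True)
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS events (ts REAL, tool TEXT, payload TEXT)"
        )

    def capture(self, tool: str, payload: dict) -> None:
        # Invoked by a post-tool-use hook after every tool call,
        # so user code never calls memory.add() by hand.
        self.conn.execute(
            "INSERT INTO events VALUES (?, ?, ?)",
            (time.time(), tool, json.dumps(payload)),
        )
        self.conn.commit()

    def session_context(self, k: int = 10) -> list[str]:
        # Invoked by a session-start hook to inject prior context.
        # Recency stands in here for the BM25 + vector hybrid
        # retrieval discussed with the benchmark table below.
        rows = self.conn.execute(
            "SELECT payload FROM events ORDER BY ts DESC LIMIT ?", (k,)
        ).fetchall()
        return [payload for (payload,) in rows]
```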
The benchmark suite is the most useful artifact in the repo for evaluating agentmemory against alternatives:
| Retrieval strategy | LongMemEval-S accuracy |
|---|---|
| BM25 alone | 86.2% |
| BM25 + Vector hybrid (default) | 95.2% |
| Pure vector | 96.6% |
The 1.4pp gap between hybrid and pure-vector is the operator-grade detail — hybrid runs at meaningfully lower token cost, and the developer can decide which trade-off matters for their workload.
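The material above doesn't specify agentmemory's fusion formula, so here is one common way to build a BM25 + vector hybrid: score with each strategy independently, then merge the two rankings with reciprocal rank fusion (RRF). The bag-of-words cosine below is a stand-in for real dense embeddings, and `hybrid_rrf` is an illustrative name, not the project's.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1=1.5, b=0.75) -> list[float]:
    """Classic BM25 over whitespace tokens."""
    toks = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in toks) / len(toks)
    n = len(docs)
    scores = []
    for t in toks:
        tf = Counter(t)
        s = 0.0
        for w in set(query.lower().split()):
            df = sum(1 for d in toks if w in d)
            if df == 0:
                continue
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            f = tf[w]
            s += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(t) / avgdl))
        scores.append(s)
    return scores


def cosine_scores(query: str, docs: list[str]) -> list[float]:
    """Stand-in for the vector side: bag-of-words cosine.
    agentmemory would use dense model embeddings here."""
    def vec(text: str) -> Counter:
        return Counter(text.lower().split())

    q = vec(query)
    q_norm = math.sqrt(sum(c * c for c in q.values()))
    out = []
    for d in docs:
        v = vec(d)
        dot = sum(q[w] * v[w] for w in q)
        norm = q_norm * math.sqrt(sum(c * c for c in v.values()))
        out.append(dot / norm if norm else 0.0)
    return out


def hybrid_rrf(query: str, docs: list[str], k: int = 60) -> list[int]:
    """Reciprocal rank fusion of the BM25 and vector rankings.
    One common hybrid; the repo may weight the two sides differently."""
    fused = Counter()
    for scores in (bm25_scores(query, docs), cosine_scores(query, docs)):
        ranking = sorted(range(len(docs)), key=lambda i: -scores[i])
        for rank, i in enumerate(ranking):
            fused[i] += 1.0 / (k + rank + 1)
    return sorted(range(len(docs)), key=lambda i: -fused[i])
```

Swapping cosine_scores for a real embedding model is where a hybrid recovers most of pure vector's accuracy while keeping BM25's cheap lexical recall, which is exactly the trade-off the table quantifies.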
The benchmark-in-the-README posture is the new floor for skill primitives. Until early May 2026, “we have memory” was a sufficient claim. The agentmemory contract — claim + benchmark + integration folder — is what we expect the next wave of skill primitives to converge on, because once a benchmark exists in one corner of the ecosystem, the next-most-rational user demands one for the rest.
agentmemory ships dedicated integration folders for Claude Code, OpenClaw, and several other harnesses inside /integrations/. The cross-harness posture matters strategically: a memory layer that’s only on one harness is structurally narrower than a memory layer that travels.
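The contents of /integrations/ aren't reproduced here, but Claude Code's hook mechanism (shell commands registered in .claude/settings.json that receive a JSON event on stdin) suggests the rough shape of one: a PostToolUse hook script feeding the store sketched earlier. The field names and the `memory_store` import are assumptions.

```python
#!/usr/bin/env python3
# Hypothetical PostToolUse hook: Claude Code pipes a JSON event to the
# hook command's stdin; this script persists it to the memory store.
import json
import sys

from memory_store import MemoryStore  # the illustrative store sketched above


def main() -> None:
    event = json.load(sys.stdin)
    MemoryStore().capture(
        # "tool_name" / "tool_input" follow Claude Code's hook payloads;
        # agentmemory's real integration may read different fields.
        tool=event.get("tool_name", "unknown"),
        payload=event.get("tool_input", {}),
    )


if __name__ == "__main__":
    main()
```

Each per-harness folder presumably adapts the same capture call to that harness's own event surface, which is what lets the memory layer travel.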