Claude Context (zilliztech/claude-context) is a Model Context Protocol (MCP) server that targets a common Claude Code cost problem: loading large parts of the file tree into context on every request. Instead, Claude Context indexes your codebase into Zilliz Cloud's vector database and uses hybrid search (BM25 + dense vector embeddings) to retrieve only the code relevant to each request. The reported result is roughly 40% fewer tokens with equivalent or better retrieval quality. It ships as three packages: claude-context-core (indexing engine), a VSCode extension (semantic code search), and claude-context-mcp (MCP server for Claude Code integration). It works with codebases of any size, from small projects to multi-million-line enterprise repos.
Claude Context is Zilliz’s answer to the “full codebase in context” problem that drives up Claude Code costs at scale. The pattern is familiar: every Claude Code request loads a broad set of files as context, even when the task touches only a small subset of the code. Context is billed as input tokens, and at $15 per million input tokens for Opus, those wasted tokens add up.
Claude Context indexes your codebase into a vector database and exposes that index through a Model Context Protocol server. Each request queries the index with hybrid search and loads only the code relevant to the task. Zilliz, the company behind Milvus (a widely deployed open-source vector database), brings production-grade vector infrastructure to the problem.
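To make "hybrid search" concrete, here is a minimal sketch of one common way to merge a keyword ranking and a vector ranking: reciprocal rank fusion (RRF). The source does not say how Claude Context combines BM25 and dense results, so the fusion method, the constant k=60, and the file paths below are all illustrative assumptions rather than the project's actual implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into a single ranking.

    Each item scores 1 / (k + rank) per list it appears in, so items
    ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical results for the query "authentication middleware".
bm25_hits = ["auth/middleware.ts", "auth/session.ts", "routes/login.ts"]
dense_hits = ["auth/middleware.ts", "auth/jwt.ts", "auth/session.ts"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

The file found by both retrievers at rank 1 comes first, followed by the file that both retrievers returned at lower ranks, then the files that only one retriever found.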
Indexing engine (claude-context-core): Parses your codebase, generates embeddings for code chunks, and stores them in Zilliz Cloud with BM25 keyword indices alongside dense vector indices. Indexing happens once; updates are incremental.
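Incremental updates can be pictured as a diff between two snapshots of the repository. The sketch below assumes a simple content-hash comparison; the source does not describe the change-detection mechanism, so plan_reindex and file_digest are hypothetical helpers, not claude-context-core APIs.

```python
import hashlib


def file_digest(content: bytes) -> str:
    """Return a stable fingerprint of a file's content."""
    return hashlib.sha256(content).hexdigest()


def plan_reindex(previous: dict, current_files: dict):
    """Compare stored digests with current file contents.

    previous:      {path: digest} saved after the last indexing run
    current_files: {path: bytes} as the repository looks now
    Returns the new snapshot plus the paths that need work.
    """
    current = {path: file_digest(data) for path, data in current_files.items()}
    plan = {
        "added": sorted(p for p in current if p not in previous),
        "changed": sorted(
            p for p in current if p in previous and previous[p] != current[p]
        ),
        "removed": sorted(p for p in previous if p not in current),
    }
    return current, plan


# First run: everything is new. Second run: only the edited file is re-embedded.
snapshot, first_plan = plan_reindex({}, {"a.py": b"print(1)", "b.py": b"print(2)"})
snapshot, second_plan = plan_reindex(
    snapshot, {"a.py": b"print(1)", "b.py": b"print(3)", "c.py": b"pass"}
)
print(second_plan)
```

Only the paths under "added" and "changed" would need new embeddings, and "removed" paths would be deleted from the index, which is what keeps repeat indexing cheap.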
MCP server (claude-context-mcp): Exposes a search_code tool that Claude Code calls automatically when it needs codebase context. Claude asks “find the authentication middleware” and the MCP server returns the three most relevant files instead of the entire auth directory.
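For orientation, this is roughly what a tool invocation looks like on the wire. MCP uses JSON-RPC 2.0, and tools are invoked with the tools/call method. The tool name search_code comes from the description above; the argument names query and limit are assumptions for illustration and may not match the server's actual schema.

```python
import json


def build_search_request(request_id, query, limit=3):
    """Build a JSON-RPC 2.0 'tools/call' message for the search_code tool.

    The 'query' and 'limit' argument names are hypothetical; check the
    tool's published input schema for the real parameter names.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "search_code",
            "arguments": {"query": query, "limit": limit},
        },
    }


request = build_search_request(1, "find the authentication middleware")
wire = json.dumps(request)  # what the client would send to the server
print(wire)
```

In practice Claude Code builds and sends this message itself; the sketch only shows the shape of the exchange.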
VSCode extension: Provides semantic code search in the editor — useful for navigation and review independently of Claude Code integration.
The ~40% reduction comes from replacing broad directory loads with targeted retrieval. For example, a request that previously loaded 50 files to find the relevant 5 now loads those 5 files directly. Retrieval quality is maintained or improved because hybrid search (keyword + semantic) is more precise than file-tree heuristics.
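The 50-files-versus-5 illustration can be turned into rough numbers. The sketch below assumes an average of 800 tokens per file (a made-up figure) and a price of $15 per million tokens; both are placeholders to swap for your own measurements. Savings for this single request come out well above the ~40% headline, which presumably reflects an average over many requests, not all of which involve a broad load.

```python
def context_cost(tokens, usd_per_million):
    """Cost in USD of sending `tokens` tokens at the given price."""
    return tokens * usd_per_million / 1_000_000


TOKENS_PER_FILE = 800   # hypothetical average, measure your own repo
PRICE_PER_MILLION = 15.0  # assumed price in USD per million tokens

broad_tokens = 50 * TOKENS_PER_FILE     # whole directory loaded
targeted_tokens = 5 * TOKENS_PER_FILE   # only the relevant files
saved_tokens = broad_tokens - targeted_tokens
reduction = saved_tokens / broad_tokens

print(f"tokens saved per request: {saved_tokens}")
print(f"reduction for this request: {reduction:.0%}")
print(f"USD saved per request: {context_cost(saved_tokens, PRICE_PER_MILLION):.2f}")
```

Multiplying the per-request saving by your daily request volume gives a quick estimate of whether the setup overhead is worth it.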
# Install the MCP server
npm install -g @zilliz/claude-context-mcp
# Index your codebase (requires Zilliz Cloud account)
npx claude-context-core index --project-path ./
# Add to Claude Code's MCP config
# (for a project-scoped server, a .mcp.json file in the repository root):
# { "mcpServers": { "claude-context": { "command": "claude-context-mcp" } } }
Claude Context is most valuable for teams with large codebases (100K+ lines) where Claude Code is already in the daily workflow and token costs are measurable. Smaller projects may not see significant savings relative to the setup overhead. Requires a Zilliz Cloud account for the vector database backend.