VectifyAI/PageIndex is the highest-profile vectorless RAG implementation as of May 2026, with 29.7K GitHub stars and +953 in a single day on May 7. The pitch is structural: no embeddings, no chunking, no vector DB. Instead, an LLM reads the document end-to-end and emits a table-of-contents-style tree. At query time, the LLM walks that tree, expanding promising branches and ignoring irrelevant ones, and returns the specific section(s) that answer the question. The headline benchmark is 98.7% accuracy on FinanceBench, territory where vector RAG typically lands at 70-85%. PageIndex's strongest use cases are long structured documents (financial filings, legal contracts, regulatory filings, policy docs), where "similar but wrong" is the dominant failure mode of embedding retrieval. For multi-document or noisy corpora, vector RAG still scales better. PageIndex ships as an open-source library, a hosted cloud service at pageindex.ai, and an MCP server (pageindex-mcp) that slots into any agent harness (Claude Code, Cursor SDK, openclaw) as a tool. MIT licensed.
PageIndex represents the cleanest implementation of the vectorless RAG pattern that emerged in 2025-2026 as agent operators hit retrieval-quality walls with embedding-based pipelines. Instead of chunking documents into ~500-token windows, embedding each chunk, and running cosine similarity at query time, PageIndex performs two LLM-mediated steps: (1) at indexing time, an LLM reads the document end-to-end and emits a table-of-contents-style tree of sections; (2) at query time, an LLM walks that tree, expanding promising branches and pruning irrelevant ones until it reaches the section(s) that answer the question.
The retrieval is reasoning, not similarity. That distinction is load-bearing for the workloads where PageIndex wins — long structured documents where similar text appears in many places (older versions of the same policy, deprecated APIs, discussion of the feature without the spec) and “similar but wrong” is the dominant failure mode of vector retrieval.
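The tree walk can be sketched in a few lines. Everything below is illustrative, not PageIndex's actual internals: the `Node` structure, the `llm_pick_branches` stub (keyword matching standing in for a real LLM relevance call), and the traversal loop are all assumptions about how a reasoning-based retriever of this shape works.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    # One entry in the table-of-contents tree: a section title,
    # a short summary, and child sections.
    title: str
    summary: str
    children: list["Node"] = field(default_factory=list)

def llm_pick_branches(query: str, nodes: list[Node]) -> list[Node]:
    # Stub for the LLM call: given the query and candidate sections,
    # return the branches worth expanding. A real system would prompt
    # a model with the titles/summaries; here we keyword-match.
    return [n for n in nodes
            if any(w in (n.title + " " + n.summary).lower()
                   for w in query.lower().split())]

def tree_search(query: str, root: Node) -> list[Node]:
    # Expand promising branches, ignore the rest. A node whose subtree
    # yields no further expansion is returned as a retrieved section.
    hits, frontier = [], [root]
    while frontier:
        node = frontier.pop()
        picked = llm_pick_branches(query, node.children)
        if not picked and node is not root:
            hits.append(node)  # nothing deeper to expand: this is a hit
        frontier.extend(picked)
    return hits
```

The point the sketch makes concrete: relevance decisions happen per branch, with the section's title and summary in context, rather than once per chunk in a frozen embedding space.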
For the broader decision framework on when to pick PageIndex vs an embedding-based stack, see Vectorless RAG: PageIndex vs Embedding RAG Decision Guide.
The vectorless RAG framing is one of three convergent signals on agent retrieval this quarter: a Microsoft Tech Community post arguing for reasoning-based retrieval, a DigitalOcean "Beyond Vector Databases" tutorial, and LlamaIndex's own "RAG is dead, long live agentic retrieval" essay. PageIndex sits at the open-source center of that conversation.
The cost economics also reinforce it: tree search costs more LLM calls per query, but with DeepSeek V4 Flash at $0.14/M input tokens, the marginal cost of a deeper, more correct retrieval is essentially zero. The cheaper inference gets, the better vectorless RAG looks.
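A back-of-envelope check on that claim. The per-query call count and per-call token sizes below are assumptions for illustration; only the $0.14/M input rate comes from the text:

```python
# Assumed shape of one tree-search retrieval: a handful of LLM calls,
# each reading a slice of the TOC tree plus node summaries.
calls_per_query = 8            # assumption: depth x branches examined
input_tokens_per_call = 4_000  # assumption: TOC slice + summaries
price_per_million = 0.14       # DeepSeek V4 Flash input rate cited above

tokens = calls_per_query * input_tokens_per_call
cost = tokens / 1_000_000 * price_per_million
print(f"{tokens} input tokens, roughly ${cost:.4f} per query")  # ~ $0.0045
```

Even if the real call count is several times higher, retrieval cost stays well under a cent per query, which is the economic argument for spending more reasoning per retrieval.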
The fastest agent integration is the pageindex-mcp server, which exposes PageIndex as a Model Context Protocol tool. Any MCP-aware harness (Claude Code, Cursor, openclaw) can call it the same way it calls any other tool — no SDK rewrite, no architecture overhaul.
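Registration follows the standard MCP server-config shape. The entry below is a sketch only: the `mcpServers` structure is the common convention, but the launch command, arguments, and env var name are assumptions; check the pageindex-mcp README for the real values.

```json
{
  "mcpServers": {
    "pageindex": {
      "command": "npx",
      "args": ["-y", "pageindex-mcp"],
      "env": { "PAGEINDEX_API_KEY": "..." }
    }
  }
}
```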
For Python-direct integration, the open-source library exposes the same index-then-query flow:

```python
from pageindex import PageIndex

# Build the table-of-contents tree once per document.
index = PageIndex.from_pdf("10K-2025.pdf")

# Query via tree search; the result carries page-level citations.
result = index.query("Item 1A risk factor coverage on cybersecurity 2024 vs 2025")
print(result.cited_pages)
```