PageIndex

Framework Agnostic · Intermediate · Memory & RAG · Open Source

VectifyAI/PageIndex is the highest-profile vectorless RAG implementation as of May 2026, with 29.7K GitHub stars and +953 in a single day on May 7. The pitch is structural: no embeddings, no chunking, no vector DB. Instead, an LLM reads the document end-to-end and emits a table-of-contents-style tree. At query time, the LLM walks that tree, expanding promising branches and ignoring irrelevant ones, returning the specific section(s) that answer the question. The headline benchmark is 98.7% accuracy on FinanceBench — territory where vector RAG typically lands at 70-85%. PageIndex's strongest use cases are long structured documents (financial filings, legal contracts, regulatory filings, policy docs), where 'similar but wrong' is the dominant failure mode of embedding retrieval. For multi-document or noisy corpora, vector RAG still scales better. Available as the open-source library, a hosted cloud service at pageindex.ai, and an MCP server (pageindex-mcp) that slots into any agent harness — Claude Code, Cursor SDK, openclaw — as a tool. MIT licensed.

Input / Output

Accepts

pdf markdown long-document

Produces

cited-section tree-index

Overview

PageIndex represents the cleanest implementation of the vectorless RAG pattern that emerged in 2025–2026 as agent operators hit retrieval-quality walls with embedding-based pipelines. Instead of chunking documents into ~500-token windows, embedding each chunk, and running cosine similarity at query time, PageIndex performs two LLM-mediated steps:

  1. Index time: the model reads the document and emits a hierarchical table-of-contents tree — sections, subsections, page ranges, optional summaries.
  2. Query time: the model walks that tree, expanding promising branches and pruning the rest, ultimately returning the specific section(s) containing the answer.

The retrieval is reasoning, not similarity. That distinction is load-bearing for the workloads where PageIndex wins — long structured documents where similar text appears in many places (older versions of the same policy, deprecated APIs, discussion of the feature without the spec) and “similar but wrong” is the dominant failure mode of vector retrieval.

For the broader decision framework on when to pick PageIndex vs an embedding-based stack, see Vectorless RAG: PageIndex vs Embedding RAG Decision Guide.

Why It Matters in 2026

The vectorless RAG framing is reinforced by three convergent signals on agent retrieval this quarter: Microsoft’s TechCommunity arguing for reasoning-based retrieval, DigitalOcean publishing a “Beyond Vector Databases” tutorial, and LlamaIndex’s own “RAG is dead, long live agentic retrieval” essay. PageIndex sits at the open-source center of that conversation.

The cost economics also reinforce it: tree search costs more LLM calls per query, but with DeepSeek V4 Flash at $0.14/M input tokens, the marginal cost of a deeper, more correct retrieval is essentially zero. The cheaper inference gets, the better vectorless RAG looks.
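A back-of-envelope calculation makes the point concrete. The per-query call count and tokens-per-call figures below are illustrative assumptions; only the $0.14/M price comes from the text above.

```python
# Rough cost of one tree-search query at $0.14 per million input tokens.
# Call count and tokens-per-call are assumed figures, not measurements.

price_per_m_tokens = 0.14          # USD per million input tokens (quoted above)
calls_per_query = 12               # assumed: one LLM call per expanded node
tokens_per_call = 2_000            # assumed: node summaries + the question

cost = calls_per_query * tokens_per_call * price_per_m_tokens / 1_000_000
print(f"${cost:.4f} per query")    # a fraction of a cent
```

Even a tree walk an order of magnitude deeper stays well under a cent per query, which is the sense in which cheaper inference makes vectorless RAG look better over time.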

When to Use

  • Long structured documents (10-K filings, regulatory filings, contracts, medical guidelines, internal policy)
  • Multi-hop retrieval where one section references another
  • Workloads where citation auditing (page numbers, section IDs) is required
  • Domain-specific Q&A where vector retrieval routinely returns the wrong-but-similar chunk

When Not to Use

  • Generic semantic search across many short documents
  • Sub-second latency budgets (tree search is sequential and slower than vector lookup)
  • Multi-document corpora with high noise — vector RAG scales better there
  • Production-critical workloads — the open-source PageIndex implementation is beta-status; for hardened production use, the hosted pageindex.ai service or on-prem option is the safer choice

Integration

The fastest agent integration is the pageindex-mcp server, which exposes PageIndex as a Model Context Protocol tool. Any MCP-aware harness (Claude Code, Cursor, openclaw) can call it the same way it calls any other tool — no SDK rewrite, no architecture overhaul.
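For illustration, a typical MCP client configuration entry might look like the following. The exact command and arguments for pageindex-mcp are assumptions here; check the project's README for the real invocation.

```json
{
  "mcpServers": {
    "pageindex": {
      "command": "npx",
      "args": ["-y", "pageindex-mcp"]
    }
  }
}
```

Once registered, the agent sees PageIndex's retrieval as an ordinary tool and decides when to call it, the same way it would a file reader or web search.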

For Python-direct integration:

from pageindex import PageIndex

# Build the tree index once per document (a single LLM pass over the PDF).
index = PageIndex.from_pdf("10K-2025.pdf")

# Tree search at query time; the result carries the cited page numbers.
result = index.query("Item 1A risk factor coverage on cybersecurity 2024 vs 2025")
print(result.cited_pages)

Tags

#rag #vectorless-rag #retrieval #reasoning #tree-search #pageindex #vectifyai #long-documents #financebench
