Most RAG pipelines look the same: chunk the document, embed each chunk, store the vectors, then at query time embed the question and retrieve the top-k most cosine-similar chunks. PageIndex is doing something different. Instead of vectors, it builds a hierarchical tree index that mirrors the document's table of contents — each node has a title, a summary, and a pointer to the full section text. Retrieval is not a similarity search. It's the LLM reading the tree and reasoning about which nodes to descend into, then fetching the full text only from the matched ones.
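To make the contrast concrete, here is a minimal sketch of that tree index and retrieval loop in Python. Everything below is illustrative: `Node`, `render_tree`, and `retrieve` are my names for the pattern, not PageIndex's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Node:
    """One table-of-contents entry: title and summary live in the index;
    full text is fetched only after the node is selected."""
    title: str
    summary: str
    text: str
    children: list["Node"] = field(default_factory=list)

def render_tree(node: Node, depth: int = 0) -> str:
    """Flatten titles and summaries into the compact outline the LLM reads."""
    lines = [f"{'  ' * depth}- {node.title}: {node.summary}"]
    for child in node.children:
        lines.append(render_tree(child, depth + 1))
    return "\n".join(lines)

def retrieve(root: Node, question: str,
             select_nodes: Callable[[str, str], list[str]]) -> list[str]:
    """Reasoning-based retrieval: no embeddings, no cosine similarity.
    The LLM names the sections to open; we return their full text."""
    outline = render_tree(root)
    chosen = set(select_nodes(outline, question))  # one LLM call per query
    picked: list[str] = []
    def walk(node: Node) -> None:
        if node.title in chosen:
            picked.append(node.text)
        for child in node.children:
            walk(child)
    walk(root)
    return picked
```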
The architectural bet is that "similarity is a weak proxy for relevance" once your queries get complex. Ask the original Transformer paper "why did the authors choose self-attention over recurrence, and what are the complexity trade-offs," and vector search has to find chunks whose embeddings look like the question, which can miss the cross-section reasoning entirely. PageIndex's worked example does exactly this: the LLM reads the tree, identifies that the motivation section and the complexity analysis section are both relevant, fetches both, and grounds the answer in the right places. That kind of multi-section synthesis is where vector RAG quietly fails and the user just gets a confidently wrong answer.
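The selection step is a single prompt over the outline. Here is a guess at the shape of that call, assuming only a generic `llm` callable that takes a prompt string and returns the model's text reply; none of this is PageIndex's interface.

```python
import json
from typing import Callable

def make_select_nodes(llm: Callable[[str], str]):
    """Builds the `select_nodes` function used by `retrieve` above."""
    def select_nodes(outline: str, question: str) -> list[str]:
        prompt = (
            "You are choosing which sections of a document to read "
            "in order to answer a question.\n\n"
            f"Outline (title: summary):\n{outline}\n\n"
            f"Question: {question}\n\n"
            "Questions may span sections: pick every section needed, e.g. "
            "both a motivation section and a complexity-analysis section. "
            "Reply with only a JSON list of section titles."
        )
        return json.loads(llm(prompt))
    return select_nodes

# Usage with the sketch above:
#   select = make_select_nodes(llm=my_chat_fn)
#   sections = retrieve(root, question, select)
```

Note the explicit instruction to pick multiple sections: that is the whole point of the approach, since a single top-k similarity cutoff has no notion of "this question needs two different parts of the document."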
The honest trade-off is cost and structure. Every retrieval round requires the LLM to read a tree summary and reason about which nodes to open: that's an LLM call per query, on top of the generation call. Vector search is essentially free per query once the index is built; PageIndex turns retrieval itself into a model invocation. If your documents are flat PDFs scraped from the web with no clear hierarchy, the tree won't help. If your queries are simple lookups ("what's the warranty period"), the reasoning overhead is wasted. The PageIndex article doesn't show benchmark numbers against vector baselines on FinanceBench, doesn't discuss latency or how the index handles document updates, and doesn't compare cost per query. Those are questions any serious evaluation needs to answer.
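For a feel of the overhead, a back-of-envelope estimate. Every constant here is an invented assumption, not a measurement; swap in your own outline size and model prices.

```python
# All numbers are assumptions: plug in your own.
OUTLINE_TOKENS = 2_000        # tree outline the model re-reads on every query
REASONING_TOKENS = 200        # tokens of selection output per query
PRICE_PER_INPUT_TOK = 3 / 1_000_000    # $/token, illustrative mid-tier pricing
PRICE_PER_OUTPUT_TOK = 15 / 1_000_000  # $/token, illustrative

overhead = (OUTLINE_TOKENS * PRICE_PER_INPUT_TOK
            + REASONING_TOKENS * PRICE_PER_OUTPUT_TOK)
print(f"extra retrieval cost per query: ${overhead:.4f}")  # ~$0.0090 here
# A cosine lookup over a prebuilt index costs effectively nothing per query,
# so this overhead is the standing price of reasoning-based retrieval.
```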
For builders working on long structured documents — financial filings, legal contracts, research papers, technical manuals — PageIndex is worth a real test against your existing vector setup. The interpretability alone is valuable: when retrieval fails, you can see the reasoning chain that picked the wrong nodes, instead of staring at opaque cosine scores. For everyone else doing chatbot-over-help-docs work, the existing embedding stack is probably still the right answer. The interesting frontier here is not "vectors versus reasoning" as a religious war, but knowing which document shape and query pattern justifies which architecture. PageIndex is one shape; vectors are another.
