One Search Surface: Teaching voitta-rag to Speak Architecture

Back in February, I wrote that llm-tldr and voitta-rag were complementary. One builds a map of a codebase through static analysis. The other retrieves the actual code you need. My conclusion then was basically: great, wire both into the agent and let it choose.

That works, but it still leaves the agent doing tool-routing. It has to know that one question wants architecture and another wants source. It has to bounce between surfaces. So we collapsed the distinction.

voitta-rag can now index llm-tldr’s static-analysis output as companion documents alongside the raw code chunks it already stores for Git sources. Turn on the new gh_llm_tldr flag for a repo, sync it, and the same search surface now returns two different kinds of context:

  • raw code chunks for the implementation itself, and
  • structural analysis chunks describing callers, callees, imports, signatures, and relationships.

One query. One index. No “which tool should I call?” moment.

The old split was clean, but inconvenient

The original split between the two tools made conceptual sense.

llm-tldr is good at questions like:

  • What calls this function?
  • What depends on this module?
  • Where does this piece of data flow?
  • What parts of the codebase are structurally central?

voitta-rag is good at questions like:

  • Show me the implementation of token verification.
  • Find the code that handles OAuth callbacks.
  • Search across this repo, that wiki, and those tickets.
  • Give me the actual file I need to edit.

That’s a nice division of labor for a human. It is less nice for an agent, because agents do not merely need information; they need the right shape of information without extra orchestration. The more routing logic you make them do, the more failure modes you introduce.

The latest voitta-rag implementation removes that choice entirely. Static analysis stops being a separate destination and becomes part of retrieval.

What actually shipped

When a Git source has gh_llm_tldr enabled, sync now runs llm-tldr over each supported source file and stores the results in the same Qdrant collection as the ordinary code chunks.

Those analysis chunks are tagged as source_type="llm-tldr-analysis" and linked back to their origin file with related_file. That sounds like plumbing, and it is, but it matters: the search layer now knows that an analysis chunk about verify_token() belongs to a specific source file rather than floating around as a free-standing summary.
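To make the linkage concrete, here is a minimal sketch of how a client might pair analysis hits with code hits from one result set. Only source_type="llm-tldr-analysis" and related_file come from the description above; the file_path key, the "code" value, and the shape of a hit are assumptions made for illustration, not voitta-rag's actual response format.

```python
from collections import defaultdict

ANALYSIS = "llm-tldr-analysis"  # tag named in the post

# Hypothetical search hits. "file_path" and the "code" source_type value
# are placeholders, not confirmed voitta-rag field names.
hits = [
    {"source_type": "code", "file_path": "auth/tokens.py",
     "text": "def verify_token(token): ..."},
    {"source_type": ANALYSIS, "related_file": "auth/tokens.py",
     "text": "verify_token: called from three handlers"},
    {"source_type": ANALYSIS, "related_file": "auth/oauth.py",
     "text": "handle_callback: calls verify_token"},
]


def group_by_origin(hits):
    """Bucket hits per source file, keeping code and analysis separate."""
    grouped = defaultdict(lambda: {"code": [], "analysis": []})
    for hit in hits:
        if hit["source_type"] == ANALYSIS:
            # Analysis chunks point back at their origin via related_file.
            grouped[hit["related_file"]]["analysis"].append(hit)
        else:
            grouped[hit["file_path"]]["code"].append(hit)
    return dict(grouped)


grouped = group_by_origin(hits)
for path, bucket in grouped.items():
    print(path, len(bucket["code"]), "code /", len(bucket["analysis"]), "analysis")
```

Because every analysis chunk names its origin file, an agent can present the implementation and its structural summary side by side instead of as two unrelated results.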

The first proof of concept indexed file-level summaries. The more interesting version goes further: it now stores one overview chunk per file plus one chunk per top-level function and class method. Each function-level chunk can carry structured payload fields such as:

  • function name
  • class name
  • callers
  • callees
  • caller count
  • callee count
  • imports

That means this is not just “RAG, but with bigger summaries.” The call graph is queryable metadata now. You can filter for things like “functions with more than five callers” or “functions importing module X” without standing up a separate graph database just to answer what are, in practice, glorified indexing questions.
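As a rough illustration, a "more than five callers" query could be expressed as a Qdrant payload filter. The sketch below builds the filter in Qdrant's JSON filter shape (must, key, match, range) as a plain dict. The payload keys caller_count and imports are assumptions: the post lists those fields but not their exact names. The matches helper is only a local stand-in to show the semantics, not how Qdrant evaluates filters.

```python
import json


def callers_filter(min_callers, module=None):
    """Build a Qdrant-style filter for heavily called functions.

    The keys "caller_count" and "imports" are assumed payload names.
    """
    must = [
        {"key": "source_type", "match": {"value": "llm-tldr-analysis"}},
        {"key": "caller_count", "range": {"gt": min_callers}},
    ]
    if module is not None:
        # On an array field, a match condition holds if any element matches.
        must.append({"key": "imports", "match": {"value": module}})
    return {"must": must}


def matches(payload, flt):
    """Local stand-in for the server-side check; handles match and range(gt) only."""
    for cond in flt["must"]:
        value = payload.get(cond["key"])
        if "match" in cond:
            values = value if isinstance(value, list) else [value]
            if cond["match"]["value"] not in values:
                return False
        elif "range" in cond:
            if value is None or not value > cond["range"]["gt"]:
                return False
    return True


flt = callers_filter(5, module="sqlalchemy.orm")
print(json.dumps(flt, indent=2))

hot = {"source_type": "llm-tldr-analysis", "caller_count": 7,
       "imports": ["sqlalchemy.orm"]}
cold = {"source_type": "llm-tldr-analysis", "caller_count": 5,
        "imports": ["sqlalchemy.orm"]}
print(matches(hot, flt), matches(cold, flt))
```

The point of the design is that the same dict can ride along with an ordinary vector search, so structural constraints narrow the results rather than requiring a second system.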

GitNexus

GitNexus is interesting, but it is licensed under PolyForm Noncommercial. That’s a non-starter for a lot of consulting and commercial work. By contrast, both llm-tldr and voitta-rag are AGPL v3.

Why function-level chunks beat file-level blobs

The biggest design improvement was moving from file-level rendered analysis to function-level structural chunks.

When voitta-rag indexed its own repository, that produced 647 stored analysis chunks: 70 file-overview chunks and 577 function chunks. That is more pieces than a file-level index would hold, but each piece is a better unit of retrieval. Agents rarely need a whole philosophical treatise about a file. They need to know that foo() is called from three handlers, imports sqlalchemy.orm, and sits on the hot path for authentication. Function-level chunks make that retrievable directly.
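The one-overview-plus-one-per-function layout can be sketched as follows. This is not voitta-rag's actual code: the input dict is a hypothetical stand-in for llm-tldr's per-file output, and the payload key names (kind, function_name, class_name, and so on) are assumptions chosen to mirror the fields listed earlier.

```python
def build_chunks(file_path, analysis):
    """Turn one file's analysis into an overview chunk plus function chunks."""
    base = {"source_type": "llm-tldr-analysis", "related_file": file_path}
    functions = analysis["functions"]

    # One overview chunk per file.
    chunks = [{
        **base,
        "kind": "overview",
        "imports": analysis["imports"],
        "text": f"{file_path}: {len(functions)} functions",
    }]

    # One chunk per function or method, with call-graph fields as payload.
    for fn in functions:
        chunks.append({
            **base,
            "kind": "function",
            "function_name": fn["name"],
            "class_name": fn.get("class"),
            "callers": fn["callers"],
            "callees": fn["callees"],
            "caller_count": len(fn["callers"]),
            "callee_count": len(fn["callees"]),
            "imports": analysis["imports"],
            "text": f"{fn['name']} is called by {', '.join(fn['callers']) or 'nothing'}",
        })
    return chunks


# Hypothetical analysis for a single file.
analysis = {
    "imports": ["sqlalchemy.orm"],
    "functions": [
        {"name": "verify_token", "class": None,
         "callers": ["login", "refresh", "logout"], "callees": ["decode"]},
        {"name": "decode", "class": "TokenCodec",
         "callers": ["verify_token"], "callees": []},
    ],
}

chunks = build_chunks("auth/tokens.py", analysis)
print(len(chunks), [c["kind"] for c in chunks])
```

Storing the counts next to the caller and callee lists is what lets a numeric filter run over them without parsing the chunk text.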

It is also a cheaper way to approximate code intelligence than hauling in a dedicated graph stack. You keep the retrieval surface the agent already understands, but enrich the payload enough to answer the structural questions that retrieval alone cannot.

Related reading: llm-tldr vs voitta-rag: Two Ways to Feed a Codebase to an LLM
