llm-tldr vs voitta-rag: Two Ways to Feed a Codebase to an LLM

Every LLM-assisted coding tool faces the same fundamental tension: codebases are too large to fit in a context window. Two recent tools attack this from opposite directions, and understanding the difference clarifies something important about how we’ll work with code-aware AI going forward.

The Shared Problem

llm-tldr is a compression tool. It parses source code through five layers of static analysis — AST, call graph, control flow, data flow, and program dependence — and produces structural summaries that are 90–99% smaller than raw source. The LLM receives a map of the codebase rather than the code itself.
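
To make the compression idea concrete, here is a minimal sketch of the first layer only (the AST pass), using Python’s built-in ast module as a stand-in for llm-tldr’s tree-sitter pipeline. The summarize function and its output format are illustrative assumptions, not llm-tldr’s actual API:

    import ast

    def summarize(source: str) -> str:
        """Reduce a Python module to function signatures plus call edges.

        Illustrative stand-in for an AST-level summary; llm-tldr uses
        tree-sitter and layers call-graph, control-flow, data-flow, and
        dependence analysis on top of this.
        """
        lines = []
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.FunctionDef):
                args = ", ".join(a.arg for a in node.args.args)
                calls = sorted({
                    c.func.id for c in ast.walk(node)
                    if isinstance(c, ast.Call) and isinstance(c.func, ast.Name)
                })
                lines.append(f"def {node.name}({args})  # calls: {', '.join(calls)}")
        return "\n".join(lines)

Dropping the bodies and keeping only signatures and edges is where the 90–99% reduction comes from: the model gets a map of what exists and how it connects, not the code itself.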

voitta-rag is a retrieval tool. It indexes codebases into searchable chunks and serves actual source code on demand via hybrid semantic + keyword search. The LLM receives real code, but only the relevant fragments.

Compression vs. retrieval. A map vs. the territory.

At a Glance

              llm-tldr                                 voitta-rag
Approach      Static analysis → structural summaries   Hybrid search → actual code chunks
Foundation    Tree-sitter parsers (17 languages)       Server-side indexing (language-agnostic)
Interface     CLI + MCP server                         MCP server
Compute       Local (embeddings, tree-sitter)          Server-side

What Each Does Better

llm-tldr wins when you need to understand how code fits together:

  • Call graphs and dependency tracing across files
  • “What affects line 42?” via program slicing and data flow (see the slicing sketch after this list)
  • Dead code detection and architectural layer inference
  • Semantic search by behavior — “validate JWT tokens” finds verify_access_token()
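
Program slicing is the machinery behind the “what affects line 42?” question: walk the dependence graph backwards from the target line and collect everything that can influence it. Here is a minimal sketch over a hand-written toy graph; the graph, line numbers, and backward_slice helper are hypothetical, and llm-tldr derives the real graph from its data-flow and dependence layers:

    # Toy program dependence graph: line number -> lines it directly depends on.
    # Hypothetical data; a real graph comes from data-flow + control-flow analysis.
    deps = {
        42: [37, 40],   # line 42 reads values produced on lines 37 and 40
        40: [12],
        37: [12, 30],
        30: [],
        12: [],
    }

    def backward_slice(target: int, deps: dict[int, list[int]]) -> set[int]:
        """Return every line that can affect the target line."""
        seen, stack = set(), [target]
        while stack:
            line = stack.pop()
            if line not in seen:
                seen.add(line)
                stack.extend(deps.get(line, []))
        return seen

    print(sorted(backward_slice(42, deps)))   # [12, 30, 37, 40, 42]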

voitta-rag wins when you need the actual code:

  • Retrieving exact implementations for review or modification
  • Searching across many repositories indexed server-side
  • Tunable search precision (pure keyword ↔ pure semantic via sparse_weight; see the scoring sketch after this list)
  • Progressive context loading via chunk ranges — start narrow, expand as needed
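
The sparse_weight knob is standard hybrid retrieval: blend a keyword (sparse, BM25-style) score with a semantic (dense, embedding-similarity) score. A minimal sketch of that blending, assuming both scores are already normalized to [0, 1]; the function and the example data are illustrative, not voitta-rag’s implementation:

    def hybrid_score(keyword: float, semantic: float, sparse_weight: float = 0.5) -> float:
        """sparse_weight = 1.0 is pure keyword search, 0.0 is pure semantic."""
        return sparse_weight * keyword + (1 - sparse_weight) * semantic

    # Rank candidate chunks under a keyword-leaning configuration (made-up scores).
    chunks = [("auth/jwt.py:10-55", 0.3, 0.8), ("auth/middleware.py:1-40", 0.9, 0.4)]
    ranked = sorted(chunks, key=lambda c: hybrid_score(c[1], c[2], sparse_weight=0.7),
                    reverse=True)

Tilting sparse_weight toward 1.0 favors exact identifier matches; toward 0.0 it favors chunks that are about the same thing even when the wording differs.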

The Interesting Part

These tools don’t compete — they occupy different layers of the same workflow. Use llm-tldr to figure out where to look and why, then voitta-rag to pull the code you need. Static analysis for navigation, RAG for retrieval.

This mirrors how experienced developers actually work: first you build a mental model of the architecture (“what calls what, where does data flow”), then you dive into specific files. One tool builds the mental model; the other hands you the files.

Because both expose MCP servers, combining them is straightforward: plug both into your editor or agent and let the LLM decide which to call based on the question.

References