Every LLM-assisted coding tool faces the same fundamental tension: codebases are too large to fit in a context window. Two recent tools attack this from opposite directions, and understanding the difference clarifies something important about how we’ll work with code-aware AI going forward.
The Shared Problem
llm-tldr is a compression tool. It parses source code through five layers of static analysis — AST, call graph, control flow, data flow, and program dependence — and produces structural summaries that are 90–99% smaller than raw source. The LLM receives a map of the codebase rather than the code itself.
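To make the compression idea concrete, here is a minimal sketch using Python's built-in ast module. It is not llm-tldr's pipeline (the real tool uses tree-sitter plus four more analysis layers); it only shows how reducing a module to its structure shrinks the token count dramatically.

```python
import ast

def structural_summary(source: str) -> str:
    """Reduce a Python module to a one-line-per-definition outline.

    Toy version of the compression idea: keep structure, drop bodies.
    llm-tldr layers call-graph, control-flow, data-flow, and
    dependence analysis on top of this kind of AST pass.
    """
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            lines.append(f"def {node.name}({args})")
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}")
    return "\n".join(lines)

with open("big_module.py") as f:  # any large Python file
    source = f.read()
summary = structural_summary(source)
print(f"summary is {len(summary) / len(source):.1%} of the original")
```

Even this naive pass typically cuts a module by an order of magnitude; the 90–99% figures come from doing it across a whole codebase with richer analyses.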
voitta-rag is a retrieval tool. It indexes codebases into searchable chunks and serves actual source code on demand via hybrid semantic + keyword search. The LLM receives real code, but only the relevant fragments.
Compression vs. retrieval. A map vs. the territory.
At a Glance
| | llm-tldr | voitta-rag |
|---|---|---|
| Approach | Static analysis → structural summaries | Hybrid search → actual code chunks |
| Foundation | Tree-sitter parsers (17 languages) | Server-side indexing (language-agnostic) |
| Interface | CLI + MCP server | MCP server |
| Compute | Local (embeddings, tree-sitter) | Server-side |
What Each Does Better
llm-tldr wins when you need to understand how code fits together:
- Call graphs and dependency tracing across files
- “What affects line 42?” via program slicing and data flow (see the sketch after this list)
- Dead code detection and architectural layer inference
- Semantic search by behavior: “validate JWT tokens” finds verify_access_token()
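Program slicing is easiest to see on a toy def-use graph. The graph below is hypothetical (llm-tldr derives the real one from its data-flow layer); the traversal is the standard backward-slice walk:

```python
# Hypothetical def-use edges: line -> lines whose values it reads.
depends_on = {
    42: [37, 40],  # line 42 uses values defined on lines 37 and 40
    40: [12],
    37: [12, 5],
    12: [],
    5:  [],
}

def backward_slice(line: int, graph: dict[int, list[int]]) -> set[int]:
    """Collect every line that transitively feeds `line`."""
    seen, stack = set(), [line]
    while stack:
        cur = stack.pop()
        if cur in seen:
            continue
        seen.add(cur)
        stack.extend(graph.get(cur, []))
    return seen

print(sorted(backward_slice(42, depends_on)))  # [5, 12, 37, 40, 42]
```

Answering “what affects line 42?” is just this reachability query, which is why it can run over a summary instead of raw source.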
voitta-rag wins when you need the actual code:
- Retrieving exact implementations for review or modification
- Searching across many repositories indexed server-side
- Tunable search precision (pure keyword ↔ pure semantic via sparse_weight; sketched after this list)
- Progressive context loading via chunk ranges: start narrow, expand as needed
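The sparse_weight knob is worth spelling out. Assuming voitta-rag blends the keyword (sparse) and semantic (dense) scores as a convex combination, which is the common hybrid-search pattern rather than confirmed internals, tuning looks like this:

```python
def hybrid_score(sparse: float, dense: float, sparse_weight: float) -> float:
    """Linear blend of keyword (sparse) and embedding (dense) scores.

    Assumed scoring model, not voitta-rag's verified internals.
    sparse_weight=1.0 -> pure keyword; 0.0 -> pure semantic.
    """
    return sparse_weight * sparse + (1.0 - sparse_weight) * dense

# A chunk with strong keyword overlap but a weak semantic match:
print(hybrid_score(sparse=0.9, dense=0.2, sparse_weight=1.0))  # 0.9  keyword only
print(hybrid_score(sparse=0.9, dense=0.2, sparse_weight=0.5))  # 0.55 balanced
print(hybrid_score(sparse=0.9, dense=0.2, sparse_weight=0.0))  # 0.2  semantic only
```

High sparse_weight favors exact identifier matches (good for “find this function name”); low sparse_weight favors meaning (good for “find code that does X”).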
The Interesting Part
These tools don’t compete — they occupy different layers of the same workflow. Use llm-tldr to figure out where to look and why, then voitta-rag to pull the code you need. Static analysis for navigation, RAG for retrieval.
This mirrors how experienced developers actually work: first you build a mental model of the architecture (“what calls what, where does data flow”), then you dive into specific files. One tool builds the mental model; the other hands you the files.
Because both ship as MCP servers, combining them is straightforward: plug both into your editor or agent and let the LLM decide which to call based on the question.
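For instance, an MCP client config (in the JSON shape used by Claude Desktop and similar clients) can register both servers side by side. The command names below are placeholders for illustration, not the tools' documented entry points:

```json
{
  "mcpServers": {
    "llm-tldr": {
      "command": "llm-tldr",
      "args": ["mcp"]
    },
    "voitta-rag": {
      "command": "voitta-rag",
      "args": ["serve"]
    }
  }
}
```

With both registered, a question like “how does auth work here?” can route to llm-tldr for the call graph, then to voitta-rag to fetch the implementations it surfaces.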