voitta-rag: Scoping Your AI’s Knowledge, and a few new features

March 1, 2026March 1, 2026 Gregory Golberg projects/method and apparatus ai-development, aws-glue, mcp, rag, sharepoint, voitta, voitta-rag

A follow-up to our February 13 comparison of llm-tldr and voitta-rag.

Part I: The Search Toggle — Context Management for the Multi-Project Developer

One of the quieter problems with RAG-assisted development is context pollution. You index everything — your client project, your internal tools, that side experiment from last month — and then your AI assistant cheerfully retrieves code snippets from all of them, muddying every answer.

voitta-rag now has a clean answer to this: a per-folder search toggle in the file browser.

voitta-rag search toggle

Each indexed folder has a Search checkbox. Green means its content shows up in search results (and thus in MCP responses to Claude Code or any other connected assistant). Grey means the folder stays indexed — nothing is deleted or re-processed — but it’s invisible to search. Toggle it back on, and it’s instantly available again.

Why this matters

If you consult for multiple clients, or are just working on multiple not very related projects, your voitta-rag instance might hold:

Project A’s monorepo, Jira board, and Confluence space
Project B’s microservices and SharePoint docs
An internal project — say, a lead generation pipeline
A few open-source repos you reference occasionally

Without scoping, a search for “authentication flow” returns results from all of them. Your AI assistant synthesizes an answer that blends Project A’s OAuth implementation with Project B’s API key scheme and a random auth.py from your internal tool. Not wrong, exactly, but not useful either.

With the search toggle, you flip Project B and the internal project off when you’re heads-down on Project A. Searches — including MCP tool calls from Claude Code — only return Project A’s content. When you context-switch, you flip the toggles. It takes one click per folder.

Projects: grouping toggle states

If toggling folders one by one sounds tedious for a large index, voitta-rag also supports projects — named groups of toggle states. Create a “Project A” project and a “Project B” project, each with its own set of active folders. Switching projects flips all the toggles at once.

The active project persists across sessions and is respected by the MCP server, so your AI assistant automatically searches the right scope when you resume work.

Per-user scoping

The toggle is per-user. On a shared instance, each developer can have their own search scope without stepping on each other. Your teammate can be searching across everything while you’ve scoped down to one client — same voitta-rag deployment, different views.

The takeaway

This is a small feature with disproportionate impact. The whole point of a RAG knowledge base is to give your AI assistant relevant context. If you can’t control what “relevant” means, you’re outsourcing that judgment to vector similarity scores — which don’t know that Project A and Project B are different engagements. The search toggle puts that judgment back in your hands.

Part II: What Else Shipped — Glue Data Catalog, UI Polish, and More

Since our last deep-dive, voitta-rag has been on a steady clip of new features. Here’s what landed in the latest batch.

AWS Glue Data Catalog as a Data Source

This is the headline addition. voitta-rag can now sync schema metadata from AWS Glue Data Catalog — databases, tables, columns, partition keys — and index it for RAG search.

The connector (PR #11) renders Glue metadata as markdown: each database becomes a document with a summary table and a per-table breakdown of columns, types, and partition keys. This gets chunked and embedded like any other content.

Why would you want your data catalog in a RAG knowledge base? Because schema questions are exactly the kind of thing developers ask AI assistants all the time:

“Which table has the customer email field?”
“What are the partition keys on the events table?”
“Show me all tables in the analytics database”

Without Glue indexing, the assistant either hallucinates a schema or asks you to go look it up. With it, the answer comes back from your actual catalog metadata — correct, current, and grounded.

The UI offers a region dropdown, an auth method toggle (AWS profile or access keys), and optional catalog ID and database filters. You can index everything or cherry-pick specific databases.

SharePoint Global Sync and Timestamp Visibility

The SharePoint connector got a global sync implementation — configure once, index everything in the site. Additionally, source timestamps are now exposed in MCP search results, so an AI assistant can see when a document was created or last modified, not just its content. This matters for questions like “what changed recently?” or “is this documentation current?”

Multi-Select Dropdowns for Jira and Confluence

Previously, you typed Jira project keys and Confluence space names into a text field — error-prone and tedious if you have dozens. Now there are multi-select dropdown widgets (PR #10) that fetch available projects and spaces from your instance and let you pick. Select “ALL” to dynamically sync everything, including projects or spaces created in the future.

A small but satisfying fix: JQL project keys are now quoted to handle reserved words like IS that would otherwise break queries. The kind of bug you only hit when a real user has a project named something unfortunate.

File Manager UI Overhaul

The file browser got a visual refresh: independent scroll within the file list (headers and sidebar stay fixed), full-width layout, a file count status bar, styled scrollbars, and file extensions preserved when names are truncated. Mostly quality-of-life, but it makes a noticeable difference when you’re browsing a large index.

MCP Improvements

The get_file tool now includes guidance to prefer get_chunk_range for large files — a pragmatic touch. When an AI assistant tries to fetch a 10,000-line file, it’s better to get a targeted range of chunks than to blow up the context window.

SharePoint ACL Sync — Permission-Aware Search

This is the most architecturally significant addition in this batch. voitta-rag now syncs SharePoint Online permissions (ACLs) alongside document content, so search results respect who’s allowed to see what.

SharePoint’s permission model is deceptively complex: permissions flow down from site → library → folder → file through an inheritance chain, but any object in the chain can break inheritance (e.g., when someone shares a file with a colleague who doesn’t have parent-level access). Effective permissions for a given file might come from the file itself, a parent folder three levels up, or the site root.

The new ACL sync walks this hierarchy via the Microsoft Graph API, resolves effective permissions per file, and stores them in the vector index alongside the document chunks. At search time, results are filtered by the requesting user’s identity — you only see content you’d be allowed to see in SharePoint itself.

The implementation includes an acl-probe diagnostic endpoint that lets you inspect permissions on a sample of files without triggering a full sync — useful for debugging “why can’t user X see document Y?” scenarios.

An 800-line research document covers the SharePoint permission model, Graph API capabilities and limitations, and design decisions. Worth reading if you’re building anything that needs to reason about SharePoint access control.

Microsoft OAuth Login

voitta-rag now supports Microsoft OAuth as a login provider, alongside the existing authentication methods. For organizations already on Microsoft 365, this means users can sign in with their work accounts — and those identities can be matched against SharePoint ACLs for permission-aware search. A .env.sample file documents all the configuration options.

Landing Page Rebrand

A small but notable change: the landing page now reads “Voitta RAG” instead of the previous branding. The project has a clear identity now.

Wrapping Up

The search toggle and project system solve a real workflow problem — context management when you’re juggling multiple codebases. The Glue Data Catalog connector extends voitta-rag’s reach beyond code and documents into infrastructure metadata. The SharePoint ACL sync adds enterprise-grade access control to RAG search — which matters a lot once you’re indexing sensitive documents across an organization. And the UI, connector, and auth improvements continue to sand down the rough edges.

All of it still runs on your infrastructure. Nothing phones home. If you’re building with MCP-connected AI assistants and want a self-hosted knowledge layer, voitta-rag is worth a look.

voitta-rag Grows Up, voitta-yolt Is Born: February Updates from Voitta AI

February 25, 2026March 4, 2026 Gregory Golberg method-and-apparatus, voitta ai-tools, claude code, developer-tools, mcp, open-source, rag, voitta, voitta-rag

A follow-up to our February 13 comparison of llm-tldr and voitta-rag.

Part I: voitta-rag — From Code Search to Knowledge Platform

When we last looked at voitta-rag, it was a solid hybrid search engine for codebases — index your repos, search via MCP, get actual code chunks back. Twelve days and 11 commits later, it’s become something broader: a self-hosted knowledge platform that indexes not just code but your entire work graph.

Here’s what landed since February 13.

Enterprise Connectors: Jira, Confluence, SharePoint

The biggest expansion is connector coverage. voitta-rag now syncs from Jira, Confluence, and SharePoint alongside the existing Git, Google Drive, Azure DevOps, and Box integrations.

Jira and Confluence support both Cloud (API token with Basic auth) and Server/Data Center (PAT with Bearer auth), selectable via dropdown in the UI — a detail that matters because plenty of enterprises still run on-prem Atlassian. Cloud uses the v3 search endpoint (v2 is deprecated), and Confluence Cloud correctly routes through /wiki/rest/api.

SharePoint got a full global sync implementation. And on the UI side, both Jira projects and Confluence spaces now use multi-select dropdown widgets — you can cherry-pick specific projects or select “ALL” to dynamically sync everything, including future additions. Practical touch: JQL project keys are now quoted to handle reserved words like IS that would otherwise break queries.

Time-Aware Search

Search results are no longer timeless. voitta-rag now tracks source timestamps — created_at and modified_at — propagated from every remote connector through a .voitta_timestamps.json sidecar file into the indexing pipeline and vector store.

This enables time range filtering on the MCP search tool via date_start/date_end parameters. “What changed in the last week?” is now a first-class query. For an AI assistant trying to understand recent activity across repos, Jira boards, and Confluence spaces simultaneously, this is a significant upgrade.

Anamnesis: Persistent Memory for AI Assistants

The most architecturally interesting addition. Anamnesis (Greek for “recollection”) gives AI assistants a persistent memory layer backed by voitta-rag’s vector store.

Six new MCP tools let an assistant create, retrieve, update, delete, like, and dislike memories. The like/dislike mechanism adjusts relevance scoring — memories the assistant finds useful surface more readily over time, while unhelpful ones fade. It’s essentially a learning loop: the AI assistant builds up a knowledge base of its own observations and decisions, searchable alongside the actual indexed content.

This turns voitta-rag from a read-only knowledge base into a read-write one — the assistant doesn’t just consume context, it contributes to it.

Per-User Search Visibility

A multi-tenancy feature: users can now enable or disable folders for their own search scope without affecting other users. If you’ve indexed 50 repos but only care about 5 for your current task, you toggle the rest off. The MCP server respects these per-user visibility settings, so AI assistants scoped to different users see different slices of the same knowledge base.

More File Types

The indexing pipeline now handles AZW3 (Amazon Kindle) files, joining the existing support for DOCX, PPTX, XLSX, ODT, ODP, and ODS. Not the most common format in a work context, but it signals that voitta-rag is thinking beyond code and office docs toward general document ingestion.

The Bigger Picture

Two weeks ago, voitta-rag was a code search tool. Now it indexes your Git repos, Google Drive, SharePoint, Jira, Confluence, Box, and Azure DevOps — with time-aware search, per-user scoping, and persistent AI memory. The trajectory is clear: it wants to be the single search layer across everything your team produces, exposed to AI assistants via MCP.

The self-hosted angle remains the key differentiator. Nothing leaves your network. For teams where that matters (and increasingly, it does), this is starting to look like a serious alternative to cloud-hosted RAG services.

Part II: voitta-yolt — You Only Live Twice

Brand new from Voitta AI today: voitta-yolt (You Only Live Twice) — a safety analyzer for Claude Code that statically analyzes Python scripts before execution.

The Problem

Claude Code can write and run Python scripts. That’s powerful and dangerous in equal measure. By default, you either pre-approve all Python execution (fast but risky) or manually approve each script (safe but maddening). Neither is great.

How YOLT Works

YOLT registers as a Claude Code PreToolUse hook on the Bash tool. When Claude Code runs python3 script.py, YOLT intercepts the command, parses the Python AST, and walks every function call against a configurable rule set:

Safe scripts (pure computation, data parsing, read-only operations) get auto-approved — no permission prompt.
Destructive scripts (file writes, AWS mutations, subprocess calls, network POSTs, database connections) get flagged for human review with specifics about what was detected, including the source line content.

Zero external dependencies — it’s pure stdlib (ast, json, fnmatch, shlex). AST parsing is near-instant, so there’s no perceptible delay.

The Rule System

The default rules are sensible and well-structured:

AWS boto3: describe/list/get/head → safe. delete/put/create/terminate → destructive. Rules scope via trigger_imports, so cache.delete_item() in a non-AWS script won’t false-positive.
File I/O: open() in write modes, os.remove, shutil.rmtree → destructive. Read-only access is fine.
Subprocess: Always flagged. subprocess.run, os.system, the lot.
Network: requests.get → safe. requests.post/put/delete → destructive.
Database: Connection creation → flagged for review.

A curated list of safe imports (json, csv, re, datetime, pathlib, hashlib, and ~50 others) means scripts that only use standard library data-processing modules sail through without interruption.

Custom rules go in ~/.claude/yolt/rules.json and merge with defaults — you can add safe methods, define new categories with their own trigger_imports, and use glob patterns (fetch_*, drop_*).

One Important Gotcha

If you have Bash(python3:*) in your Claude Code settings.local.json allow list, YOLT’s hook never fires — static allow rules take precedence over PreToolUse hooks. YOLT replaces the need for that allow rule entirely: safe scripts get auto-approved by the hook itself.

Why This Matters

The design philosophy — “false positives OK, false negatives not” — is the right one for a safety tool. It’s the security principle of fail-closed applied to AI code execution.

YOLT is small (527 lines across 6 files in the initial commit), focused, and immediately useful. If you’re letting Claude Code run Python, this is the kind of guardrail that should exist by default.

Wrapping Up

voitta-rag is evolving from a code search tool into a self-hosted knowledge platform with enterprise connectors and AI memory. voitta-yolt tackles a different but equally practical problem: making AI code execution safer without making it slower.

llm-tldr vs voitta-rag: Two Ways to Feed a Codebase to an LLM

February 13, 2026March 4, 2026 Gregory Golberg holding forth, method-and-apparatus, self-promotion, voitta coding-tools, developer-tools, llm, mcp, method-and-apparatus, rag, static-analysis, voitta, voitta-rag

Every LLM-assisted coding tool faces the same fundamental tension: codebases are too large to fit in a context window. Two recent tools attack this from opposite directions, and understanding the difference clarifies something important about how we’ll work with code-aware AI going forward.

The Shared Problem

llm-tldr is a compression tool. It parses source code through five layers of static analysis — AST, call graph, control flow, data flow, and program dependence — and produces structural summaries that are 90–99% smaller than raw source. The LLM receives a map of the codebase rather than the code itself.

voitta-rag is a retrieval tool. It indexes codebases into searchable chunks and serves actual source code on demand via hybrid semantic + keyword search. The LLM receives real code, but only the relevant fragments.

Compression vs. retrieval. A map vs. the territory.

At a Glance

	llm-tldr	voitta-rag
Approach	Static analysis → structural summaries	Hybrid search → actual code chunks
Foundation	Tree-sitter parsers (17 languages)	Server-side indexing (language-agnostic)
Interface	CLI + MCP server	MCP server
Compute	Local (embeddings, tree-sitter)	Server-side

What Each Does Better

llm-tldr wins when you need to understand how code fits together:

Call graphs and dependency tracing across files
“What affects line 42?” via program slicing and data flow
Dead code detection and architectural layer inference
Semantic search by behavior — “validate JWT tokens” finds verify_access_token()

voitta-rag wins when you need the actual code:

Retrieving exact implementations for review or modification
Searching across many repositories indexed server-side
Tunable search precision (pure keyword ↔ pure semantic via sparse_weight)
Progressive context loading via chunk ranges — start narrow, expand as needed

The Interesting Part

These tools don’t compete — they occupy different layers of the same workflow. Use llm-tldr to figure out where to look and why, then voitta-rag to pull the code you need. Static analysis for navigation, RAG for retrieval.

This mirrors how experienced developers actually work: first you build a mental model of the architecture (“what calls what, where does data flow”), then you dive into specific files. One tool builds the mental model; the other hands you the files.

The fact that both expose themselves as MCP servers makes combining them straightforward — plug both into your editor or agent and let the LLM decide which to call based on the question.

References

Reverse-engineering and keratinous biomass reduction in bos grunniens

August 11, 2025August 21, 2025 Gregory Golberg projects, voitta, yak shaving, zoominfo architecture, coding-agents, diagrams, dot, graphviz, llm, mcp, mermaid, svg, tooling, tools, yak-shaving

Not that we needed all that for the trip, but once you get locked into a serious drug collection, the tendency is to push it as far as you can.

Hunter S. Thompson

Reverse-engineering is kinda fun. More fun when we can shave the yak by adding more tools to our LLM/MCP toolbox, amirite?

So I accidentally came across this LinkedIn post, about an SVG diagramming tool for Claude. I was just working on some diagrams as part of reverse engineering and having been making agents create those with Mermaid, but I thought I’d give it a try.

Well, that was a flock of wild geese chasing a red herring down a rabbit hole to borrow a shear…

First, I thought the idea was clever, but I wanted more cowbell (because we don’t have enough animals in this post), so I forked that and vibe-coded an MCP server on top of that.

Then I tried to use it to create a few architecture diagrams but I found it actually somewhat lacking. When the client (Claude Desktop) was using it, I didn’t love the editing capability. When the client was not using it, it created nicer-looking diagrams somehow (in SVG, yes) and with legends and stuff. But of course the graph layout still sucked. So I’d need to manually edit it.

Well, screw that, said I. I’ll use AWS MCP server, said I.

Screw that, said I next.

Then I modified the prompt to ask not for SVG but for DOT format of GraphViz. Much better, I said. And then, uh… It could have gone better, right? But at this point I’m not sure how to improve the prompt.

But I know what to do when I don’t know something, right?

Yes. I put the DOT file to the LLM and ask it to tweak it to have a certain thing. Then I ask why. Then I, of course, ask it, to fix this original prompt. And it’s turtles (yes, we’re in a zoo and you’re reading it on a Safari) all the way down.

And what do we learn, Palmer? Well, never mind, let us draw the curtain of charity over the rest of this scene.

(Well, not quite true — using DOT is the better thing to do here than explicitly doing things like “30px” instructions).

NOTE: multiple individuals of bos grunniens species have undergone keratinous biomass reduction, which also included:

Env var syncer for apps not launched for terminal, which in turn is also useful for
MCP config and prompt syncer

The moral of the story is absent.

Coding assistants musing

July 27, 2025 Gregory Golberg holding forth, projects, voitta, yak shaving ai, anthropic, claude, coding-agents, intellij, java, jetbrains, kotlin, llm, mcp

I love me my Cline, Claude Code and company. But there’s major thing I found missing from them — I want my assistant to be able to step with me through a debugger, and be able to examine variables and call stack. Somehow this doesn’t exist. This is helpful for figuring out the flow of an unfamiliar program, for example.

Now, JetBrains MCP Server Plugin gets some of the way there, but… It can set breakpoints but because of the way it analyzes code text it often gets confused. For example, when asked to set a breakpoint on the first line of the method it would do it at a method signature or annotation.

And it doesn’t do anything in terms of examining the code state at a breakpoint.

So I decided to build on top of it, see JetBrains-Voitta plugin (based on a Demo Plugin). It:

Uses IntelliJ PSI API to provide more meaningful code structure to the LLM (as AST)
- This helps with properly setting breakpoints from verbal instructions
- Hopefully also this should prevent some hallucinations about methods that do not exit (educated guess).
Adds more debugging capability, such as inspecting the call stack and variables at a given breakpoint.

Here are a couple of example debug sessions:

Much better.

And completely vibe-coded.

Maybe do something with Cline next?

MCP protocol of choice: stdin/stdout? WTF, man?

April 21, 2025April 27, 2025 Gregory Golberg projects, self-promotion, silliness, voitta, yak shaving anthropic, debug, debugging, llm, mcp, protocol, protocols, rant, sse

Let’s talk about MCP. More specifically, let’s talk about using stdin/stdout as a protocol transport layer in the year of our Lord 2025.

Yes, yes—it’s universal. It’s composable. It works “everywhere.” It’s the spiritual successor to Unix pipes, which were cool at the time. The time when my Dad was hitting on my Mom. As an actual transport layer, stdin/stdout is a disaster.

Debugging Is Basically a Crime

Let’s say I want to create an MCP server in Python. Reasonable. Now let’s say I want to debug it. Set a breakpoint. Inspect variables. Use threads. Maybe spin up the LLM in the same process for context. You know, software engineering.

The moment you try to do this, you’re writing a debug driver. Congratulations. You are now:

Building a fake client to simulate a streaming LLM
Implementing bidirectional IO while praying the LLM doesn’t send surprise newline characters
Wrapping things in threads and/or asyncio or multiprocessing or whatnot other total fucking bullshit.

Been there. Twice:

Voitta’s Brokkoly: Thought I could run the LLM and the driver in one process. Spent 3 hours implementing queues, got it half-working, and realized I was debugging my own debug tool.
Samtyzukki: Round two. Same problem. Ended up with more abstraction layers than a Kafka conference.

Eventually, I just gave up and decided to use SSE (Server-Sent Events). Because you know what’s great about SSE? You can log things. You can see the messages. You can debug. It’s like rediscovering civilization after weeks of wilderness survival with only printf() and trauma.

stdout Is Sacred, Until It Isn’t

Here’s the other problem. stdout is a shared space. You can’t count on it. Libraries will write to it. Dependencies will write to it. Your logger will write to it. Some genius upstream will write:

print(“INFO: falling back to CPU because the GPU is feeling shy today.”)

Congratulations. You just corrupted your transport. Your parser reads that as malformed JSON or a broken packet or an existential and spiritual crisis.

It’s not a bug. It’s a design decision—and not a good one.

This is the part where I invoke Rob Pike. Sorry. Not sorry.

In Go, to format a date, one doesn’t simply use YYYY-MM-DD. You do Mon Jan 2 15:04:05 MST 2006.

Because, I get it, we all need to get high once in a while. But srsly.

Things are moving fast

March 27, 2025March 27, 2025 Gregory Golberg holding forth, voitta ai, anthropic, llm, mcp, openai

OpenAI moving to standardizing on MCP after proposing function calling, meanwhile we at https://voitta.ai/ are watching with hope and anxiety.