Kyle Pericak

Wiki RAG Pipeline

Last verified: 2026-03-11

The bot-wiki has a FAISS-based retrieval pipeline. It embeds every wiki page, retrieves the pages most similar to a question, and has an LLM answer the question using those pages as context.

Components

  • Script: apps/blog/bin/wiki-rag.py
  • Dependencies: apps/blog/bin/requirements-rag.txt (faiss-cpu, openai, pyyaml)
  • Index: apps/blog/blog/markdown/wiki/.index/faiss.index
  • Metadata: apps/blog/blog/markdown/wiki/.index/metadata.json
  • Venv: apps/blog/.venv/ (gitignored)

Setup

cd apps/blog
python3 -m venv .venv
.venv/bin/pip install -r bin/requirements-rag.txt

Run every wiki-rag command with the venv's Python (.venv/bin/python), not the system interpreter.

Building the index

cd apps/blog
.venv/bin/python bin/wiki-rag.py build

The build command embeds all wiki pages with text-embedding-3-small via OpenRouter, so it requires the OPENROUTER_API_KEY environment variable. It saves a FAISS IndexFlatIP (inner product over normalized vectors, which is equivalent to cosine similarity) and a metadata JSON file. Both files are committed to git.

Use --dry-run to parse pages without calling the API.

Querying

cd apps/blog
.venv/bin/python bin/wiki-rag.py query "How do I run security scans?"

The query command embeds the question, finds the top-k nearest wiki pages (default 3), and sends them as context to an LLM (Claude Sonnet via OpenRouter), which returns an answer grounded in those pages.

Flags:

  • --top-k N: number of pages to retrieve (default 3)
  • --model MODEL: chat model for generation (default anthropic/claude-sonnet-4-6)
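
The commands and flags above could be wired up with argparse roughly as follows. This is a minimal sketch, not the actual contents of wiki-rag.py: the function name build_parser is hypothetical, and attaching --dry-run to the build subcommand is an assumption based on where the flag is described.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI layout mirroring the documented commands and defaults."""
    parser = argparse.ArgumentParser(prog="wiki-rag.py")
    sub = parser.add_subparsers(dest="command", required=True)

    # build: embed pages and write the index (assumed home of --dry-run)
    build = sub.add_parser("build", help="embed wiki pages and write the index")
    build.add_argument("--dry-run", action="store_true",
                       help="parse pages without calling the API")

    # query: retrieve pages and ask the chat model
    query = sub.add_parser("query", help="answer a question from retrieved pages")
    query.add_argument("question")
    query.add_argument("--top-k", type=int, default=3,
                       help="number of pages to retrieve")
    query.add_argument("--model", default="anthropic/claude-sonnet-4-6",
                       help="chat model for generation")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["query", "How do I run security scans?", "--top-k", "5"]
)
print(args.command, args.top_k, args.model)
```

Passing an explicit list to parse_args keeps the example testable. A real script would call parse_args() with no arguments to read sys.argv.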

How it works

Each wiki page becomes one chunk. The embedded text prepends frontmatter fields (title, summary, keywords, scope) to the markdown body. This gives the embedding model more signal about what the page covers.
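
As an illustration of the prepending step described above, the sketch below builds the text for one page from an already-parsed frontmatter dict. The function name build_embedding_text, the "field: value" line format, and the sample page are assumptions for illustration. The real script may format the fields differently.

```python
def build_embedding_text(frontmatter: dict, body: str) -> str:
    """Prepend selected frontmatter fields to the markdown body (hypothetical format)."""
    fields = ("title", "summary", "keywords", "scope")
    lines = []
    for name in fields:
        value = frontmatter.get(name)
        if not value:
            continue  # skip fields the page does not define
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(item) for item in value)
        lines.append(f"{name}: {value}")
    header = "\n".join(lines)
    body = body.strip()
    return f"{header}\n\n{body}" if header else body


# Made-up page used only to exercise the function.
page_frontmatter = {
    "title": "Wiki RAG Pipeline",
    "summary": "FAISS retrieval over wiki pages",
    "keywords": ["rag", "faiss", "embeddings"],
}
text = build_embedding_text(page_frontmatter, "The bot-wiki has a retrieval pipeline.\n")
print(text)
```

Because the whole page is a single chunk, one string like this would map to one vector in the index.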

Vectors are L2-normalized before indexing so inner product equals cosine similarity. At query time the question is embedded with the same model, FAISS returns the closest pages, and those pages get stuffed into an LLM prompt.
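
The retrieval step can be sketched in plain Python to show why normalizing first makes inner product behave like cosine similarity. This is a stand-in for what the FAISS index does, not the pipeline's code: the helper names, the two-dimensional toy vectors, and the prompt wording are all assumptions.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(query_vec, page_vecs, k=3):
    """Return (page_index, score) pairs for the k highest inner products."""
    q = l2_normalize(query_vec)
    scored = [(i, inner_product(q, l2_normalize(v))) for i, v in enumerate(page_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_prompt(question, pages):
    """Join retrieved (title, text) pairs into one prompt (hypothetical wording)."""
    context = "\n\n---\n\n".join(f"# {title}\n{text}" for title, text in pages)
    return (
        "Answer the question using only the wiki pages below.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


# Toy vectors: page 0 points the same way as the query, so it should rank first.
page_vecs = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
titles = ["Security Scans", "OpenRouter", "Blog Architecture"]
hits = top_k([6.0, 8.0], page_vecs, k=2)
prompt = build_prompt(
    "How do I run security scans?",
    [(titles[i], "(page text)") for i, _ in hits],
)
print(hits)
print(prompt)
```

The query [6, 8] and page 0 [3, 4] differ in length but not direction, so after normalization their inner product is about 1, the maximum possible. Vector magnitude no longer affects the ranking.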

  • RAG on the Bot-Wiki: full writeup with real query output and cost breakdown
  • The Bot-Wiki: why the wiki exists and how the structured frontmatter was designed for RAG
Related: wiki/ai-tools/openrouter, wiki/blog-architecture
Blog code last updated on 2026-03-15: c04b780f9a9b20e56525019354100252a1c20141