The OWASP Top 10 for LLM Applications (2025) is the primary risk framework for AI/LLM security. This page documents each entry with a description, how it applies to this project, and what controls are in place.
Source: OWASP Top 10 for LLM Applications 2025
| ID | Name | Applies Here | Current Controls |
|---|---|---|---|
| LLM01 | Prompt Injection | Yes | Security Auditor agent |
| LLM02 | Sensitive Information Disclosure | Yes | Security Auditor agent, confidential data policy |
| LLM03 | Supply Chain | Yes | Trivy, semgrep, pinned dependencies |
| LLM04 | Data and Model Poisoning | No | N/A (not training models) |
| LLM05 | Improper Output Handling | Yes | Security Auditor agent |
| LLM06 | Excessive Agency | Yes | Security Auditor agent, scoped tool lists |
| LLM07 | System Prompt Leakage | Partially | Not yet covered |
| LLM08 | Vector and Embedding Weaknesses | No | N/A (no RAG/embedding pipeline) |
| LLM09 | Misinformation | Yes | Reviewer agent (fact-checking, sourcing) |
| LLM10 | Unbounded Consumption | Partially | Claude Code permission model |
## LLM01: Prompt Injection

Manipulating LLMs via crafted inputs to override instructions, bypass safety controls, or exfiltrate data. This includes both direct injection (in the user prompt) and indirect injection (embedded in content the LLM processes, such as wiki pages or web results).
Applicability: High. Agents consume wiki pages, blog content, and external web content. Any of these could contain injection attempts.
Controls:

- Security Auditor agent
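One cheap layer on top of the Security Auditor is a pattern pre-filter over fetched content. A minimal sketch follows; the patterns and function name are illustrative assumptions, and pattern matching alone cannot reliably stop prompt injection:

```python
import re

# Illustrative patterns only -- not exhaustive, and trivially evadable.
# A real control would combine this with model-level review.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"disregard (the )?system prompt",
    r"you are now",
    r"<\s*/?\s*system\s*>",
]

def flag_suspect_content(text: str) -> list[str]:
    """Return the patterns matched in fetched content (wiki page, web result)."""
    return [p for p in INJECTION_PATTERNS if re.search(p, text, re.IGNORECASE)]
```

Matched content would be quarantined for human review rather than silently dropped, so false positives stay visible.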
## LLM02: Sensitive Information Disclosure

LLM outputs inadvertently expose confidential data: PII, API keys, internal metrics, financial data, or authenticated API responses. This includes both training data leakage and runtime disclosure where the model has access to sensitive context.
Applicability: High. Agents have access to analytics (GA4), project management (Linear), and infrastructure tools. Any of this data could leak into blog posts or public content.
Controls:

- Security Auditor agent
- Confidential data policy
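A redaction pass over agent output before publication can backstop the policy. The sketch below uses illustrative regexes; a real deployment would reuse rules from a dedicated scanner such as gitleaks or trufflehog:

```python
import re

# Assumed patterns for demonstration; tune to the actual secrets in scope.
SECRET_PATTERNS = {
    "aws_access_key": r"AKIA[0-9A-Z]{16}",
    "generic_api_key": r"(?i)api[_-]?key\s*[:=]\s*['\"]?[A-Za-z0-9]{20,}",
    "email": r"[\w.+-]+@[\w-]+\.[\w.]+",
}

def redact(text: str) -> str:
    """Replace anything matching a secret pattern before content goes public."""
    for name, pattern in SECRET_PATTERNS.items():
        text = re.sub(pattern, f"[REDACTED:{name}]", text)
    return text
```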
## LLM03: Supply Chain

Risks from compromised third-party components: model weights, training data, plugins, packages, and base images. In an agent context, this includes MCP servers, npm packages, Docker base images, and any dependency the agent system relies on.
Applicability: Yes, but handled outside the agent layer. Dependencies flow through npm (blog), pip (tools), and Docker base images (security toolkit).
Controls:

- Trivy
- semgrep
- Pinned dependencies
- package-lock.json scanned for known CVEs

## LLM04: Data and Model Poisoning

Tampering with training data or fine-tuning data to introduce backdoors, biases, or vulnerabilities into the model itself.
Applicability: None. This project uses Claude via API. No custom training, fine-tuning, or model hosting is involved.
Controls: N/A
## LLM05: Improper Output Handling

Failing to validate, sanitize, or escape LLM-generated output before passing it to downstream systems. This includes executing agent-generated shell commands without review, using LLM output in SQL queries, or rendering it as HTML without sanitization.
Applicability: Yes. Agents generate markdown that becomes HTML, and could potentially generate shell commands.
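As a sketch of the output-handling principle for the markdown-to-HTML path: reject or escape raw HTML before LLM output reaches a renderer. The blocked tokens and function below are assumptions, not the project's actual pipeline; a production setup would run an allowlist sanitizer (e.g. bleach or DOMPurify) after markdown conversion:

```python
import html

# Illustrative denylist; an allowlist sanitizer is the stronger control.
DISALLOWED = ("<script", "<iframe", "javascript:")

def sanitize_for_render(llm_output: str) -> str:
    """Refuse obviously dangerous markup, then escape the rest."""
    lowered = llm_output.lower()
    if any(tok in lowered for tok in DISALLOWED):
        raise ValueError("LLM output contains disallowed markup")
    return html.escape(llm_output, quote=True)
```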
Controls:

- Security Auditor agent
- Flagging of bash, eval, or system commands

## LLM06: Excessive Agency

Granting LLMs too many capabilities, too much autonomy, or insufficient access controls. The risk is that the model takes actions beyond what its role requires, whether through overly broad tool access, missing permission boundaries, or lack of human oversight.
Applicability: High. The agent system has multiple agents with different tool sets. Over-provisioning tools is a real risk.
Controls:

- Security Auditor agent
- Scoped tool lists per agent
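Scoped tool lists amount to a deny-by-default allowlist per agent. A minimal sketch, with agent and tool names that are illustrative rather than the project's actual configuration:

```python
# Hypothetical per-agent allowlists mirroring the scoped tool lists idea.
AGENT_TOOLS = {
    "writer": {"read_file", "write_file", "wiki_search"},
    "security-auditor": {"read_file", "grep"},
}

def authorize(agent: str, tool: str) -> bool:
    """Deny by default: a call is allowed only if the agent's
    allowlist explicitly contains the tool."""
    return tool in AGENT_TOOLS.get(agent, set())
```

The key property is that an unknown agent or an unlisted tool both fail closed.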
## LLM07: System Prompt Leakage

System prompts, agent definitions, or internal instructions being exposed to end users or appearing in public output. This can reveal business logic, security controls, internal tool configurations, or sensitive operational details.
Applicability: Partial. Agent definitions live in .claude/agents/ and are checked into the public repo, so the definitions themselves are not secret. The risk is more about internal operational details (such as specific API configurations or internal URLs) leaking into blog post content.
Controls: Not yet covered.

Gap: Could add a check to the Security Auditor for internal tool configurations or .claude/ content appearing in blog output.
## LLM08: Vector and Embedding Weaknesses

Vulnerabilities in RAG (Retrieval-Augmented Generation) pipelines and embedding stores: poisoned embeddings, retrieval of stale or manipulated context, adversarial document injection into the vector store.
Applicability: None currently. The wiki-rag system uses keyword-based retrieval, not vector embeddings. If a vector database is added later, this becomes relevant.
Controls: N/A
## LLM09: Misinformation

LLMs generating false, misleading, or fabricated content. This includes hallucinated facts, fabricated citations, invented personal anecdotes, and confidently stated inaccuracies.
Applicability: High. The blog publishes technical content where accuracy matters. Agent-written posts could contain hallucinated commands, wrong version numbers, or fabricated benchmarks.
Controls:

- Reviewer agent (fact-checking and sourcing)
## LLM10: Unbounded Consumption

Unrestricted resource usage leading to denial of service, excessive costs, or resource exhaustion. This includes runaway token generation, recursive agent loops, and API cost explosions.
Applicability: Partial. Agents run locally via Claude Code with usage-based billing. Runaway loops or excessive tool calls could drive up API costs.
Controls:

- Claude Code permission model
Gap: No hard spending caps or automated circuit breakers for agent API costs. Worth addressing as automation increases.
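A circuit breaker addressing this gap could be as small as a running cost tally that refuses further calls past a budget. The per-token prices and budget below are placeholders, not real billing figures:

```python
class CostBreaker:
    """Trip once cumulative estimated API spend reaches the budget."""

    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def record(self, input_tokens: int, output_tokens: int,
               in_price: float = 3e-6, out_price: float = 15e-6) -> None:
        # Prices are illustrative placeholders per token.
        self.spent_usd += input_tokens * in_price + output_tokens * out_price
        if self.spent_usd >= self.budget_usd:
            raise RuntimeError(
                f"API budget exhausted: ${self.spent_usd:.2f} "
                f">= ${self.budget_usd:.2f}")
```

Calling `record` after each API response makes the very next over-budget call fail loudly instead of silently accruing cost.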