Skip to main content

Security

Fighting the Unfixable: The State of Prompt Injection Defense

Fighting the Unfixable: The State of Prompt Injection Defense

·2193 words·11 mins
Prompt injection is architecturally unfixable in current LLMs, but defense-in-depth works. Training-time defenses like Instruction Hierarchy, inference-time techniques like Spotlighting, and architectural isolation create practical systems. Microsoft’s LLMail-Inject showed thatadaptive attacks succeed at 32% against single defenses, 0% against layered approaches. Real failures like GitHub Actions compromise prove that securing obvious surfaces isn’t enough. Like SQL injection, it’s manageable with layering.
When Data Becomes Instructions: The LLM Security Problem Hiding In Plain Sight

When Data Becomes Instructions: The LLM Security Problem Hiding In Plain Sight

·2625 words·13 mins
LLMs fundamentally cannot distinguish between instructions and data. Whether you’re building RAG systems, connecting MCP servers to your data platform, or just using AI tools with sensitive information, every retrieved document is a potential instruction override. The Wall Street Journal just proved this by watching Claude lose over $1,000 running a vending machine after journalists convinced it to give everything away for free.