Context Poisoning: RAG Injection Guardrails

Security · updated Mon Feb 23

Preventing 'Indirect Prompt Injection' via retrieved documents.

Steps

Sanitize retrieved chunks for 'Ignore previous instructions' patterns.
Isolate system instructions from RAG context using delimiters.
Implement a 'Pre-Ingestion' LLM filter to flag instruction-like text.
Use a 'Read-Only' persona for agents processing untrusted RAG data.
Sign and verify the origin of all documents in the vector store.

view raw JSON →