Agent Hardening: Preventing Prompt Injection & Hijacking
Five-step defenses to prevent prompt injection and hijacking in autonomous tool callers.
Steps
- Wrap user-provided data in explicit delimiters and forbid instructions inside those tags.
- Enforce schema validation so only expected data types reach tools.
- Pre-scan external content for injection patterns before passing to the main agent.
- Flag high-risk tool calls for human approval.
- Execute untrusted code only in sandboxed environments.