Side Effect Guardrails: Stopping Destructive Actions

Security · updated Mon Feb 23

Implement checks that prevent agents from unintentionally deleting or modifying critical data.

Steps

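Classify each tool as safe (read-only) or destructive (writes, deletes, or otherwise changes state).
Require explicit approval before any destructive call runs.
Implement a dry-run mode for write tools so a change can be previewed without being applied.
Use soft deletes for agent-managed databases so deleted rows can be restored.
Monitor the volume of changes per session and stop sessions that exceed a set limit.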
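
The steps above can be sketched in one small dispatcher. This is a minimal illustration, not a reference implementation: every name here (DESTRUCTIVE_TOOLS, Session, call_tool, the records table, the budget of 3) is hypothetical and not taken from any particular agent framework. It assumes tools are plain Python functions and the agent-managed store is SQLite.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable

# All names below are illustrative assumptions, not a real framework's API.
DESTRUCTIVE_TOOLS = {"delete_record"}  # step 1: anything not listed is treated as safe
MAX_CHANGES_PER_SESSION = 3            # step 5: arbitrary example budget


class GuardrailError(Exception):
    """Raised when a guardrail blocks a tool call."""


@dataclass
class Session:
    conn: sqlite3.Connection
    approve: Callable[[str, dict], bool]  # step 2: human or policy approval hook
    dry_run: bool = False                 # step 3: preview without applying
    changes: int = 0                      # step 5: rows changed so far


def list_records(session: Session) -> list:
    """Safe tool: returns ids of rows that are not soft-deleted."""
    rows = session.conn.execute(
        "SELECT id FROM records WHERE deleted_at IS NULL ORDER BY id"
    )
    return [row[0] for row in rows]


def delete_record(session: Session, record_id: int) -> str:
    """Destructive tool, implemented as a soft delete (step 4)."""
    if session.dry_run:
        return f"DRY RUN: would soft-delete record {record_id}"
    cur = session.conn.execute(
        "UPDATE records SET deleted_at = CURRENT_TIMESTAMP "
        "WHERE id = ? AND deleted_at IS NULL",
        (record_id,),
    )
    session.conn.commit()
    session.changes += cur.rowcount  # count only rows actually changed
    return f"soft-deleted {cur.rowcount} row(s)"


TOOLS = {"list_records": list_records, "delete_record": delete_record}


def call_tool(session: Session, name: str, **args):
    """Route every tool call through the guardrails before running it."""
    if name in DESTRUCTIVE_TOOLS:
        if session.changes >= MAX_CHANGES_PER_SESSION:
            raise GuardrailError("change budget exhausted for this session")
        # Dry runs change nothing, so they skip the approval prompt.
        if not session.dry_run and not session.approve(name, args):
            raise GuardrailError(f"{name} was not approved")
    return TOOLS[name](session, **args)


def demo_connection() -> sqlite3.Connection:
    """In-memory database with two rows, for trying the sketch out."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, deleted_at TEXT)")
    conn.executemany("INSERT INTO records (id) VALUES (?)", [(1,), (2,)])
    conn.commit()
    return conn
```

For example, with session = Session(conn=demo_connection(), approve=lambda name, args: True), calling call_tool(session, "delete_record", record_id=1) marks the row as deleted but leaves it in the table, so it can be restored by clearing deleted_at. Treating unlisted tools as safe keeps the sketch short; a stricter design might default unknown tools to destructive instead.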
