Complete Execution: Fixing Truncated Agent Responses
Practical steps for handling truncation when responses hit length limits.
Steps
- Reserve token headroom: keep the prompt tokens plus the requested output tokens comfortably below the context window.
- Break long outputs into explicit sections.
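- Continue automatically when finish_reason is "length": send a follow-up request that includes the partial output and asks the model to resume where it stopped.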
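- Stream responses so partial output is preserved; if the stream ends on a length stop or drops mid-response, retry or continue from the text already received.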
- Compress output with denser formats, such as tables or terse lists instead of long prose.
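
The headroom and continuation steps above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `output_budget`, `generate_complete`, and `make_fake_model` are hypothetical names, and the `complete` callable stands in for whatever client call the agent actually uses. It is assumed to return the newly generated text together with a finish reason, where "length" means the output was cut off and any other value (here "stop") means it ended normally.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: (prompt, text_generated_so_far) -> (new_text, finish_reason).
# A real client would wrap its own API call to match this shape.
CompleteFn = Callable[[str, str], Tuple[str, str]]


def output_budget(context_window: int, prompt_tokens: int, headroom: int = 256) -> int:
    """Output tokens that can be requested while keeping `headroom` tokens spare."""
    return max(0, context_window - prompt_tokens - headroom)


def generate_complete(
    complete: CompleteFn, prompt: str, max_rounds: int = 5
) -> Tuple[str, bool]:
    """Keep requesting continuations while the finish reason is "length".

    Returns the joined text and a flag that is True only if the model
    finished normally within `max_rounds` requests.
    """
    parts: List[str] = []
    for _ in range(max_rounds):
        text, finish_reason = complete(prompt, "".join(parts))
        parts.append(text)
        if finish_reason != "length":
            return "".join(parts), True
    # Still truncated after max_rounds: hand back what we have and say so.
    return "".join(parts), False


def make_fake_model(full_text: str, chunk: int) -> CompleteFn:
    """Stand-in model that emits `full_text` at most `chunk` characters per call."""

    def complete(prompt: str, so_far: str) -> Tuple[str, str]:
        start = len(so_far)
        piece = full_text[start:start + chunk]
        finished = start + chunk >= len(full_text)
        return piece, "stop" if finished else "length"

    return complete
```

Capping the loop with `max_rounds` is a deliberate choice in this sketch: a model that never finishes would otherwise be re-queried indefinitely, so the caller gets the partial text plus an explicit "not complete" flag instead.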