Complete Execution: Fixing Truncated Agent Responses

Reliability · updated Mon Feb 23

Practical steps for handling truncation when responses hit length limits.

Steps

  1. Reserve token headroom to avoid hitting length limits.
  2. Break long outputs into explicit sections.
  3. Continue automatically when finish_reason is length.
  4. Use streaming to detect truncation and retry safely.
  5. Compress output with denser formats.

view raw JSON →