{"title":"Multi-Modal Drift: Vision-to-Text Consistency","region":"Global","category":"Architecture","description":"Keep an agent's reasoning consistent when it switches between image and text inputs.","lastUpdated":"2026-02-23T00:00:00.000Z","steps":["Generate a text description of every image so that internal reasoning can reference it.","Cross-reference visual observations with the original text prompt.","Enforce a 'Double-Take' pass: analyze high-detail images a second time before accepting the result.","Strip non-essential metadata from images to reduce token usage.","Flag the conflict and halt if the 'Text' and 'Vision' agents report conflicting facts."],"url":"https://checklist.day/multi-modal-drift-vision-to-text-consistency"}