About
The problem
Agents fail not because they lack intelligence, but because the information they act on is stale or wrong. APIs change. Import paths move. Breaking changes ship without warning.
There is no canonical, machine-readable source of truth for "what does this actually look like right now." READMEs lag. Official docs are written for humans. The internet is full of confidently wrong, outdated answers.
What this is
Agents are a new kind of operator. They need the same thing every operator needs: clean, current, structured instructions. checklist.day is built to be that: a ground-truth layer for agents and LLMs, covering whatever they need to operate correctly.
Today that means a registry of libraries and SDKs: correct install commands, working import paths, runnable quickstarts, and known footguns. Every entry is verified by an eval harness rather than maintained by hand. When reality diverges from what the registry claims, the entry is flagged and corrected. Agents access the registry via API or MCP.
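As a rough illustration of what a registry entry and a single harness check could look like, the sketch below models an entry as a small record and tests whether its claimed import path still resolves. The field names, the `RegistryEntry` type, and `check_import_path` are hypothetical and do not describe checklist.day's actual schema or API. The standard-library `json` module stands in for a real package.

```python
from __future__ import annotations

import importlib
from dataclasses import dataclass, field


@dataclass
class RegistryEntry:
    """Hypothetical shape of one registry record (not the real schema)."""
    name: str
    install_command: str
    import_path: str
    quickstart: str
    footguns: list[str] = field(default_factory=list)
    flagged: bool = False


def check_import_path(entry: RegistryEntry) -> RegistryEntry:
    """Flag the entry if its claimed import path no longer resolves."""
    try:
        importlib.import_module(entry.import_path)
        entry.flagged = False
    except ImportError:
        # Reality diverged from the registry's claim: mark for correction.
        entry.flagged = True
    return entry


# An entry whose claimed import path is still correct.
current = check_import_path(RegistryEntry(
    name="json",
    install_command="(bundled with Python)",
    import_path="json",
    quickstart="import json; json.dumps({'ok': True})",
))

# An entry whose import path is deliberately wrong, simulating a moved module.
stale = check_import_path(RegistryEntry(
    name="json",
    install_command="(bundled with Python)",
    import_path="json.moved_elsewhere",
    quickstart="",
))

print(current.flagged, stale.flagged)  # False True
```

A real harness would presumably also run the install command and the quickstart in an isolated environment. This sketch covers only the import check.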
The longer direction is task primitives: reusable, versioned checklists that give agents deterministic behavior for common operations. Pull a checklist, get the same output every time, regardless of which model or framework is running it. The same way you pull a Docker image and get a known environment, or call a Stripe endpoint and get a predictable response.
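One way to picture a task primitive is as an immutable record addressed by name and version, with a content hash that lets a caller confirm it received exactly the same steps as anyone else. The sketch below is illustrative only: `Checklist`, `pull`, the in-memory `_STORE`, and the example steps are all hypothetical. It pins what gets pulled, and does not by itself guarantee how a given model executes the steps.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Checklist:
    """Hypothetical versioned checklist; frozen so a pulled copy cannot drift."""
    name: str
    version: str
    steps: tuple[str, ...]

    def digest(self) -> str:
        """Stable SHA-256 over the checklist's content."""
        payload = json.dumps(
            {"name": self.name, "version": self.version, "steps": list(self.steps)},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Stand-in for a remote registry, keyed by (name, version).
_STORE = {
    ("create-venv", "1.0.0"): Checklist(
        name="create-venv",
        version="1.0.0",
        steps=(
            "python -m venv .venv",
            "activate the environment",
            "pip install -r requirements.txt",
        ),
    ),
}


def pull(name: str, version: str) -> Checklist:
    """Return the checklist pinned to an exact version."""
    return _STORE[(name, version)]


first = pull("create-venv", "1.0.0")
second = pull("create-venv", "1.0.0")
print(first == second, first.digest() == second.digest())  # True True
```

Pinning by exact version mirrors pulling a container image by tag: a change to the steps would ship as a new version instead of silently altering an existing one.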
The content type will expand. The mission doesn't change.
Who built it
@kitxor. Feedback welcome.