LoopRails helps teams building AI agents answer four practical questions, one action at a time: which actions a human should review, how much control that person should have, what to show them so they can judge well, and how to confirm the oversight actually catches mistakes. It is written for engineers, tech leads, PMs, and designers shipping agentic systems.
RAIL · Reversible · Authorized · Interruptible · Logged
Two questions decide how to oversee any action: how bad it is if the action goes wrong (the consequence), and whether a person can actually catch and stop the mistake in time (the controllability). The hardest corner is high consequence plus low controllability — where asking a human to review just gives you a rubber stamp. Tap a corner to see what to do.
When the stakes are high but a person can't reliably catch the mistake in time, a confirmation prompt only produces a rubber stamp — and a scapegoat when it goes wrong. So don't rely on review. Prevent the bad outcome: make the action reversible, limit how far the damage can spread (the blast radius), run it in a sandbox (an isolated, contained environment), force a deliberate decision, or block the action and hand it to someone who can decide.
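The hard corner can be written down as a small check. This is a minimal sketch, not LoopRails code: the function names and the two boolean inputs are hypothetical, and the options list simply restates the paragraph above. It says nothing about the other three corners.

```python
from typing import Tuple

# Prevention options, restated from the text above (illustrative labels only).
PREVENTION_OPTIONS: Tuple[str, ...] = (
    "make the action reversible",
    "limit the blast radius",
    "run it in a sandbox",
    "force a deliberate decision",
    "block the action and escalate to someone who can decide",
)


def review_is_rubber_stamp(high_consequence: bool, catchable_in_time: bool) -> bool:
    """True in the hard corner: high consequence plus low controllability."""
    return high_consequence and not catchable_in_time


def oversight_plan(high_consequence: bool, catchable_in_time: bool) -> Tuple[str, ...]:
    """Return prevention options for the hard corner, else an empty tuple.

    Assumption: the other corners are handled elsewhere; this sketch
    only covers the case where review alone should not be relied on.
    """
    if review_is_rubber_stamp(high_consequence, catchable_in_time):
        return PREVENTION_OPTIONS
    return ()
```

In this sketch, a high-stakes action that a reviewer could not catch in time returns the full list of prevention options rather than a confirmation prompt.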
Do: Sandbox-First · Capability Lock · make it reversible · escalate
These failures are real and well-documented — in aviation, medicine, finance, and 2026 studies of AI coding agents. Each card shows what goes wrong and the design change that prevents it. Press “See the fix” on any card.
Apply these to each action your agent can take, not to the system as a whole. They double as four questions for standup: Did we grade this action? How is it guarded? What do we show the person reviewing it? Have we checked the oversight actually works?
Rate each action on three things: how far harm could spread, whether it can be undone, and how bad the outcome is if it goes wrong. Together these give a grade from G0 (trivial) to G3 (critical).
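One way to turn those three ratings into a grade is sketched below. The 0 to 3 scales and the scoring rule (take the worse of spread and severity, then bump one level if the action cannot be undone) are assumptions made for illustration. They are not the LoopRails rubric, which this page does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionRating:
    blast_radius: int  # hypothetical scale: 0 = this task only ... 3 = many users or systems
    reversible: bool   # can the action be undone?
    severity: int      # hypothetical scale: 0 = negligible ... 3 = severe


def grade(rating: ActionRating) -> str:
    """Map a rating to G0..G3 under the assumed rule described above."""
    for name, value in (("blast_radius", rating.blast_radius),
                        ("severity", rating.severity)):
        if not 0 <= value <= 3:
            raise ValueError(f"{name} must be between 0 and 3, got {value}")
    level = max(rating.blast_radius, rating.severity)
    # Assumption: irreversibility raises a non-trivial action by one level.
    if not rating.reversible and level > 0:
        level = min(3, level + 1)
    return f"G{level}"


print(grade(ActionRating(blast_radius=0, reversible=True, severity=0)))   # G0
print(grade(ActionRating(blast_radius=1, reversible=False, severity=2)))  # G3
```

Whatever rule a team picks, the point is that it is written down and applied per action, so two people grading the same action get the same answer.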
Match the controls to the grade. Where a person can't realistically catch the mistake in time, prevent the bad outcome instead of asking them to review it.
Design the review moment well: make the action and its consequences clear, show where the request came from, help the person spot problems — and don't spend more of their attention than the action is worth.
Test whether people actually catch errors, not just whether a review step exists. Attack your own oversight the way you'd attack the agent.
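A simple way to measure this is to plant known errors in the review queue and count how many the reviewer stops. The sketch below assumes that setup. The `ReviewOutcome` record and its field names are hypothetical, not a LoopRails data format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ReviewOutcome:
    action_id: str
    seeded_error: bool  # True if a mistake was deliberately planted in this action
    rejected: bool      # True if the reviewer blocked or flagged the action


def catch_rate(outcomes: List[ReviewOutcome]) -> Optional[float]:
    """Fraction of seeded errors the reviewer stopped; None if none were seeded."""
    seeded = [o for o in outcomes if o.seeded_error]
    if not seeded:
        return None
    return sum(o.rejected for o in seeded) / len(seeded)


outcomes = [
    ReviewOutcome("a1", seeded_error=True, rejected=True),
    ReviewOutcome("a2", seeded_error=True, rejected=False),
    ReviewOutcome("a3", seeded_error=True, rejected=False),
    ReviewOutcome("a4", seeded_error=True, rejected=False),
    ReviewOutcome("a5", seeded_error=False, rejected=False),
]
print(catch_rate(outcomes))  # 0.25
```

A review step that exists but catches one seeded error in four is the rubber stamp this page warns about. The number makes that visible.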
Answer four questions about an action your agent is about to take. LoopRails gives you its grade, the right level of control, and a warning if asking a human to review it would just be a rubber stamp.
RAIL is a checklist of four properties any well-governed action should keep: Reversible, Authorized, Interruptible, Logged.
Reversible: You can undo it, or the damage is contained. Save hard stop-and-ask gates for the few actions that truly can't be undone.
Authorized: The agent has only the permissions it needs for this grade, and no more. For high-stakes actions, whoever proposes the action isn't the one who approves it.
Interruptible: Anyone can stop it. An andon cord (a no-blame way to halt the agent) lets a teammate pause it; a kill switch stops everything at once, no questions asked.
Logged: It leaves a record you can inspect later, so you can prove the oversight actually catches mistakes, not just that it exists.
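The four RAIL properties can be checked mechanically before an action ships. The sketch below is one possible shape for that check, not a LoopRails API: the `ActionPlan` fields are hypothetical, and treating G2 and above as "high-stakes" is an assumption.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ActionPlan:
    name: str
    grade: int                       # 0..3, matching G0..G3
    can_undo: bool
    damage_contained: bool
    permissions_granted: FrozenSet[str]
    permissions_needed: FrozenSet[str]
    proposer: str
    approver: Optional[str]
    can_be_paused: bool              # andon cord
    has_kill_switch: bool
    writes_audit_record: bool


def rail_gaps(plan: ActionPlan) -> List[str]:
    """Return the RAIL properties this plan fails; an empty list means none."""
    gaps = []
    if not (plan.can_undo or plan.damage_contained):
        gaps.append("Reversible: cannot be undone and damage is not contained")
    extra = plan.permissions_granted - plan.permissions_needed
    if extra:
        gaps.append(f"Authorized: excess permissions {sorted(extra)}")
    # Assumption: G2 and above count as high-stakes.
    if plan.grade >= 2 and (plan.approver is None or plan.approver == plan.proposer):
        gaps.append("Authorized: proposer and approver must be different")
    if not (plan.can_be_paused and plan.has_kill_switch):
        gaps.append("Interruptible: needs both a pause and a kill switch")
    if not plan.writes_audit_record:
        gaps.append("Logged: leaves no inspectable record")
    return gaps
```

Running this per action, rather than once for the whole system, keeps the checklist tied to the thing it is meant to govern.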
Named designs you can point to in a meeting — most borrowed from industries that have already learned these lessons the hard way — plus the recurring mistakes you want to be able to name and stop.
Three levels of depth, each one click away. Start with the playbook; follow the citations as far down as you want to go.
The hands-on field guide: a cheat sheet, the pattern and anti-pattern decks, questions to ask in standup, and ready-made recipes.
Open the playbook →
The full method (sections 0–10): the consequence-vs-controllability model, the grades, the autonomy ladder, how a single oversight moment is structured, and how to validate it.
Open the framework →
366 annotated sources across 17 topics: aviation, medicine, finance, AI safety, and human-computer interaction. Every claim on this site traces back here.
Open the codex →