"The AI did it" is not an audit answer

Only 1 in 3 enterprises say they're governance-ready for autonomous agents (McKinsey's 2026 AI Trust Report). In Grant Thornton's 2026 AI Impact Survey, 78% of executives lacked strong confidence they could pass an independent AI governance audit within 90 days. Those two numbers get filed under "AI is hard to govern." I think the real problem is more specific, and more fixable: most teams have no record of what their agents were actually allowed to do.

The audit you can't pass yet

Run the scenario. An AI agent shipped a change overnight — a migration, a deploy, a config edit — and something broke. The review starts, and it asks two boring questions: what control was in place at the moment the action ran, and where is the proof?

For most teams running agents in 2026, the honest answer is "a sandbox, RBAC, and a dashboard." None of the three answers the question. A dashboard is a rear-view mirror — it tells you what happened after it happened. RBAC defines who the agent is and what it could theoretically touch; it doesn't decide whether this specific action, right now, should have been permitted, and it doesn't log that decision. A sandbox limits blast radius; it doesn't produce an artifact an auditor would accept as a sign-off.

The numbers bear this out. In Kiteworks' 2026 survey of 225 enterprise leaders, 33% had no evidence-quality audit trail for AI operations, 61% had logs fragmented across systems, and 60% said they couldn't quickly terminate a misbehaving agent.

What the human approval step quietly did

For a decade, CI/CD release discipline rested on a simple property: certain actions didn't happen without a human signing off. We usually credit that step with catching bad changes. But it did a second thing, for free — it left a record. A name, a timestamp, an approval. The control and the evidence were the same event.

When you move an agent into an unattended pipeline, you remove the human from that step. You lose the control, and you lose the paper trail in the same move. The two compensating controls everyone reaches for — sandboxing and RBAC — replace neither half cleanly.

Make the gate decision the audit artifact

The fix is to put a deterministic decision back at the moment of action, and to make that decision log itself.

ThumbGate (github.com/IgorGanapolsky/ThumbGate) runs in the PreToolUse hook, locally on the machine where the agent executes. Before any tool call runs, it evaluates the call against active checks and blocks the dangerous ones — rm -rf outside the workdir, secret/.env exfiltration, force-push to a protected branch, destructive migrations, package-lock resets. Two properties make it an audit answer and not just a safety net.

The decision is deterministic. The runtime gate is a literal pattern match, then AST match, then scoped rule lookup. There is no LLM on the enforcement path. That matters for audit because the same input always produces the same verdict — you can explain why an action was blocked or allowed without hand-waving about model behavior. It also means a prompt injection has nothing to negotiate with. You can't jailbreak a regex.
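
To make the staged evaluation concrete, here is a minimal Python sketch of a gate with no model on the enforcement path. It is not ThumbGate's code: the rule ids, the `evaluate` function, the `Verdict` type, and the protected-branch list are all hypothetical, and the second stage uses `shlex` tokenization as a simple stand-in for a real AST match.

```python
import re
import shlex
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Verdict:
    decision: str            # "block" or "allow"
    rule_id: Optional[str]   # which rule fired, if any
    stage: str               # which stage produced the verdict


# Stage 1: literal patterns (hypothetical rules, for illustration only).
LITERAL_RULES = [
    ("no-env-read", re.compile(r"\bcat\s+\S*\.env\b")),
]

# Assumed branch names; a real setup would load these from policy.
PROTECTED_BRANCHES = {"main", "release"}

# Stage 3: rules scoped by tool name (hypothetical scope and rule).
SCOPED_RULES = {
    "Bash": [
        ("no-lockfile-reset",
         re.compile(r"git\s+checkout\s+--\s+package-lock\.json")),
    ],
}


def structural_match(command: str) -> Optional[str]:
    """Stage 2 stand-in: tokenize the command and inspect its arguments."""
    try:
        argv = shlex.split(command)
    except ValueError:
        # Fail closed: a command that cannot be parsed is blocked.
        return "unparseable-command"
    if argv[:2] == ["git", "push"]:
        forced = any(arg in ("--force", "-f") for arg in argv[2:])
        protected = any(arg in PROTECTED_BRANCHES for arg in argv[2:])
        if forced and protected:
            return "no-force-push-protected"
    return None


def evaluate(tool: str, command: str) -> Verdict:
    """Run the three stages in a fixed order; the first match wins."""
    for rule_id, pattern in LITERAL_RULES:
        if pattern.search(command):
            return Verdict("block", rule_id, "literal")
    rule_id = structural_match(command)
    if rule_id is not None:
        return Verdict("block", rule_id, "structural")
    for rule_id, pattern in SCOPED_RULES.get(tool, []):
        if pattern.search(command):
            return Verdict("block", rule_id, "scoped")
    return Verdict("allow", None, "default")
```

Calling `evaluate("Bash", "git push --force origin main")` twice returns the same `Verdict` both times, which is the property an auditor cares about: the outcome depends only on the input and the rule set, never on how a model happened to read the prompt.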

The decision is the evidence. Every gate decision — block, allow, or reroute — is preserved with the rule version, the timestamp, and the reviewer path. That's the record the human sign-off used to leave, generated automatically on every consequential action instead of reconstructed from logs after an incident. The blocked-force-push line, with its rule version and timestamp, is the thing you hand the auditor.
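
A sketch of what "the decision logs itself" could look like, assuming an append-only JSON Lines log. The field names, the `record_decision` helper, and the example values are my own illustration, not ThumbGate's actual record format; the only fields taken from the description above are the verdict, the rule version, the timestamp, and the reviewer path.

```python
import io
import json
from datetime import datetime, timezone

ALLOWED_VERDICTS = {"block", "allow", "reroute"}


def record_decision(log, *, tool, command, verdict, rule_id,
                    rule_version, reviewer_path, now=None):
    """Append one gate decision to `log` (any writable text stream)."""
    if verdict not in ALLOWED_VERDICTS:
        raise ValueError(f"unknown verdict: {verdict!r}")
    if now is None:
        now = datetime.now(timezone.utc)
    entry = {
        "timestamp": now.isoformat(),
        "tool": tool,
        "command": command,
        "verdict": verdict,
        "rule_id": rule_id,
        "rule_version": rule_version,
        "reviewer_path": reviewer_path,
    }
    # One JSON object per line, written at decision time, never rebuilt later.
    log.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry


# Usage: an in-memory stream stands in for the real log file.
log = io.StringIO()
record_decision(
    log,
    tool="Bash",
    command="git push --force origin main",
    verdict="block",
    rule_id="no-force-push-protected",   # hypothetical rule id
    rule_version="3",                    # hypothetical version label
    reviewer_path="release-owners",      # hypothetical reviewer path
)
```

The design choice worth noting is that allowed calls are written too. An auditor asking what control was in place needs the record for the action that ran, not only for the ones that were stopped.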

For regulated teams

The same engine carries policy templates for regulated work — legal intake (blocking unauthorized practice of law, requiring conflict clearance), financial compliance (gating AI-generated recommendations and disclosures), and healthcare (preventing diagnoses, enforcing HIPAA-compliant routing) — with compliance audit export at the org tier. The point isn't the specific rules; it's that "an AI agent did something consequential" stops being an unrecorded event.

The honest version

ThumbGate doesn't make your agent smarter, and it isn't a governance program by itself — you still need policy, ownership, and review. What it removes is the worst answer in the room: "the AI did it, and we don't have a record of what it was allowed to do." It puts a deterministic approval step back at the moment of action and makes that step leave a trail. It's local-first and MIT-licensed. npx thumbgate init wires it into Claude Code, Cursor, Codex, Gemini CLI, Amp, Cline, or OpenCode in about 30 seconds.

If an auditor asked tomorrow what controlled your AI agents — and where the proof is — what would you show them?

Repo: https://github.com/IgorGanapolsky/ThumbGate

Ken · 11d ago

This is the right audit framing. Logs show activity, but they usually do not prove that a specific action was admissible at the moment it crossed into execution.

The missing artifact is closer to a decision receipt: inputs/facts observed, policy or rule applied, authority boundary, allow/deny result, and the accountable gate for that decision.

Post-hoc reconstruction helps debugging; it is not the same evidence as a pre-action gate.

I

Exactly — and "decision receipt" is the better name. A log answers what happened. A receipt answers was this admissible, by which rule, at the moment it crossed into execution — and only the gate that made the allow/deny call can emit that, because it's the one thing holding the inputs, the active policy, and the verdict at decision time.

Post-hoc reconstruction can't recover that even in principle: the policy may have changed since, and the counterfactual ("checked against rule X v3, denied") was never captured unless something recorded it at the boundary. Your field list is the right schema — inputs observed, rule applied, authority boundary, allow/deny, accountable gate.

That maps almost 1:1 onto what a PreToolUse gate can write per decision (rule + version, timestamp, verdict, reviewer path). The two I'd push to make first-class are the two you named that systems usually drop: the explicit authority boundary and a named accountable gate — not just "denied," but "denied by gate G under boundary B." Those are the fields a reviewer always wishes existed six weeks later.

Are you building toward this? I'd genuinely like to compare notes on the receipt schema.

Ken · 10d ago

Yes, partly in nxusKit SDK, though I think the pattern matters independently of any one implementation. The key constraint, to me, is that the receipt has to be emitted by the boundary component, not reconstructed later by the agent that wanted the tool call. I’d separate observed facts, requested action, policy/rule id + version, authority boundary, gate identity, verdict, reviewer/escalation path, and the repair/appeal packet if denied. The hard part is keeping it small enough to emit on every consequential action while still being useful six weeks later. I’d be glad to compare notes on the schema.
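
For readers following the thread, the field list above could be sketched as a small Python type. This is an illustration only, not the nxusKit SDK schema or ThumbGate's: every field name, the `emit` helper, and the 2048-byte cap are assumptions chosen to show the "small enough to emit every time" constraint.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

MAX_RECEIPT_BYTES = 2048  # arbitrary illustrative cap, not a recommendation


@dataclass(frozen=True)
class DecisionReceipt:
    observed_facts: dict          # what the gate saw at decision time
    requested_action: str         # the tool call the agent asked for
    rule_id: str
    rule_version: str
    authority_boundary: str       # the boundary the action tried to cross
    gate_id: str                  # the accountable gate
    verdict: str                  # "allow" or "deny"
    escalation_path: str          # reviewer or escalation route
    appeal_packet: Optional[dict] = None  # only meaningful when denied

    def __post_init__(self):
        if self.verdict not in ("allow", "deny"):
            raise ValueError(f"unknown verdict: {self.verdict!r}")
        if self.verdict == "allow" and self.appeal_packet is not None:
            raise ValueError("appeal packet only applies to denied actions")


def emit(receipt: DecisionReceipt) -> str:
    """Serialize compactly and refuse receipts too large to emit routinely."""
    payload = json.dumps(asdict(receipt), sort_keys=True,
                         separators=(",", ":"))
    if len(payload.encode("utf-8")) > MAX_RECEIPT_BYTES:
        raise ValueError("receipt exceeds size cap")
    return payload
```

The receipt is built and emitted by the gate itself, so the agent that requested the tool call never gets to write or amend it.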