Governance Theater Won't Survive Agentic AI

Policies and approval workflows can't constrain agents that don't share your view of reality.


By Craig Tracey

An agent at a mid-sized SaaS company gets a ticket: a customer reported a billing discrepancy; route it to the right team. The agent checks the CRM, sees the account is owned by Enterprise Sales East, opens a Jira ticket against the Billing Platform team, and notifies the assigned engineer.

Three things were wrong. The account moved to Strategic Accounts in last quarter's reorg. The Billing Platform team was dissolved six weeks ago. The notified engineer left in February.

The ticket sits for four days. The customer churns.

No policy was violated. Every workflow signed off. The audit log is clean. The agent acted on a model of the company that hadn't been true in months.

This is governance theater. The forms are observed. The checks pass. The agent's mental model of the organization is fiction.

The shape of the gap

Human governance evolved against humans. Approval chains, change advisory boards, segregation of duties. These work because the people inside them share an implicit understanding of how the company actually works. A VP doesn't need a memo telling them which team owns what. They were in the standups. They saw the reorg slide.

Agents have none of that ambient context. They have whatever you put in their tools and their prompts. If you want them to act safely, the constraints have to be readable in the channel they're acting through, and they have to reflect the company as it is right now. Not as it was when someone last touched a wiki page.

Most agent deployments don't do this. They do something that looks similar from the outside but works differently underneath:

A policy doc summarized into a system prompt.
A guardrails layer pattern-matching on outputs.
An approval workflow gating actions defined as high-risk in advance.
A logging system that captures what the agent did, for review later.

Each has a place. None solves the opening scenario. The agent had a wrong model of ownership. No prompt, guardrail, approval, or audit catches that. The wrongness is upstream of every control surface.

Wrap vs. structural

The distinction worth making is between governance that wraps the agent and governance embedded in the substrate the agent reasons over.

Wrap-style governance lives outside the reasoning loop. It operates on inputs and outputs, not on the model of reality the agent is using. It scales with the number of policies you can write. It fails silently when the underlying facts are wrong.

Structural governance lives inside the data the agent queries to make decisions. It constrains the agent by giving it accurate, current relationships to reason over. It scales with the systems of record you connect. It fails loudly, because incorrect relationships produce queryable contradictions.
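To make "queryable contradictions" concrete, here is a minimal sketch of what failing loudly could look like. It assumes a toy in-memory graph; the `Edge` type, the entity names, and `find_contradictions` are hypothetical illustrations, not any particular product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str
    target: str


# Hypothetical snapshot: entities that currently exist in the systems of record.
ACTIVE_ENTITIES = {
    "account:acme",
    "service:billing-api",
    "team:strategic-accounts",
    "ticket:1234",
}

# Hypothetical relationships, two of which point at entities that are gone.
EDGES = [
    Edge("account:acme", "owned_by", "team:strategic-accounts"),
    Edge("service:billing-api", "owned_by", "team:billing-platform"),
    Edge("ticket:1234", "assigned_to", "user:departed-engineer"),
]


def find_contradictions(edges, active_entities):
    """Return every edge whose target is not a currently active entity."""
    return [edge for edge in edges if edge.target not in active_entities]


for edge in find_contradictions(EDGES, ACTIVE_ENTITIES):
    print(f"{edge.source} --{edge.relation}--> {edge.target} does not resolve")
```

A stale fact in a prompt gives you nothing to query. A stale edge in a graph shows up as a row in a result set.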

The opening scenario is a wrap failure. The policy says "route tickets to the owning team." The policy is correct. The fact about who owns what is wrong. Wrap-style governance has nothing to say about facts.

Structural governance handles it differently. Before routing, the agent asks: who owns the service tied to this account, right now? That question hits a live context graph built from the actual systems of record. CRM, service catalog, IdP, deploy history. If the graph says the Billing Platform team was dissolved, the agent can't route to a team that doesn't exist. The constraint isn't a rule. It's the resolvable structure of the graph.
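A rough sketch of that routing check, under stated assumptions: the graph is three small in-memory lookups, and every name here (`ACCOUNT_SERVICE`, `SERVICE_OWNER`, `ACTIVE_TEAMS`, `resolve_owning_team`, `route_ticket`) is hypothetical. Escalating to a human when ownership doesn't resolve is one reasonable choice, not the only one.

```python
# Hypothetical graph state, as it might look after the reorg in the opening scenario.
ACCOUNT_SERVICE = {"acme": "billing-api", "globex": "invoicing"}
SERVICE_OWNER = {"billing-api": "billing-platform", "invoicing": "payments"}
ACTIVE_TEAMS = {"payments", "strategic-accounts"}  # billing-platform was dissolved


class OwnershipUnresolved(Exception):
    """Raised when no currently existing team owns the service."""


def resolve_owning_team(account: str) -> str:
    service = ACCOUNT_SERVICE.get(account)
    if service is None:
        raise OwnershipUnresolved(f"no service linked to account {account!r}")
    team = SERVICE_OWNER.get(service)
    if team is None or team not in ACTIVE_TEAMS:
        raise OwnershipUnresolved(f"owner of {service!r} does not resolve: {team!r}")
    return team


def route_ticket(account: str, summary: str) -> dict:
    """Route only if ownership resolves; otherwise hand off instead of guessing."""
    try:
        team = resolve_owning_team(account)
    except OwnershipUnresolved as exc:
        return {"status": "escalated", "reason": str(exc)}
    return {"status": "routed", "team": team, "summary": summary}


print(route_ticket("acme", "billing discrepancy"))
print(route_ticket("globex", "billing discrepancy"))
```

The first call escalates because the recorded owner no longer exists. The second routes normally. No rule about dissolved teams was written anywhere.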

The math is why this matters at scale. Wrap-style governance is N×M. Every new agent, every new tool, every new policy expands the matrix. Ten agents, twenty tools, a hundred policies, and you're writing rules against scenarios that already happened.

Structural governance is N+M. Every agent and every tool benefit from the same graph. The graph derives from systems of record people already use. The combinatorial explosion never happens.
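The arithmetic can be worked through with the numbers above. This sketch assumes one reading of the shorthand: N is the number of agents, M is the number of tools, wrap-style rules are written per agent-tool pairing, and structural governance needs one graph connection per agent and per tool. The function names are illustrative.

```python
def wrap_surface(agents: int, tools: int) -> int:
    """Agent-tool pairings that wrap-style rules have to cover (N x M)."""
    return agents * tools


def structural_surface(agents: int, tools: int) -> int:
    """Connections to one shared graph: each agent and each tool once (N + M)."""
    return agents + tools


agents, tools = 10, 20
print(wrap_surface(agents, tools))        # 200
print(structural_surface(agents, tools))  # 30

# Adding one more agent costs 20 new pairings under wrap, 1 new connection under structural.
print(wrap_surface(agents + 1, tools) - wrap_surface(agents, tools))              # 20
print(structural_surface(agents + 1, tools) - structural_surface(agents, tools))  # 1
```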

What structural actually requires

A live context graph derived from systems of record. Not a hand-curated catalog. Not a quarterly export. A graph that updates from the same events updating Jira, GitHub, Salesforce, the IdP, the deploy pipeline. Ownership, dependencies, scopes, blast radius are queryable elements, not prose buried in policy docs.
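As an illustration of "updates from the same events," here is a small sketch of a graph that changes only by applying events. The event types (`team.created`, `team.dissolved`, `ownership.changed`) and the `ContextGraph` class are assumptions made up for this example; real systems of record emit their own event shapes.

```python
from dataclasses import dataclass, field


@dataclass
class ContextGraph:
    active_teams: set = field(default_factory=set)
    owner_of: dict = field(default_factory=dict)  # entity -> owning team

    def apply(self, event: dict) -> None:
        """Update the graph from one system-of-record event."""
        kind = event["type"]
        if kind == "team.created":
            self.active_teams.add(event["team"])
        elif kind == "team.dissolved":
            self.active_teams.discard(event["team"])
            # Drop ownership edges that pointed at the dissolved team.
            self.owner_of = {
                entity: team
                for entity, team in self.owner_of.items()
                if team != event["team"]
            }
        elif kind == "ownership.changed":
            if event["team"] not in self.active_teams:
                raise ValueError(f"unknown team: {event['team']}")
            self.owner_of[event["entity"]] = event["team"]
        else:
            raise ValueError(f"unknown event type: {kind}")


graph = ContextGraph()
for event in [
    {"type": "team.created", "team": "billing-platform"},
    {"type": "team.created", "team": "strategic-accounts"},
    {"type": "ownership.changed", "entity": "service:billing-api", "team": "billing-platform"},
    {"type": "ownership.changed", "entity": "account:acme", "team": "strategic-accounts"},
    {"type": "team.dissolved", "team": "billing-platform"},
]:
    graph.apply(event)

print(graph.owner_of)  # {'account:acme': 'strategic-accounts'}
```

Nobody edited a catalog entry. The dissolution event removed the ownership edge as a side effect of being recorded.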

A query interface agents can hit at decision time. The Model Context Protocol (MCP) is the obvious shape in 2026. The agent doesn't carry a stale snapshot in its context window. It asks, gets a current answer, acts on it. Same protocol it uses for everything else.
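For a sense of what a decision-time query might look like on the wire, the sketch below builds an MCP tool call as a JSON-RPC 2.0 request using the protocol's `tools/call` method. The tool name `resolve_owner` and its `account_id` argument are hypothetical; an actual server defines its own tools and schemas, and a real client would normally use an MCP SDK rather than hand-building messages.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def build_owner_query(account_id: str) -> str:
    """Serialize a tools/call request asking who owns an account right now."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {
            "name": "resolve_owner",  # hypothetical tool name
            "arguments": {"account_id": account_id},  # hypothetical argument
        },
    }
    return json.dumps(request)


print(build_owner_query("acme"))
```

The point of the shape is that the answer is fetched when the decision is made, not copied into the prompt when the agent was configured.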

Constraints expressed as graph properties, not prose. "Engineers can act on services they own" lives in a doc and gets summarized into a prompt. "Engineer X owns service Y" lives in the graph. The first is advisory. The second is structural.
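The difference can be sketched in a few lines. Here the ownership fact is an edge in a set, and the action checks for the edge before it runs. `OWNS`, `can_act`, and `restart_service` are hypothetical names for illustration only.

```python
# Hypothetical ownership edges: (engineer, service).
OWNS = {("engineer-x", "service-y")}


def can_act(engineer: str, service: str) -> bool:
    """True only if an 'owns' edge exists between engineer and service."""
    return (engineer, service) in OWNS


def restart_service(engineer: str, service: str) -> str:
    if not can_act(engineer, service):
        raise PermissionError(f"{engineer} has no 'owns' edge to {service}")
    return f"{service} restarted by {engineer}"


print(restart_service("engineer-x", "service-y"))
```

There is no sentence for the agent to misread. If the edge is absent, the action does not run.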

Lineage as a property of the system, not an afterthought. Every agent action leaves a path through the graph: which entity it acted on, which relationships it traversed, what scope it was operating under. Audit isn't a separate pipeline. It falls out of how the agent reasoned in the first place.
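One way to picture lineage falling out of reasoning: the traversal itself records each hop as it resolves it. This is a sketch over a toy edge table; `Lineage`, `EDGES`, `traverse`, and the scope string are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical edges: (source, relation) -> target.
EDGES = {
    ("account:acme", "uses"): "service:invoicing",
    ("service:invoicing", "owned_by"): "team:payments",
}


@dataclass
class Lineage:
    scope: str
    steps: list = field(default_factory=list)

    def record(self, source: str, relation: str, target: str) -> None:
        self.steps.append((source, relation, target))


def traverse(start: str, relations: list, lineage: Lineage) -> str:
    """Follow relations from start, recording each hop as it is resolved."""
    node = start
    for relation in relations:
        target = EDGES[(node, relation)]  # KeyError if the hop does not resolve
        lineage.record(node, relation, target)
        node = target
    return node


lineage = Lineage(scope="ticket-routing")
owner = traverse("account:acme", ["uses", "owned_by"], lineage)
print(owner)
for step in lineage.steps:
    print(step)
```

The recorded steps are the audit trail. Nothing has to be reconstructed from logs afterward.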

This is the substrate question, and it's the one that determines whether your agent program scales or stalls. Catalogs decay because they're maintained by hand. RAG retrieves text without understanding the relationships between the things the text describes. Hand-written definitions decay. Relationships derived from live systems of record don't.

What CIOs should be asking

When an agent makes a decision, can you tell me which entities it queried, which relationships it traversed, and what the state of those relationships was at the moment of the decision? If the answer requires reconstructing from logs after the fact, you're doing theater.

If the org reorgs tomorrow, how long until your agents are acting on the new structure? If it takes longer than the change takes to propagate through your HRIS, you have context drift no policy will fix.

If the answer is "we have a policy," you're hoping. If the answer is "the relationship doesn't resolve in the graph," you're enforcing.

How many of your guardrails restrict outputs versus ensure correct inputs? Output restriction is necessary but not sufficient. Most realistic agent failures come from bad inputs.

Who maintains the source of truth your agents reason over? If the answer is a platform team or a data team whose job is keeping the graph current, you've recreated the catalog problem. The graph has to derive from systems people already use, or it rots.

What this means

Teams that scale agents through the rest of 2026 will treat governance as a property of their context layer, not a wrapper around their agents. The rest will spend the year writing policies their agents can't see.

Regulatory pressure is real. EU AI Act provisions for high-risk systems are landing this year. But betting your agent strategy on compliance as the forcing function gets the ordering wrong. The forcing function is operational. Agents acting on stale context cause incidents. Incidents cost money and trust. Teams that get the substrate right move faster, with fewer incidents, and arrive at compliance as a side effect.

This is what we're building at SixDegree. A live context graph derived from systems of record, queryable through MCP, built so agents and the humans supervising them reason over the same current truth.

If you're a CIO or CISO trying to scale agents past pilot without owning the next incident, we'd love to talk.