Levie Nailed the Job Description. He Left Out the Hard Part.
Aaron Levie nailed the job description. But the person he wants to hire will spend most of their time doing plumbing nobody planned for.

Aaron Levie, CEO of Box, posted a thread this week laying out what he thinks will become a standard enterprise role: the agent deployer and manager.
His frame is clean. Every team will need someone whose job is to find the highest-leverage workflows for agents, design the future state, wire up the systems, and run the agents on an ongoing basis with KPIs and evals. He's right that this role is coming. He's right that it won't be a centralized role. He's right that it's a great landing spot for technical people who are leaning into AI.
But the JD has a quiet dependency that almost no enterprise has built yet, and until they do, this person is going to spend 80% of their time as a plumber, not as a workflow designer.
Read the JD carefully
Levie describes the "gnarly part of the work":
Mapping structured and unstructured data flows, figuring out the ideal workflow, getting the agent the context it needs to do the work properly, figuring out where the human interfaces with the agent and at what steps, manages evals and reviews after any major model or data change, and runs and manages the agents on an ongoing basis tracking KPIs.
Every one of those is a context problem.
Mapping data flows means reconciling identifiers across systems that don't share schemas: your CRM's "account," your billing system's "customer," your support tool's "organization," your warehouse's "tenant." Today, this is someone opening five tabs and writing a glossary in Notion that goes stale by Friday.
Getting the agent the context it needs is not a prompt-engineering task. It's a question of whether the agent has a faithful, current view of the org: which workflows touch which systems, which fields are authoritative, who owns the process, what state a given record is actually in right now, not at the last ETL run. Most enterprises do not have this. They have dashboards. Dashboards are not context.
Evals after a model or data change require a ground truth. Ground truth for what? Not for the model. For the org state the model was reasoning over. If your agent's job is "process leads with extra customer signal," the eval has to know what "the customer" means consistently across Salesforce, Gainsight, and the warehouse. If those identifier spaces drift, your evals are measuring noise as often as signal, and you won't be able to tell which.
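To make that concrete, here's a hypothetical gate you'd put in front of any eval: confirm that the entity the agent reasoned about resolves to the same canonical record in every source before you score the output. It reuses the resolve() lookup from the sketch above; the shapes are illustrative, not anyone's actual eval harness:

```python
def consistent_entity(canonical_id: str, sources: dict[str, str]) -> bool:
    """True only if every source system's local id resolves back to the
    same canonical entity (resolve() is the crosswalk lookup above)."""
    return all(resolve(system, local_id) == canonical_id
               for system, local_id in sources.items())

def eval_case(canonical_id: str, sources: dict[str, str],
              agent_output: str, expected: str) -> dict:
    # Gate on identity first: if the identifier spaces have drifted,
    # the score would measure the drift, not the agent.
    if not consistent_entity(canonical_id, sources):
        return {"status": "skipped", "reason": "entity resolution drift"}
    return {"status": "scored", "pass": agent_output.strip() == expected.strip()}
```

Skipped cases are a feature here: they tell you the problem is in the data, not the model.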
Running agents with KPIs requires observability not just into the agent's outputs, but into the upstream systems whose state the agent is reading from and writing to. When something breaks, the question is rarely "did the model regress?" It's "did the underlying system change shape, and did anyone notice?"
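One cheap way to catch "did the underlying system change shape" before it surfaces as a mysterious KPI dip is to fingerprint the schema of every upstream source the agent reads. A sketch, with hypothetical source names and fingerprints:

```python
import hashlib
import json

def schema_fingerprint(columns: dict[str, str]) -> str:
    """Hash a source's column-name -> type mapping so shape changes are visible."""
    canonical = json.dumps(sorted(columns.items()))
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

# Fingerprints recorded when the workflow was last validated (hypothetical values).
EXPECTED = {
    "salesforce.Account": "a1b2c3d4e5f6",
    "billing.customers":  "0f9e8d7c6b5a",
}

def check_upstream(current_schemas: dict[str, dict[str, str]]) -> list[str]:
    """Return the sources whose shape no longer matches what the agent was built against."""
    drifted = []
    for source, columns in current_schemas.items():
        if schema_fingerprint(columns) != EXPECTED.get(source):
            drifted.append(source)
    return drifted
```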
The agent deployer's actual day
Picture this person on day one. They've identified a high-leverage workflow: contract intake. Today it takes legal four days; with agents, it could take four hours. The deployer needs to wire up the contract management system, the CRM, the e-signature platform, the data room, and finance.
Each of those systems has an MCP server, or it doesn't. If it does, the server exposes whatever the vendor decided to expose, with whatever identifier semantics the vendor chose, with no awareness of how that system's records relate to records in any of the other four. If it doesn't, the deployer is now writing one.
Either way, the deployer is now in the integration business. Not the workflow business.
The JD describes someone who is "good at mapping the process and understanding where the value could be unlocked." That's the job we want them doing. The job they'll actually do is reconciling identifiers, debugging stale lookups, and rebuilding their mental model every time a vendor ships a schema change.
This isn't hypothetical. It's already happening at every company that has tried to deploy more than two agents in production. The first agent is fine. The second agent shares 60% of its context surface with the first. By the third agent, the deployer realizes they needed a shared substrate from the beginning, and now they're refactoring three integrations while the business keeps asking why the rollout is slow.
What the role actually needs underneath it
The agent deployer Levie describes is real, and the value is real. But the role only scales if the company has solved the layer underneath it: a live, structured representation of how the business actually runs (its systems, its identifiers, its workflows, its ownership) that agents can query and that stays current as the org changes.
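In its smallest form, that representation is just structured answers to "which system owns this field" and "who owns this workflow." A sketch with hypothetical names; the real thing gets populated and refreshed from the systems themselves, not hand-maintained:

```python
from dataclasses import dataclass, field

@dataclass
class SystemRecord:
    name: str                      # e.g. "salesforce"
    authoritative_for: list[str]   # fields this system is the source of truth for

@dataclass
class Workflow:
    name: str                      # e.g. "contract-intake"
    owner: str                     # the team accountable for the process
    systems: list[str]             # which systems the workflow touches
    human_checkpoints: list[str]   # steps where a person reviews or approves

@dataclass
class ContextLayer:
    systems: dict[str, SystemRecord] = field(default_factory=dict)
    workflows: dict[str, Workflow] = field(default_factory=dict)

    def who_owns(self, workflow: str) -> str:
        return self.workflows[workflow].owner

    def source_of_truth(self, field_name: str) -> str | None:
        for system in self.systems.values():
            if field_name in system.authoritative_for:
                return system.name
        return None
```

The queries are trivial. The value is that the answers stay true as the org changes, which is exactly what a static catalog can't promise.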
Without that layer, every agent deployment is a snowflake. With it, the deployer's job collapses back to what Levie wants it to be: workflow design, human-in-the-loop checkpoints, evals, and KPIs.
This is the part of the agent transformation story that hasn't fully landed yet. Every enterprise leader I talk to is excited about agents. Most have not internalized that agents are only as good as the context substrate beneath them, and that context is not a side quest. It's the load-bearing layer.
We've written about why static alternatives like service catalogs don't cut it, and the a16z piece on data agents needing a context layer made the same case from the investor side. The pattern is consistent.
Levie just wrote the JD for the role that sits on top of that layer. The companies that get the agent deployer doing the job he describes will be the ones that solved the substrate first.
The context layer isn't a follow-on problem. It's the prerequisite.
Planning to hire for this role?
We're working with a small group of enterprises to build the context layer before they do. If you're serious about agent deployment at scale, apply for our design partner program.