AI Agents Need Workforce-Style Management, Not Tool-Style Neglect

The Management Vacuum

Three months ago, your team deployed an AI agent. The demo was clean. The pilot results were promising. Someone filed a ticket to revisit the setup "after things settle down," and that ticket is still open.

The agent is running. You aren't managing it.

That is a management failure. It happens when an organization treats AI deployment as a one-time event rather than the start of an ongoing management responsibility. Agents enter production without the infrastructure that makes any delegated worker reliable: without owners, without performance standards, without review schedules, without a defined path out when the work changes.

The result is drift. Quiet, gradual, and invisible until something goes wrong.

Deloitte's 2026 State of AI report found that only 21% of companies planning to deploy agentic AI within two years have mature agent governance in place. The gap between deploying an agent and managing one is where most enterprise risk accumulates. When that gap surfaces, it is rarely one incident. It is usually an audit question nobody can answer, accountability that was never assigned, or output that has been wrong for weeks before anyone noticed.

Ninety Days of Drift

Drift isn't dramatic. It doesn't announce itself.

It looks like an agent whose prompt was calibrated in February against a policy document updated in March. It looks like output that's mostly correct but wrong in edge cases nobody has sampled. It looks like a data access permission scoped broadly during the pilot and never tightened when the pilot became permanent.

By day ninety, the agent has been producing work your team trusts because it's always been fine. But nobody knows if it's still fine. Nobody is looking.

The drift between what an agent was supposed to do and what it's actually doing won't be obvious from a green status light. It shows up in a customer complaint, a compliance flag, or a manager noticing that the agent's outputs don't match current policy. At that point, the question "how long has this been happening?" is unanswerable.

Most functions have deployed and hoped, not because anyone is reckless, but because nobody handed them a management playbook when the agent went live.

Six Disciplines Every Agent in Your Function Needs

Managing an AI agent draws on the same disciplines that make any delegated work reliable. The translation isn't complicated, but it has to be intentional.

Identity and role. Every agent in production should have a named owner and a written purpose statement. The prompt is the onboarding artifact. It should specify what the agent is for, what it's not for, what inputs it accepts, and what outputs it produces. Ownership is not a shared responsibility or a committee assignment. It is a named person with the authority to change, pause, or retire the agent, and the accountability for what the agent produces. If you can't point to a document and a name, the agent doesn't have an identity. It has an implementation. If ownership is contested between the build team, the budget owner, and the accountable manager, that contest is the first management decision to settle.

Permissions and boundaries. What data sources can the agent read? What can it write to? What actions require a human approval gate before they execute? These questions belong to the deployment conversation, not the incident postmortem. The management question that follows access scoping is simpler: who reviews those permissions, and on what schedule?
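
One way to make those questions concrete is to write the answers down as data and check each requested action against them. The sketch below is illustrative only: `AgentPermissions`, `check_action`, and the example data sources are hypothetical names, not part of any particular agent platform.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AgentPermissions:
    """Hypothetical permission record for a single agent."""
    readable: frozenset        # data sources the agent may read
    writable: frozenset        # targets the agent may write to
    needs_approval: frozenset  # write targets gated on human sign-off
    reviewer: str              # who reviews these permissions
    next_review: date          # when the next permissions review is due


def check_action(perms: AgentPermissions, action: str, target: str) -> str:
    """Return 'allow', 'needs_approval', or 'deny' for a requested action."""
    if action == "read":
        return "allow" if target in perms.readable else "deny"
    if action == "write":
        if target not in perms.writable:
            return "deny"
        return "needs_approval" if target in perms.needs_approval else "allow"
    # Anything that was never written down is denied by default.
    return "deny"
```

The default-deny branch is the design choice that matters here: access granted "for the pilot" has to be listed explicitly, with a reviewer and a date attached, or it does not exist.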

Performance and quality. What does good output look like for this agent? If you don't have a written answer, you can't sample against it. Set accuracy thresholds, tone expectations, and scope limits. Then sample actual output each week. Not an automated alert: actual output, reviewed by a person who knows what "correct" looks like for this task.
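
A minimal sketch of that weekly routine, assuming the agent's outputs for the week can be collected into a list and that a human reviewer marks each sampled item correct or incorrect. The function names and the threshold parameter are assumptions for illustration.

```python
import random
from typing import List, Optional, Sequence


def weekly_sample(outputs: Sequence[str], k: int,
                  seed: Optional[int] = None) -> List[str]:
    """Draw up to k outputs uniformly at random for a human reviewer."""
    rng = random.Random(seed)
    return rng.sample(list(outputs), min(k, len(outputs)))


def meets_threshold(verdicts: Sequence[bool], threshold: float) -> bool:
    """verdicts holds the reviewer's correct/incorrect call per sampled output."""
    if not verdicts:
        # A week with nothing reviewed counts as a failed check, not a pass.
        return False
    return sum(verdicts) / len(verdicts) >= threshold
```

Treating an empty review as a failure is deliberate in this sketch: a week in which nobody looked should not read as a week in which everything was fine.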

Once an agent produces consistent, stable output over weeks, ask whether the task still requires probabilistic reasoning every time it runs. Converting stable, repetitive work to deterministic controls (rules, versioned prompts, structured code) removes interpretation drift entirely. The conversion effort ranges from a brief workflow configuration to a focused engineering task, depending on the system. The goal is not to manage drift indefinitely. It is to eliminate the drift you can eliminate.
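
One possible shape for that conversion, sketched under the assumption that incoming work arrives with a category label: explicit rules answer first, and the agent only sees what no rule covers. The `RULES` table, the template names, and the `call_agent` callback are all hypothetical.

```python
from typing import Callable, Dict, Tuple

# Hypothetical rules: stable ticket categories mapped to a versioned, approved reply.
RULES: Dict[str, str] = {
    "password_reset": "reply_template_v3",
    "shipping_status": "reply_template_v7",
}


def route(category: str, call_agent: Callable[[str], str]) -> Tuple[str, str]:
    """Return (handler, result). Rules win; the agent handles the remainder."""
    template = RULES.get(category)
    if template is not None:
        return ("rule", template)
    return ("agent", call_agent(category))
```

Every category that moves into the rules table is one less place where output can drift between reviews.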

Review cadence. Schedule a recurring review of each agent with someone who has the authority to act on what they find. This isn't a dashboard check. It's a structured conversation: is the agent still doing what it was designed to do, has the work changed around it, and does anything need to be adjusted?

Escalation and exceptions. What happens when the agent encounters something it shouldn't handle? Who gets notified, and how? What's the rollback procedure if the agent produces a week of bad output before anyone notices? These paths need to be defined before the incident, not improvised during it.

Lifecycle and retirement. Agents accumulate scope. Agents outlive the problems they were designed to solve. Build a scheduled review into every deployment: when does this agent get audited for scope drift, and when does it get retired? "When no longer needed" is not a retirement criterion. A date on a calendar is.

These six disciplines aren't new. They're the same management logic that makes human delegated work reliable: job description, access scoping, performance standards, regular review, exception paths, offboarding. The translation to AI agents is direct. Operating Model element 8 in the Anchor AI Bearing Framework formalizes them as a deployment requirement, not an afterthought.
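
The six disciplines can be held in a single record per agent. The sketch below is one hypothetical way to do that, not something prescribed by the framework: the `AgentRecord` fields and the `gaps` function are illustrative names, and a real inventory would likely carry more detail.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class AgentRecord:
    """Hypothetical one-record-per-agent view of the six disciplines."""
    name: str
    owner: Optional[str] = None                  # identity and role
    purpose: Optional[str] = None                # identity and role
    permissions_reviewed: Optional[date] = None  # permissions and boundaries
    quality_standard: Optional[str] = None       # performance and quality
    next_review: Optional[date] = None           # review cadence
    escalation_contact: Optional[str] = None     # escalation and exceptions
    retire_on: Optional[date] = None             # lifecycle and retirement


def gaps(record: AgentRecord, today: date) -> List[str]:
    """List the disciplines this record cannot currently answer for."""
    found = []
    if not record.owner or not record.purpose:
        found.append("identity and role")
    if record.permissions_reviewed is None:
        found.append("permissions and boundaries")
    if not record.quality_standard:
        found.append("performance and quality")
    if record.next_review is None or record.next_review < today:
        found.append("review cadence")  # missing or already overdue
    if not record.escalation_contact:
        found.append("escalation and exceptions")
    if record.retire_on is None:
        found.append("lifecycle and retirement")
    return found
```

An agent with a named owner and a purpose statement but nothing else would come back with five gaps, which is a reasonable description of most agents three months after launch.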

You won't close all six gaps this week. Pick one agent and the two or three gaps that pose the most immediate risk.

The VP Who Stopped Checking

Here's a composite of a pattern that shows up regularly at the director and VP level.

A VP of Customer Operations launched an AI agent to draft first responses to Tier 1 support tickets. The demo was impressive: faster resolution times, consistent tone, tickets handled without routing delays. The agent went live. Three months passed.

Today the VP doesn't know whether the agent's answer templates have drifted from the customer service policy updated six weeks ago. Nobody has been assigned to review a sample of agent-drafted responses each week. How the agent behaves when it meets a billing dispute it isn't trained to handle is assumed, not tested. The agent still holds the customer data access granted during the pilot, and the VP isn't sure whether narrowing it would break an integration the sales team started relying on last month. And if the agent's response to a product liability question creates a customer dispute, nobody can say who is accountable.

The technology worked. The management didn't.

This isn't hypothetical carelessness. It's what happens when deployment is treated as a finish line rather than a starting gate.

This Is a Management Problem

The instinct, especially for leaders who don't own the technical side, is to frame this as a tool problem: buy a monitoring dashboard, set up alerting, add a governance platform. Those tools are useful. A dashboard has a role once someone accountable owns the question. But a dashboard cannot substitute for an owner. It tells you something went wrong after it went wrong. Accountable management is the structure that catches drift before the alert fires.

You already know how to do this. If you hired a new analyst, gave them broad data access, a complex decision-making task, and no performance criteria, you'd expect problems by month three. The analyst would not be the issue. The missing management structure would be. Unmanaged delegated work drifts because delegation without structure drifts.

The first management gap most functions haven't closed is the distinction between owner and operator. Operators keep the system running. Owners are accountable for what the system produces, responsible for reviewing it, and empowered to change or retire it when something is wrong. That distinction is a management structure question, not a technical one. The companies getting governance right are the ones that settle it before the incident, not during it.

The Analogy That Actually Holds

The workforce analogy is useful precisely because it doesn't claim agents are people. It claims that the management disciplines that make delegated work reliable also make agents reliable. But the analogy doesn't have to stop with people. It also holds for the software your organization already trusts in production.

When an application gets deployed, mature teams don't send it into the world and forget it exists. They monitor performance. They review logs. They run CI/CD checks before changes move forward. They watch for security signals, dependency issues, failed jobs, policy violations, and degradation against expected behavior. Nobody says, "the application passed the demo, so we can stop checking it." Production is where the operational responsibility begins.

AI agents deserve the same treatment. They are software participants doing delegated work inside a workflow. A prompt is closer to a job description and a deployment contract than a one-time configuration. A permissions review maps to scoped data access and defined approval gates. A performance standard maps to sampled output quality, not just a status light. A weekly check-in maps to a scheduled operational review. An escalation path maps to exception handling, incident response, and rollback. Offboarding maps to retirement criteria and decommissioning.

That framing avoids the trap of treating agents as people while still taking their work seriously. You don't have to anthropomorphize an AI agent to manage it. You only have to acknowledge what your production systems already taught you: anything doing important work needs ownership, observability, quality checks, change discipline, security review, and a path out when it stops being useful.

Mainstream governance guidance is already treating agents this way. Microsoft's agent administration documentation treats agents as assets requiring controls for "visibility, access, distribution, and retirement," including permission review, blocking unsafe or noncompliant agents, reassigning ownerless agents, and deleting agents from inventory. The Cloud Security Alliance's Agent Identity Governance Framework applies the same logic in identity terms: agents need lifecycle management, credential and privilege controls, monitoring obligations, and decommissioning processes. These are not new management concepts. They are the same controls mature organizations already apply to deployed software.

What to Do This Week

Pick one agent your function has in production. Choose the one that's been running longest without a formal review, not the most complex one.

Ask six questions about it:

Who owns this agent, with accountability for what it produces and authority to change or retire it?
What permissions does it hold, and are those still appropriate for its current scope?
What does good output look like, and who reviewed actual output last week?
When was the last scheduled management review, and when is the next one?
What happens when this agent encounters something it shouldn't handle?
When does this agent get retired, and what's the criterion?

If you can't answer these from memory, you have a management gap, not a technology gap.

The AI Agent Management Checklist at anchor-enterprise.com/ai-agent-management-checklist.html operationalizes these six areas into 18 review questions. Run it on this agent.

The Work That Actually Matters

Your function's first fix is not a Center of Excellence. A centralized team can set standards, but it cannot own what every agent in every function produces. That ownership has to live where the work happens.

What your function needs is readiness at the level where the agents run: an owner for every agent, written performance criteria, a review schedule with decision-making authority attached to it, and a retirement plan that isn't "when we get around to it."

If an agent is doing work for your function, someone owns its output this week. If you can't name that person, that is the first gap to close, and the rest follows from there.

Download the AI Agent Management Checklist and run it on one agent this week. When you're ready to work through your full function's agent inventory, book a 45-minute function-level readiness call. One function, one structured review, grounded in what's actually running.

Sources: Deloitte 2026 State of AI (agent governance maturity, agentic AI deployment plans). Microsoft Learn: Governance and Lifecycle actions for agents (learn.microsoft.com/en-us/microsoft-365/admin/manage/agent-actions). Microsoft Learn: Manage agents in Microsoft 365 admin center (learn.microsoft.com/en-us/microsoft-365/admin/manage/manage-copilot-agents-integrated-apps). Cloud Security Alliance Agent Identity Governance Framework (labs.cloudsecurityalliance.org/research/agentic-identity-governance-framework-v1/). Anchor AI Bearing Framework, Question 5 and Operating Model element 8.