Agency-Control Trade-off

The agency-control trade-off is the fundamental design tension in agentic AI systems: every increment of autonomy granted to an agent is a corresponding decrement of human oversight and control.

Articulated by Aishwarya Naresh Reganti and Kiriti Badam as one of two foundational differences between AI products and traditional software.

The principle

In traditional software, the human is always in control: every outcome is the result of a deliberate action (clicking a button, submitting a form). The system’s decision engine is deterministic and fully specified by the product.

In agentic AI systems, the agent can take actions on behalf of the user. As agency increases:

The agent makes more decisions without human review.
Errors compound across a longer action sequence before a human can intervene.
The blast radius of a failure grows.

The trade-off is therefore not a binary choice (agent vs. no agent) but a continuous dial. Where you set that dial should be a deliberate product decision grounded in demonstrated reliability, not aspiration.

The graduation ladder

The recommended approach is to start with maximum human control and low agency, then graduate to higher autonomy as the system demonstrates reliability:

Stage	Agency	Control	Failure mode caught by
V1 — suggest	Low	High	Human reviewer before any action
V2 — draft	Medium	Medium	Human editor; edits logged as signal
V3 — act	High	Low	Post-hoc monitoring; rollback mechanisms

Each stage provides data for the next. Logging human edits at V2 provides near-free training signal. Attempting V3 without V1 and V2 means calibrating against an unknown distribution with no safety net.

Why teams skip the ladder

Competitive pressure (“our competitors are all building agents”).
Underestimating the long tail of failure modes — which are only visible in production.
Misunderstanding what “autonomous” means in enterprise contexts (messy taxonomies, undocumented rules, legacy data).

Aishwarya’s account: a customer support agent they built was shut down after launch because the team was doing so many hot fixes there was no way to enumerate all emerging problems. This is the canonical failure mode of starting at V3.

How much agency is right?

Agency level should be determined by:

Risk per action — refunding a purchase is high risk; routing a support ticket is low risk. Start with low-risk, high-volume actions.
Reliability evidence — how stable is the behaviour distribution from calibration cycles? When new cycles produce diminishing new surprises, consider graduating.
Rollback capability — can bad actions be undone? If not, require human approval.

There is no universal right answer. The decision should be made by PMs, engineers, and subject matter experts together, looking at real usage data.

Where mainstream views differ