From tightly supervised agents to those operating with near-complete independence, the degree of autonomy afforded to AI agents is a strategic decision that will shape reliability, efficiency, and risk.
Three Modes of Oversight
In this week's all-company AI briefing, we discussed three potential models we could adopt:
- Human in the loop: AI acts step by step, with constant human intervention. Safe, but limited in scale.
- Human in the middle: AI executes larger tasks with human review at the end. Efficient, but dependent on clear guidance.
- Autonomous agents: AI operates with broad freedom, demanding well-designed rules and contexts up front.
Moving along this spectrum, we have found, is less about writing clever prompts and more about constructing environments (contexts) that make autonomy safe and effective. Positioning ourselves in the middle has worked well so far: the more time we invest up front in context engineering, the more efficient our review becomes.
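To make the calibration concrete, here is a minimal sketch of how the three modes above might be expressed as an approval policy. It is purely illustrative; the names (OversightMode, requires_approval) are hypothetical and do not describe any specific tool we use.

```python
# Illustrative only: hypothetical names, not any specific framework.
from enum import Enum


class OversightMode(Enum):
    HUMAN_IN_THE_LOOP = "human_in_the_loop"      # approve every step
    HUMAN_IN_THE_MIDDLE = "human_in_the_middle"  # review the finished output
    AUTONOMOUS = "autonomous"                    # constraints set up front, no per-step gate


def requires_approval(mode: OversightMode, step_done: bool, task_done: bool) -> bool:
    """Decide whether a human must sign off at this point in the workflow."""
    if mode is OversightMode.HUMAN_IN_THE_LOOP:
        return step_done      # gate after every step
    if mode is OversightMode.HUMAN_IN_THE_MIDDLE:
        return task_done      # gate once, when the task is finished
    return False              # autonomous: safety lives in the rules and context, not in gates
```

The point of writing it down this way is that the oversight decision becomes an explicit, reviewable setting rather than an ad-hoc habit.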
Reusability and Measurement
As we test and learn, we're building a reusable prompt library, embedding systematic evaluation into our process, and creating shared oversight frameworks that codify what generates results, turning AI from a novelty into a repeatable system. This makes it possible to measure ROI, improve accuracy, and scale adoption responsibly.
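As a sketch of what a reusable entry might look like (the structure, names, and checks below are hypothetical, not our actual library), each prompt can travel with lightweight, automatable checks so that evaluation happens every time the prompt is reused:

```python
# Illustrative only: hypothetical structure for a prompt-library entry with built-in checks.
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PromptTemplate:
    name: str
    template: str                                    # e.g. "Summarise {document} for {audience}"
    checks: list[Callable[[str], bool]] = field(default_factory=list)

    def render(self, **kwargs: str) -> str:
        return self.template.format(**kwargs)

    def evaluate(self, output: str) -> float:
        """Return the fraction of checks a model output passes."""
        if not self.checks:
            return 1.0
        return sum(check(output) for check in self.checks) / len(self.checks)


# Example entry: an executive-summary prompt with two simple, automatable checks.
exec_summary = PromptTemplate(
    name="exec-summary",
    template="Summarise the following document for {audience}:\n\n{document}",
    checks=[
        lambda out: len(out.split()) <= 200,          # stays within the agreed length
        lambda out: "recommendation" in out.lower(),  # surfaces a recommendation
    ],
)
```

Pairing prompts with checks like this is one way the "measure ROI and improve accuracy" goal becomes routine rather than a one-off audit.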
Strategic Implication
The spectrum of autonomy isn’t just a technical framework; it’s a leadership one. If we can calibrate the right level of agency, design meaningful contexts, and institutionalize reuse, we'll stand a good chance of harnessing AI as a true teammate.