Many teams believe they have mitigated AI risk because they’ve written clear policies.
The problem is that policies are static, while context is dynamic.
An AI assistant evaluates instructions at runtime, influenced by user input, retrieved documents, and tool responses. Each of these can introduce new constraints or new intent - sometimes unintentionally.
A policy that works at the prompt layer may not hold once retrieval or tools are introduced. This isn’t a policy failure. It’s a mismatch between where rules are defined and where behavior emerges.
As assistants grow more capable, intent becomes distributed. No single prompt or policy captures the full decision surface.
This is why “we told it not to” is not a control. It’s a hope.
Effective control requires understanding how intent propagates across the entire system - not just where it was originally declared.