The Real AI Test: How to Tell a Platform from a Chat Overlay
Most vendors now claim to have “AI platforms,” but many are just chat interfaces placed on top of disconnected systems. The difference is more than marketing. Without the right controls, these overlays can leak data, bypass policies, and mislead buyers into thinking they are getting enterprise-grade AI governance when they are not.
The Core Question
When evaluating any vendor, start with one question:
“Do you enforce tenant-scoped retrieval with a policy check before every tool or dataset call?”
That single test exposes whether the platform truly governs data and actions or merely wraps them in a conversational UI.
What Tenant-Scoped Retrieval Means
Tenant-scoped retrieval ensures that data pulled by an AI agent belongs only to the requesting organization.
Policy checks before every call mean that each query, retrieval, and tool invocation is verified in real time, not just when a user logs in.
If a vendor answers vaguely, with “we have RBAC” or “data is separated by project,” it is likely an overlay, not a governed platform.
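The distinction can be made concrete in a few lines of code. This is a minimal sketch, not any vendor's actual API: the class and method names (PolicyEngine, GovernedRetriever, VectorStore) are illustrative. The key properties are that each tenant gets its own store and that a policy decision runs on every retrieval, not once at login.

```python
# Illustrative sketch of tenant-scoped retrieval with a per-call policy
# check. All names here are hypothetical, not a real product's API.

class PolicyDenied(Exception):
    pass

class PolicyEngine:
    def __init__(self, rules):
        # rules: {(tenant_id, resource): set of allowed actions}
        self.rules = rules

    def check(self, tenant_id, resource, action):
        allowed = self.rules.get((tenant_id, resource), set())
        if action not in allowed:  # deny by default
            raise PolicyDenied(f"{tenant_id} may not {action} {resource}")

class VectorStore:
    def __init__(self, docs):
        self.docs = docs

    def search(self, query):
        # Toy substring search standing in for vector similarity search.
        return [d for d in self.docs if query in d]

class GovernedRetriever:
    def __init__(self, engine, stores):
        self.engine = engine
        self.stores = stores  # one store per tenant: hard isolation

    def retrieve(self, tenant_id, query):
        # The policy check runs on EVERY call, not just at login.
        self.engine.check(tenant_id, "vector_index", "read")
        return self.stores[tenant_id].search(query)

engine = PolicyEngine({("acme", "vector_index"): {"read"}})
retriever = GovernedRetriever(engine, {
    "acme": VectorStore(["acme pricing sheet", "acme roadmap"]),
    "globex": VectorStore(["globex payroll"]),
})

print(retriever.retrieve("acme", "pricing"))   # ['acme pricing sheet']
try:
    retriever.retrieve("globex", "payroll")    # no rule -> denied by default
except PolicyDenied as e:
    print("denied:", e)
```

A chat overlay, by contrast, typically performs the check once at session start and then calls the store directly.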
Why It Matters
Leak prevention: Stops cross-customer or cross-business-unit data exposure.
Auditability: Every prompt, retrieval, and output can be traced and reviewed.
Least privilege: Policies control access at the embedding, cache, and output level.
Consistency: The same enforcement applies across APIs, web apps, and AI agents.
Red Flags: Five Signs of an Overlay
1. Policy checks happen only at login, not during use.
2. Shared vector databases use filters instead of hard namespaces.
3. Internal tools bypass policy enforcement.
4. Caches reuse embeddings across tenants.
5. Policies default to open rather than deny.
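The second red flag is worth spelling out, since filters and namespaces can look identical in a demo. The sketch below, using hypothetical helper functions, shows why a metadata filter on a shared index is fragile: isolation depends on every caller remembering to pass the filter, while a per-tenant namespace leaves no code path that can read another tenant's data.

```python
# Filter-based isolation on a shared index: one forgotten argument leaks data.
shared_index = [
    {"tenant": "acme", "text": "acme Q3 forecast"},
    {"tenant": "globex", "text": "globex salaries"},
]

def search_shared(query, tenant=None):
    hits = [d for d in shared_index if query in d["text"]]
    if tenant is not None:  # the filter is optional, and that is the flaw
        hits = [d for d in hits if d["tenant"] == tenant]
    return [d["text"] for d in hits]

# Caller forgets the tenant filter: another tenant's data comes back.
print(search_shared("salaries"))  # ['globex salaries']

# Hard namespaces: each tenant has its own index, so a query can only
# ever touch the requesting tenant's data.
namespaced = {
    "acme": ["acme Q3 forecast"],
    "globex": ["globex salaries"],
}

def search_namespaced(tenant, query):
    return [t for t in namespaced[tenant] if query in t]

print(search_namespaced("acme", "salaries"))  # [] -- nothing to leak
```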
What “Good” Looks Like
Hard isolation: Separate indexes, namespaces, and encryption keys for each tenant.
Runtime policy engines: Every retrieval or tool call triggers an explicit decision.
Signed audit trails: Logs capture each step from prompt to output.
Attribute-based access control: Policies travel with data, limiting who sees what.
Output filtering: Automatic masking or redaction before responses reach users.
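Output filtering is the last checkpoint before a response leaves the platform. A minimal sketch of pattern-based redaction follows; the two patterns (email addresses and US SSN-style numbers) are examples only, and production systems typically combine such rules with classifier-based detection.

```python
import re

# Illustrative redaction pass applied to model output before it reaches
# the user. The pattern set here is a small example, not exhaustive.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text):
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label}]", text)
    return text

print(redact("Contact jane@example.com, SSN 123-45-6789."))
# Contact [REDACTED EMAIL], SSN [REDACTED SSN].
```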
Ten Questions to Add to Every RFP
1. How do you isolate embeddings, caches, and indexes per tenant?
2. Do you run a policy check on every retrieval or tool call?
3. Can you show a full audit log of one agent run?
4. Are policies based on both user and data attributes?
5. How do you prevent cached content from crossing tenants?
6. Can you block tools or data domains dynamically?
7. What happens if the policy service fails—do you fail closed?
8. How are fine-tuned models and RAG corpora isolated?
9. Do you enforce deny-by-default with explicit allow rules?
10. Can you provide an architecture diagram showing all policy checkpoints?
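The fail-closed question deserves a concrete picture of the right answer. In this sketch (all names hypothetical, with the policy client stubbed to simulate an outage), an unreachable policy service results in a denial rather than a silent allow:

```python
# Fail-closed enforcement sketch: a policy-service outage denies the call
# instead of letting it through. The policy client is a stub.

class PolicyServiceDown(Exception):
    pass

def policy_allows(tenant, action):
    # Stub simulating an unreachable policy service.
    raise PolicyServiceDown("policy service unreachable")

def guarded_call(tenant, action, fn):
    try:
        allowed = policy_allows(tenant, action)
    except PolicyServiceDown:
        allowed = False  # fail closed: an outage means deny, never allow
    if not allowed:
        return "denied"
    return fn()

print(guarded_call("acme", "read", lambda: "data"))  # denied
```

An overlay often does the opposite: a failed or skipped check falls through to the underlying system, which is exactly the deny-by-default gap the RFP question probes.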
The Wheelhouse View
Real AI governance starts at runtime. Buyers who rely only on marketing labels risk adopting systems that look intelligent but act blind to policy. Asking a single question about tenant-scoped retrieval often reveals the truth faster than any demo.
For more insights, visit www.wheelhouseadvisors.com.