The Real AI Test: How to Tell a Platform from a Chat Overlay
Most vendors now claim to have “AI platforms,” but many are just chat interfaces placed on top of disconnected systems. The difference is more than marketing. Without the right controls, these overlays can leak data, bypass policies, and mislead buyers into thinking they are getting enterprise-grade AI governance when they are not.
The Core Question
When evaluating any vendor, start with one question:
“Do you enforce tenant-scoped retrieval with a policy check before every tool or dataset call?”
That single test exposes whether the platform truly governs data and actions or merely wraps them in a conversational UI.
What Tenant-Scoped Retrieval Means
Tenant-scoped retrieval ensures that data pulled by an AI agent belongs only to the requesting organization.
Policy checks before every call mean that each query, retrieval, and tool invocation is verified in real time, not just when a user logs in.
If a vendor answers vaguely, with “we have RBAC” or “data is separated by project,” it is likely an overlay, not a governed platform.
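The distinction can be made concrete in a few lines of code. This is a minimal sketch, not any vendor's actual API: the class and method names (PolicyEngine, GovernedRetriever, VectorStore) are illustrative. The key properties are that each tenant gets its own store and that a policy decision runs on every retrieval, not once at login.

```python
# Illustrative sketch of tenant-scoped retrieval with a per-call policy
# check. All names here are hypothetical, not a real product's API.

class PolicyDenied(Exception):
    pass

class PolicyEngine:
    def __init__(self, rules):
        # rules: {(tenant_id, resource): set of allowed actions}
        self.rules = rules

    def check(self, tenant_id, resource, action):
        allowed = self.rules.get((tenant_id, resource), set())
        if action not in allowed:  # deny by default
            raise PolicyDenied(f"{tenant_id} may not {action} {resource}")

class VectorStore:
    def __init__(self, docs):
        self.docs = docs

    def search(self, query):
        # Toy substring search standing in for vector similarity search.
        return [d for d in self.docs if query in d]

class GovernedRetriever:
    def __init__(self, engine, stores):
        self.engine = engine
        self.stores = stores  # one store per tenant: hard isolation

    def retrieve(self, tenant_id, query):
        # The policy check runs on EVERY call, not just at login.
        self.engine.check(tenant_id, "vector_index", "read")
        return self.stores[tenant_id].search(query)

engine = PolicyEngine({("acme", "vector_index"): {"read"}})
retriever = GovernedRetriever(engine, {
    "acme": VectorStore(["acme pricing sheet", "acme roadmap"]),
    "globex": VectorStore(["globex payroll"]),
})

print(retriever.retrieve("acme", "pricing"))   # ['acme pricing sheet']
try:
    retriever.retrieve("globex", "payroll")    # no rule -> denied by default
except PolicyDenied as e:
    print("denied:", e)
```

A chat overlay, by contrast, typically performs the check once at session start and then calls the store directly.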
Why It Matters
Leak prevention: Stops cross-customer or cross-business-unit data exposure.
Auditability: Every prompt, retrieval, and output can be traced and reviewed.
Least privilege: Policies control access at the embedding, cache, and output level.
Consistency: The same enforcement applies across APIs, web apps, and AI agents.
Red Flags: Five Signs of an Overlay
1. Policy checks happen only at login, not during use.
2. Shared vector databases use filters instead of hard namespaces.
3. Internal tools bypass policy enforcement.
4. Caches reuse embeddings across tenants.
5. Policies default to open rather than deny.
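The second red flag is worth spelling out, since filters and namespaces can look identical in a demo. The sketch below, using hypothetical helper functions, shows why a metadata filter on a shared index is fragile: isolation depends on every caller remembering to pass the filter, while a per-tenant namespace leaves no code path that can read another tenant's data.

```python
# Filter-based isolation on a shared index: one forgotten argument leaks data.
shared_index = [
    {"tenant": "acme", "text": "acme Q3 forecast"},
    {"tenant": "globex", "text": "globex salaries"},
]

def search_shared(query, tenant=None):
    hits = [d for d in shared_index if query in d["text"]]
    if tenant is not None:  # the filter is optional, and that is the flaw
        hits = [d for d in hits if d["tenant"] == tenant]
    return [d["text"] for d in hits]

# Caller forgets the tenant filter: another tenant's data comes back.
print(search_shared("salaries"))  # ['globex salaries']

# Hard namespaces: each tenant has its own index, so a query can only
# ever touch the requesting tenant's data.
namespaced = {
    "acme": ["acme Q3 forecast"],
    "globex": ["globex salaries"],
}

def search_namespaced(tenant, query):
    return [t for t in namespaced[tenant] if query in t]

print(search_namespaced("acme", "salaries"))  # [] -- nothing to leak
```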
What “Good” Looks Like
Hard isolation: Separate indexes, namespaces, and encryption keys for each tenant.
Runtime policy engines: Every retrieval or tool call triggers an explicit decision.
Signed audit trails: Logs capture each step from prompt to output.
Attribute-based access control: Policies travel with data, limiting who sees what.
Output filtering: Automatic masking or redaction before responses reach users.
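Output filtering is the last checkpoint before a response leaves the platform. A minimal sketch of pattern-based redaction follows; the two patterns (email addresses and US SSN-style numbers) are examples only, and production systems typically combine such rules with classifier-based detection.

```python
import re

# Illustrative redaction pass applied to model output before it reaches
# the user. The pattern set here is a small example, not exhaustive.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text):
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label}]", text)
    return text

print(redact("Contact jane@example.com, SSN 123-45-6789."))
# Contact [REDACTED EMAIL], SSN [REDACTED SSN].
```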
Ten Questions to Add to Every RFP
1. How do you isolate embeddings, caches, and indexes per tenant?
2. Do you run a policy check on every retrieval or tool call?
3. Can you show a full audit log of one agent run?
4. Are policies based on both user and data attributes?
5. How do you prevent cached content from crossing tenants?
6. Can you block tools or data domains dynamically?
7. What happens if the policy service fails—do you fail closed?
8. How are fine-tuned models and RAG corpora isolated?
9. Do you enforce deny-by-default with explicit allow rules?
10. Can you provide an architecture diagram showing all policy checkpoints?
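The fail-closed question deserves a concrete picture of the right answer. In this sketch (all names hypothetical, with the policy client stubbed to simulate an outage), an unreachable policy service results in a denial rather than a silent allow:

```python
# Fail-closed enforcement sketch: a policy-service outage denies the call
# instead of letting it through. The policy client is a stub.

class PolicyServiceDown(Exception):
    pass

def policy_allows(tenant, action):
    # Stub simulating an unreachable policy service.
    raise PolicyServiceDown("policy service unreachable")

def guarded_call(tenant, action, fn):
    try:
        allowed = policy_allows(tenant, action)
    except PolicyServiceDown:
        allowed = False  # fail closed: an outage means deny, never allow
    if not allowed:
        return "denied"
    return fn()

print(guarded_call("acme", "read", lambda: "data"))  # denied
```

An overlay often does the opposite: a failed or skipped check falls through to the underlying system, which is exactly the deny-by-default gap the RFP question probes.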
The Wheelhouse View
Real AI governance starts at runtime. Buyers who rely only on marketing labels risk adopting systems that look intelligent but act blind to policy. Asking a single question about tenant-scoped retrieval often reveals the truth faster than any demo.
For more insights, visit www.wheelhouseadvisors.com.