Framework 02

How We Build Guardrails

Specific, named mechanisms RSUA uses to make AI systems safe to deploy and easy to defend.

Guardrails are not a single control

A production AI system has guardrails at five layers: input, retrieval, generation, action, and audit. Each layer catches failure modes the others miss, so skipping one leaves a path around the rest.

The five-layer guardrail stack

01

Input layer

Filters and classifiers on incoming prompts. Detect prompt injection, off-topic requests, and abusive content before they reach the model.
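As a rough illustration of an input-layer check, the sketch below screens a prompt before it reaches the model. The patterns, the `screen_input` name, and the length limit are all hypothetical. A real deployment would likely pair rules like these with a trained classifier, since regular expressions alone are easy to evade.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for illustration only; not an exhaustive list.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"reveal (your|the) system prompt", re.I),
]


@dataclass
class InputVerdict:
    allowed: bool
    reason: str


def screen_input(prompt: str, max_chars: int = 4000) -> InputVerdict:
    """Return a verdict for an incoming prompt before it reaches the model."""
    # Oversized prompts are rejected outright (assumed limit).
    if len(prompt) > max_chars:
        return InputVerdict(False, "too_long")
    # Reject prompts matching any known injection phrasing.
    for pattern in INJECTION_PATTERNS:
        if pattern.search(prompt):
            return InputVerdict(False, "possible_prompt_injection")
    return InputVerdict(True, "ok")
```

Returning a reason alongside the decision lets the audit layer record why a prompt was rejected.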

02

Retrieval layer

Access controls on what data the AI can see. Row-level permissions, document-level redaction, and tenant isolation enforced at retrieval.
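One way to picture enforcement at retrieval is a filter applied to candidate documents before anything is passed to the model. The `Document` and `Caller` types and the role model below are assumptions made for this sketch. In practice the same check is often pushed down into the database or vector store as a query filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    tenant_id: str
    allowed_roles: frozenset
    text: str


@dataclass(frozen=True)
class Caller:
    tenant_id: str
    roles: frozenset


def authorized_documents(caller: Caller, candidates: list) -> list:
    """Keep only documents the caller is entitled to see."""
    return [
        doc
        for doc in candidates
        # Tenant isolation: never cross tenant boundaries.
        if doc.tenant_id == caller.tenant_id
        # Document-level permission: the caller needs at least one allowed role.
        and doc.allowed_roles & caller.roles
    ]
```

Because the filter runs before generation, a successful prompt injection still cannot surface data the caller was never allowed to retrieve.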

03

Generation layer

Constraints on what the model produces. Output schemas, citation requirements, and refusal patterns for out-of-scope queries.
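A generation-layer constraint can be sketched as a validator that accepts model output only if it parses into the expected shape and cites documents that were actually retrieved. The field names (`answer`, `citations`, `refused`) and the refusal payload are assumptions for this example, not a fixed schema.

```python
import json


def _refusal() -> dict:
    # Assumed refusal shape, returned whenever validation fails.
    return {"answer": None, "citations": [], "refused": True}


def validate_output(raw: str, retrieved_ids: set) -> dict:
    """Accept model output only if it matches the schema and cites retrieved docs."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return _refusal()
    if not isinstance(data, dict):
        return _refusal()

    answer = data.get("answer")
    citations = data.get("citations")
    # Schema check: a non-empty string answer and a non-empty citation list.
    if not isinstance(answer, str) or not answer.strip():
        return _refusal()
    if not isinstance(citations, list) or not citations:
        return _refusal()
    # Citation check: every cited id must come from this request's retrieval set.
    if any(not isinstance(c, str) or c not in retrieved_ids for c in citations):
        return _refusal()
    return {"answer": answer, "citations": citations, "refused": False}
```

Falling back to a refusal rather than passing through malformed output keeps downstream code from acting on text it cannot verify.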

04

Action layer

Gates on what the AI can actually do. Confidence thresholds, human approval for high-stakes actions, and hard limits on irreversible operations.
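The three gates described above could be combined along the following lines. The threshold of 0.8, the daily cap on irreversible operations, and the `gate_action` signature are illustrative assumptions; real values would depend on the system and its risk tolerance.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    EXECUTE = "execute"
    NEEDS_APPROVAL = "needs_approval"
    BLOCK = "block"


def gate_action(
    *,
    confidence: float,
    high_stakes: bool,
    irreversible: bool,
    irreversible_count_today: int = 0,
    approved_by: Optional[str] = None,
    min_confidence: float = 0.8,        # assumed threshold
    max_irreversible_per_day: int = 5,  # assumed hard limit
) -> Decision:
    """Decide whether a proposed action may run, needs a human, or is blocked."""
    # Hard limit: no approval can override the cap on irreversible operations.
    if irreversible and irreversible_count_today >= max_irreversible_per_day:
        return Decision.BLOCK
    # Low confidence or high stakes both route to a human reviewer.
    needs_human = confidence < min_confidence or high_stakes
    if needs_human and approved_by is None:
        return Decision.NEEDS_APPROVAL
    return Decision.EXECUTE
```

Checking the hard limit first means a human approval can unlock a high-stakes action but cannot raise the ceiling on irreversible ones.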

05

Audit layer

A complete record of every meaningful decision: input, retrieval, model used, output, action taken, and reviewer.
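The fields listed above map naturally onto a structured log record. The sketch below assumes a JSON Lines log and hypothetical field names. It adds a UTC timestamp, which the list above does not mention but which most audit trails need.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    input_text: str
    retrieved_doc_ids: list
    model: str
    output_text: str
    action_taken: str
    reviewer: Optional[str] = None  # None when no human reviewed the decision
    # Assumed extra field: when the record was written, in UTC.
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(record: AuditRecord) -> str:
    """Serialize one decision as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Writing one self-contained line per decision keeps the log easy to append to and to replay when a decision has to be reconstructed later.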

Red teaming is not a launch milestone. It is a recurring practice. Every meaningful change to prompts, tools, or data surfaces deserves adversarial review.
