Governance Practice

What Are Guardrails?

Guardrails are technical and procedural controls that constrain an AI system's behaviour to keep its outputs and actions within acceptable bounds.

Definition

Guardrails: technical and procedural controls that constrain an AI system's behaviour to keep its outputs and actions within acceptable bounds.

AI guardrails operate at multiple layers: input filtering (blocking harmful prompts), output filtering (blocking harmful responses), system prompts (instructions constraining behaviour), and runtime monitoring. For agentic AI, guardrails also include action constraints, limiting what real-world actions the system can take without human approval. Guardrails are a defence-in-depth measure, not a guarantee; they can be bypassed through prompt injection and jailbreaking, which is why they form one layer of a broader control framework.
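The layering described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the regex patterns, the action names, and the function names (check_input, check_output, check_action, guarded_call) are all hypothetical, and the model is any callable that maps a prompt to a response. System prompts and runtime monitoring are left out to keep the sketch short.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical patterns, for illustration only. Real deployments typically
# rely on classifiers and policy engines rather than a short regex list.
BLOCKED_INPUT = [re.compile(r"ignore (all )?previous instructions", re.I)]
BLOCKED_OUTPUT = [re.compile(r"\b\d{16}\b")]  # a 16-digit, card-like number

# Hypothetical action policy for an agentic system.
AUTO_APPROVED_ACTIONS = {"search_docs"}
HUMAN_APPROVAL_ACTIONS = {"send_email", "issue_refund"}


@dataclass
class Verdict:
    allowed: bool
    reason: str


def check_input(prompt: str) -> Verdict:
    """Input filtering: reject prompts that match a blocked pattern."""
    for pattern in BLOCKED_INPUT:
        if pattern.search(prompt):
            return Verdict(False, "input blocked")
    return Verdict(True, "input ok")


def check_output(text: str) -> Verdict:
    """Output filtering: withhold responses that match a blocked pattern."""
    for pattern in BLOCKED_OUTPUT:
        if pattern.search(text):
            return Verdict(False, "output blocked")
    return Verdict(True, "output ok")


def check_action(action: str, human_approved: bool = False) -> Verdict:
    """Action constraint: unknown actions are denied by default, and
    sensitive actions require explicit human approval."""
    if action in AUTO_APPROVED_ACTIONS:
        return Verdict(True, "action auto-approved")
    if action in HUMAN_APPROVAL_ACTIONS:
        if human_approved:
            return Verdict(True, "action approved by human")
        return Verdict(False, "action needs human approval")
    return Verdict(False, "action not on allowlist")


def guarded_call(prompt: str, model: Callable[[str], str]) -> str:
    """Run the model only if the input passes, and return its response
    only if the output passes."""
    verdict = check_input(prompt)
    if not verdict.allowed:
        return f"[refused: {verdict.reason}]"
    response = model(prompt)
    verdict = check_output(response)
    if not verdict.allowed:
        return f"[withheld: {verdict.reason}]"
    return response
```

A pattern list like this is trivially evaded by rephrasing, which illustrates why each layer is treated as one control among several rather than a guarantee. The deny-by-default branch in check_action is a design choice: an action missing from both sets is refused rather than allowed.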

Source: NIST AI 600-1; OWASP LLM Top 10

