LLM Guardrails and Safety Patterns

July 19, 2025 · 2 min read

High Performance Developer

Notes on AI/LLM guardrails and safety patterns from a book on "Agentic Design Patterns" by one of Google's Distinguished Engineers, Antonio Gulli.

walking

Guardrails and Safety Patterns

Provide a protective layer to guide agent behavior in order to prevent "harmful, biased, irrelevant or otherwise undesirable responses".

Other reasons for guardrails:

Legal
Compliance

Guardrails can be implemented using "lower power" but faster LLM models. They should be teamed up with observability to understand when they're being triggered, false positives and general user behavior.

Where to Apply Guardrails

Input validation - filter malicious content and prevent jailbreak attacks.
Prompt or behavioural constraints - directly instructing the LLM, explicitly preventing or allowing tool use.
External moderation APIs, "human in the loop" or other LLMs options for output validation.

Guardrail Prompts

A general purpose safety prompt - company policy
Permissible input prompt
A structured output definition prompt
Policy determination by a prompt (input or output validation, what policy does it break?)
Technical guard rail prompt to verify the output of other prompts
Jailbreak prompt

Links

https://docs.google.com/document/d/1rsaK53T3Lg5KoGwvf8ukOUvbELRtH-V0LnOIFDxBryE/edit?tab=t.0#heading=h.pxcur8v2qagu - "Agentic Design Patterns: A Hands-On Guide to Building Intelligent Systems" by Antonio Gulli

Guardrails and Safety Patterns​

Where to Apply Guardrails​

Guardrail Prompts​

Links​

Guardrails and Safety Patterns

Where to Apply Guardrails

Guardrail Prompts

Links