A fintech agent approved flagged transactions after white-text instructions hidden in a PDF bypassed its guardrails. Input sanitization before the model context matters more than output filtering after.
← All categories
Guardrails
Everything AIVIO has published under Guardrails — Decks, Signal, and In Depth.