AI Guardrails
- Updated: 2025/03/03
AI Guardrails provides an inline interception mechanism that enforces security and compliance policies, protecting sensitive data and ensuring ethical AI practices.
It uses smart tokenization to identify sensitive data within prompts and replace it with tokenized values. Similarly, it intercepts model responses and reconstructs the original values from the tokens, ensuring the response remains relevant. It also monitors prompts and model responses for potentially harmful language and records the toxicity levels for auditing.
When configuring AI Guardrails in Automation 360, setting up data masking rules and understanding toxicity monitoring are crucial. These features define how the system treats different types of sensitive data and assesses the appropriateness of language used in interactions with LLMs, preventing potential issues during bot execution. This topic provides insights into the implementation and functionality of AI Guardrails, emphasizing their role in promoting data security and responsible AI practices.

At the core of AI Guardrails is the data masking feature, which identifies sensitive data elements within a bot's prompt and substitutes them with tokenized values before transmitting the request to the Large Language Model (LLM). This process prevents sensitive information from being directly processed by the LLM while preserving the context necessary for generating an accurate response. Additionally, the toxicity monitoring capability screens the prompts sent to LLMs, and the responses they generate, for potentially harmful language.
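The mask-then-reconstruct flow described above can be sketched as follows. This is a minimal illustration, not the product's implementation: the regex-based detection, the token format, and the function names are all hypothetical, chosen only to show the round trip of masking a prompt and restoring original values in the response.

```python
import re
import uuid

def mask(prompt, patterns):
    """Replace matches of known sensitive-data patterns with opaque tokens.

    Returns the masked prompt plus a token map used later to restore
    the original values. Token format here is illustrative only.
    """
    token_map = {}

    def _tokenize(match):
        token = f"<TOKEN_{uuid.uuid4().hex[:8]}>"
        token_map[token] = match.group(0)
        return token

    masked = prompt
    for pattern in patterns:
        masked = re.sub(pattern, _tokenize, masked)
    return masked, token_map

def unmask(response, token_map):
    """Restore the original sensitive values wherever the LLM echoed a token."""
    for token, value in token_map.items():
        response = response.replace(token, value)
    return response

# Example: mask a US Social Security number before sending the prompt.
masked_prompt, tokens = mask(
    "Summarize the case for SSN 123-45-6789.",
    [r"\d{3}-\d{2}-\d{4}"],
)
# The LLM sees only the token; its response is unmasked before reaching the bot.
```

Because the token preserves the position and role of the value in the sentence, the LLM can still generate a contextually relevant answer without ever receiving the sensitive data itself.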
Benefits
- Data masking
- Data masking is an obfuscation technique that identifies known sensitive data and replaces it with fictitious values.
- Toxicity monitoring
- AI Guardrails analyzes both the prompts sent to LLMs and the responses they generate for potentially harmful language, classifying each as low, moderate, or high toxicity. Although currently in an "observe only" mode, this feature allows for the identification of potentially problematic language use. Future releases will include the ability to block prompts based on their toxicity level.
- Monitoring and logging
- Automation 360 logs all guardrail actions, including details of the data masking process. This comprehensive logging provides an audit trail, enabling administrators to monitor the functionality of the AI Guardrails and verify compliance with data protection policies.
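The observe-only toxicity audit described above can be sketched as a simple classification and logging step. This is a hypothetical illustration: the score thresholds, the function names, and the log record shape are assumptions, not the product's documented behavior; only the three reported levels (low, moderate, high) and the observe-only stance come from the text above.

```python
def toxicity_level(score: float) -> str:
    """Map a numeric toxicity score in [0, 1] to one of the three
    levels the guardrail reports. Thresholds are illustrative only."""
    if score < 0.3:
        return "low"
    if score < 0.7:
        return "moderate"
    return "high"

def audit_interaction(prompt_score: float, response_score: float) -> dict:
    """Observe-only mode: record toxicity levels for the audit trail
    rather than blocking the prompt or response."""
    return {
        "prompt_toxicity": toxicity_level(prompt_score),
        "response_toxicity": toxicity_level(response_score),
        "action": "observed",  # future releases may block on high toxicity
    }
```

In observe-only mode the record is written to the audit log and execution continues; a future blocking mode would instead stop the request when the prompt's level crosses a configured threshold.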