AI Guardrails provides an inline interception mechanism that enforces security and compliance policies, protecting sensitive data and ensuring ethical AI practices.

It employs smart tokenization to identify sensitive data within prompts and replace it with tokenized values. Similarly, it intercepts model responses to reconstruct them, replacing the tokenized values with the original data so the response remains relevant. It also monitors the toxicity levels of prompts and model responses for auditing purposes.

When configuring AI Guardrails in Automation 360, setting up data masking rules and understanding toxicity monitoring are crucial. These features define how the system treats different types of sensitive data and assesses the appropriateness of language used in interactions with LLMs, preventing potential issues during bot execution. This topic provides insights into the implementation and functionality of AI Guardrails, emphasizing their role in promoting data security and responsible AI practices.


AI Guardrails

At the core of AI Guardrails is the data masking feature, which identifies sensitive data elements within a bot's prompt and substitutes them with tokenized values before transmitting the request to the Large Language Model (LLM). This process keeps sensitive information from being directly processed by the LLM while preserving the context necessary for generating an accurate response. Additionally, the toxicity monitoring capability analyzes the prompts sent to LLMs and the responses they generate for potentially harmful language.
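The mask-and-restore flow can be pictured with a short sketch. Everything below is an illustrative assumption, not the actual AI Guardrails implementation or API: the function names, the token format, and the detection of a single pattern (email addresses) stand in for the product's smart tokenization.

  # Minimal sketch of the mask -> LLM -> unmask flow; names and token
  # format are hypothetical, not the AI Guardrails API.
  import re
  import uuid

  EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")  # stand-in for smart tokenization

  def mask_prompt(prompt: str) -> tuple[str, dict[str, str]]:
      """Replace each detected sensitive value with a unique token."""
      token_map: dict[str, str] = {}

      def substitute(match: re.Match) -> str:
          token = f"<TOKEN_{uuid.uuid4().hex[:8]}>"
          token_map[token] = match.group(0)  # keep the original for later restoration
          return token

      return EMAIL_RE.sub(substitute, prompt), token_map

  def unmask_response(response: str, token_map: dict[str, str]) -> str:
      """Reinstate the original values in the LLM's response (Mask behavior)."""
      for token, original in token_map.items():
          response = response.replace(token, original)
      return response

  masked, tokens = mask_prompt("Reply to jane.doe@example.com about invoice 4821.")
  # masked keeps the context: "Reply to <TOKEN_1a2b3c4d> about invoice 4821."
  # response = call_llm(masked)                 # hypothetical LLM call
  # final = unmask_response(response, tokens)   # token replaced with the real address

Because the token stands in the same position in the sentence, the LLM can still reason about the request without ever seeing the real value.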

Benefits

Data masking
Data masking is an obfuscation technique that identifies known sensitive data and replaces it with fictitious values.
Categories of sensitive data for masking

To ensure robust data protection, it's crucial to identify and mask sensitive information effectively. AI Guardrails helps you to establish precise data masking rules tailored to the following critical categories: Personally Identifiable Information (PII), Protected Health Information (PHI), and Payment Card Industry Data (PCI).

These categories let you organize and apply consistent masking behavior to particular types of sensitive data. For instance, you might opt to irreversibly anonymize all PCI data to prevent its storage or use in any form, while choosing reversible masking for PII data to maintain functionality.

By adopting these strategic approaches, organizations can safeguard sensitive data, comply with regulations, and mitigate potential risks.

Configuring masking behavior

Within Automation 360 AI Guardrails, you have the flexibility to determine how each data category is handled:

  • Mask: A reversible process where sensitive data is temporarily replaced with a token. The original data is retrieved and reinstated in the LLM's response before being presented to the user.
  • Anonymize: An irreversible process that permanently replaces sensitive data with a token. The original data is not stored or used for re-constructing the final response, making it suitable for scenarios with strict data retention prohibitions.
  • Allow: For specific use cases requiring access to sensitive data, you can choose to allow the data to be sent to the LLM in clear text.

By default, the system applies Mask if no specific behavior is selected, ensuring a baseline level of protection for all sensitive data. You can configure these rules and assign them to designated folders, ensuring that any bot operating within a folder with an assigned rule automatically implements the defined masking behavior.
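As a rough sketch, a masking rule can be modeled as a mapping from data category to behavior, with Mask as the fallback. The rule shape, folder path, and names below are hypothetical and do not reflect the product's actual configuration format:

  from enum import Enum

  class Behavior(Enum):
      MASK = "mask"            # reversible: original restored in the response
      ANONYMIZE = "anonymize"  # irreversible: original never reinstated
      ALLOW = "allow"          # sent to the LLM in clear text

  DEFAULT_BEHAVIOR = Behavior.MASK  # applied when no behavior is selected

  rule = {
      "name": "finance-rule",                # hypothetical rule name
      "assigned_folder": "/Bots/Finance",    # hypothetical folder path
      "behaviors": {
          "PCI": Behavior.ANONYMIZE,  # strict retention prohibition
          "PII": Behavior.MASK,       # reversible, keeps responses usable
          # PHI omitted: falls back to DEFAULT_BEHAVIOR (Mask)
      },
  }

  def behavior_for(category: str) -> Behavior:
      """Resolve the behavior for a category, defaulting to Mask."""
      return rule["behaviors"].get(category, DEFAULT_BEHAVIOR)

Any bot running in /Bots/Finance would then inherit these behaviors automatically, mirroring the folder assignment described above.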

Toxicity monitoring

AI Guardrails analyzes both the prompts sent to LLMs and the responses they generate for potentially harmful language, classifying each as low, moderate, or high toxicity. Although currently in an "observe only" mode, this feature allows for the identification of potentially problematic language use. Future releases will include the ability to block prompts based on their toxicity level.
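A minimal sketch of observe-only classification follows, assuming a numeric toxicity score between 0 and 1; the scoring source and the threshold values are illustrative assumptions, not the product's actual cutoffs:

  import logging

  logging.basicConfig(level=logging.INFO)
  log = logging.getLogger("guardrails")

  def classify_toxicity(score: float) -> str:
      """Map an assumed 0..1 toxicity score to low/moderate/high."""
      if score < 0.3:
          return "low"
      if score < 0.7:
          return "moderate"
      return "high"

  def observe(text: str, score: float, direction: str) -> str:
      """Observe-only mode: record the level, never block the text."""
      level = classify_toxicity(score)
      log.info("toxicity=%s direction=%s chars=%d", level, direction, len(text))
      return text  # passed through unchanged in the current release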

Monitoring and logging

Automation 360 logs all guardrail actions, including details of the data masking process. This comprehensive logging provides an audit trail, empowering administrators to monitor the functionality of the AI Guardrails and verify compliance with data protection policies.
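An audit record for a single guardrail action might resemble the sketch below; every field name and value here is assumed for illustration and does not represent the actual log schema:

  import json
  from datetime import datetime, timezone

  audit_record = {
      "timestamp": datetime.now(timezone.utc).isoformat(),
      "bot": "invoice-summarizer",           # hypothetical bot name
      "rule": "finance-rule",
      "action": "mask",                      # mask | anonymize | allow
      "category": "PII",
      "tokens_issued": 1,                    # counts only; originals are not logged
      "toxicity": {"prompt": "low", "response": "low"},
  }
  print(json.dumps(audit_record, indent=2))

Note that a record like this can describe what was masked without storing the sensitive values themselves, which is what makes the audit trail safe to retain.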

Licensing requirements

To activate and use the AI Guardrails service for enforcement, you need to purchase a consumption SKU, AI Guardrails (Number of LLM Prompts), along with the Enterprise Platform license. See Enterprise Platform.
Note: AI Guardrails credits from your purchased volume are consumed when executing prompts from automations or AI Skills in a public workspace, or when testing in the AI Skills editor within a private workspace.