AI Governance v.40 release
- Updated: 2026/03/30
What's new
AI Evaluations delivers governed, actionable
performance insights for Agents and Skills
AI Evaluations introduces controlled, metered evaluation of AI Agents and AI Skills, with licensing and AI credit consumption tied to entitlement tracking and enforcement for Cloud environments. This capability ensures teams can validate and benchmark AI performance with automated evaluation built into the AI agent development lifecycle. Licensed users can access the evaluation feature and its automated scores and details via new evaluation pages, available under the AI menu. See AI Evaluations. Available for Cloud environments only.
• Entitlement & Usage Controls: Requires appropriate licensing (APA Essentials or APA Pro) and AI credits, with usage tracking and enforcement.
• Automatic & Manual Tools: Built-in support for automatic and manual evaluations using predefined metrics for measuring performance and scoring details.
• Detailed Insights: Scores are backed by industry and research metrics, with breakdowns that illuminate expected vs. actual interactions, execution sequences, and behavior patterns.
• Flexible Dataset Support: Upload, reuse, or manually define datasets with secure, audit-aligned retention for repeatable evaluation cycles. The maximum file size is 50 MB. Datasets are retained for 1 year (reset on use). Note: Upload is available only when evaluating AI Skills.
AI Evaluations helps teams optimize the quality, reliability, and governance of AI-powered automations and agentic processes both before and after production deployment.
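To make the evaluation flow concrete, here is a minimal, hypothetical sketch of what an automatic evaluation pass does conceptually: run each dataset case through an agent and score the expected vs. actual response. This is not the product API; the `TestCase` structure, the `exact_match_score` metric, and the stub agent are all illustrative assumptions standing in for the predefined metrics AI Evaluations provides.

```python
# Hypothetical sketch, NOT the AI Evaluations API: runs a small dataset
# through an "agent" callable and aggregates per-case scores.
from dataclasses import dataclass

@dataclass
class TestCase:
    prompt: str       # input sent to the agent
    expected: str     # expected response for scoring

def exact_match_score(expected: str, actual: str) -> float:
    """Score 1.0 on a normalized exact match, else 0.0 (illustrative metric)."""
    return 1.0 if expected.strip().lower() == actual.strip().lower() else 0.0

def evaluate(cases, agent):
    """Run every case through the agent and return an overall summary."""
    results = [exact_match_score(c.expected, agent(c.prompt)) for c in cases]
    return {"cases": len(results), "score": sum(results) / len(results)}

# A stub standing in for a deployed AI Skill or Agent.
def stub_agent(prompt: str) -> str:
    return "approved" if "invoice" in prompt else "rejected"

summary = evaluate(
    [TestCase("process invoice 42", "approved"),
     TestCase("process refund 7", "rejected")],
    stub_agent,
)
```

In the product, the per-case results feed the summary and detailed evaluation views; here they collapse into a single dictionary for illustration.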
Perform AI Evaluations for AI Skills and
AI Agents and view insights in the Detailed
Evaluation view
The Run Evaluation flow now supports AI Agents. Users can
invoke Agent evaluations from the Evaluations page or
directly from the Agent Editor. Completed evaluations can be
viewed from the Agent Editor and the Evaluations landing
page. A summary is available for the overall evaluation.
For closer investigation, select the evaluation details on
the page, which provides a summary of the scores for the
executed dataset. A detailed view of each dataset execution
is available through Agent output details.
Event
logs and data retention policy for AI Evaluations
When AI Evaluations is run, an Event log is created in AI Governance for audit purposes. Data from AI Evaluations includes date and user information for security and control over versions and modifications. Storage and retention of this data adhere to the existing retention policy in our platform framework. See Data retention policy.
AI
Agent Audit logs now available in AI
Governance
AI Governance now provides complete visibility and
traceability of AI Agent activities and interactions with
LLM models for governance and compliance auditing.
Comprehensive audit trails ensure compliance with security
policies and responsible AI governance requirements.
What's changed
Expanded AI Governance logging for system prompts with Toxicity visibility
AI Governance now captures system prompt details and toxicity scores in Prompt logs and Event logs, even when user prompts are blocked by AI Guardrails. When either the system or user prompt exceeds the configured toxicity threshold in a guardrail policy, the block is applied and the toxicity levels of both prompts are recorded in the logs. This enhanced visibility clarifies why prompts were blocked and supports scoring and analysis of system prompt toxicity alongside user inputs, improving auditability and alignment with guardrail policies for safer, more transparent automation behavior.
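The blocking-and-logging behavior described above can be sketched as follows. This is a hypothetical illustration, not the AI Guardrails implementation: the function name, the threshold value, and the log shape are assumptions; the point is that a block fires when either prompt's toxicity exceeds the policy threshold, and both scores are logged regardless of which one triggered it.

```python
# Hypothetical sketch, NOT the AI Guardrails API: block when either the
# system or the user prompt exceeds the policy's toxicity threshold, and
# record BOTH toxicity scores in the log entry either way.
def apply_toxicity_guardrail(system_score: float, user_score: float,
                             threshold: float = 0.7) -> dict:
    blocked = system_score > threshold or user_score > threshold
    return {
        "blocked": blocked,
        "log": {
            "system_toxicity": system_score,  # logged even if not the trigger
            "user_toxicity": user_score,
        },
    }

# A toxic user prompt triggers the block; the benign system score is still logged.
result = apply_toxicity_guardrail(system_score=0.2, user_score=0.9)
```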
AI Guardrails masking functionality now supports additional entities and expanded regional language coverage
Enhancements strengthen data loss prevention (DLP) controls by broadening entity coverage and enabling reliable masking across additional global languages. Masking and unmasking operations are fully functional across all three sensitive data categories (PII, PCI, and PHI). For the full list, see Data masking in AI. AI Guardrails now supports masking and unmasking for the following languages: Russian, Hindi, Japanese, Korean, Mandarin (Traditional Chinese), and Portuguese.
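The masking and unmasking cycle mentioned above can be illustrated with a minimal, hypothetical sketch. This is not the AI Guardrails implementation: a single regex stands in for the product's multi-language entity detectors, and the placeholder format is invented. It shows only the general pattern of reversible masking, where detected entities are replaced before text reaches the LLM and restored afterward from a stored mapping.

```python
# Hypothetical sketch, NOT the AI Guardrails implementation: reversible
# masking of detected entities. A real detector covers PII, PCI, and PHI
# entities across the supported languages; here one email regex stands in.
import re

EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")

def mask(text: str):
    """Replace detected entities with placeholders; return text + mapping."""
    mapping = {}
    def repl(match):
        token = f"<PII_{len(mapping)}>"
        mapping[token] = match.group(0)
        return token
    return EMAIL.sub(repl, text), mapping

def unmask(text: str, mapping: dict) -> str:
    """Restore original values from the stored mapping."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

masked, mapping = mask("Contact anna@example.com about the claim.")
restored = unmask(masked, mapping)
```

The mapping never leaves the masking layer, which is what keeps the sensitive values out of prompts and logs while still allowing responses to be unmasked for the end user.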
Fixes
AI Prompt logs now display beyond 1000 records in AI Governance, as expected. Previously, records beyond 1000 would not load.
Limitations
In Arabic, masking is partially supported. Some entities might not be detected or masked consistently.