UI Agents are Automation Anywhere's fully autonomous reasoning engine for building and executing reliable, unattended UI automations from a natural language prompt.

Overview

While RPA can script clicks, UI Agents (also called Computer Use) let you describe outcomes instead. They replace selectors and rigid flows with goal-driven plans that adapt to UI changes and can perform complex reasoning at run time.

This flexibility makes UI Agents well suited to teams that want to modernize brittle, complex automations without rewriting systems.

Benefits

Easy to build and maintain: UI Agents are goal-based AI agents designed to run in a browser. They take a goal written in natural language as input and execute that goal directly in the target application. As a result, they are easy to build and even easier to maintain.

Adaptive and resilient: UI Agents don’t rely on layout-specific scripts. They understand the page state, reason about the information presented, and decide what to do next. As a result, automations keep working as websites change and can scale across multiple sites with minimal rework.

Automate end-to-end workflow navigation: UI Agents are also designed to co-exist with RPA actions, so you can automate an end-to-end process across browser and non-browser steps, all within the same editor.

How it works


[Figure: workflow for UI Agents]
A UI Agent takes a natural language goal as input and then automatically launches the target website.

First, it observes the current state of the page and creates a plan for the given goal. It then executes the planned actions in the browser and checks whether the user's goal is complete or more actions are needed. If more actions are needed, it observes the state of the website again, creates a new plan, executes it, and checks the result.

The agent repeats this loop until the user's goal is fully completed.
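
The observe, plan, execute, and check loop described above can be illustrated with a minimal Python sketch. This is not Automation Anywhere's implementation or API: every name here (ToyPage, observe, plan, execute, goal_completed, run_agent) is hypothetical, the "page" is an in-memory form rather than a real browser, and the goal is a simple dictionary of field values rather than natural language. The max_iterations safety limit is also an assumption added for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPage:
    """Hypothetical stand-in for a web page: a form with named fields."""
    fields: dict = field(default_factory=dict)
    submitted: bool = False


def observe(page):
    """Capture a snapshot of the current page state."""
    return {"fields": dict(page.fields), "submitted": page.submitted}


def plan(goal, state):
    """Plan the next actions: fill any wrong or missing fields, else submit."""
    actions = [
        ("fill", name, value)
        for name, value in goal.items()
        if state["fields"].get(name) != value
    ]
    if not actions and not state["submitted"]:
        actions.append(("submit", None, None))
    return actions


def execute(page, actions):
    """Apply the planned actions to the page."""
    for kind, name, value in actions:
        if kind == "fill":
            page.fields[name] = value
        elif kind == "submit":
            page.submitted = True


def goal_completed(goal, state):
    """Check whether every field matches the goal and the form was submitted."""
    fields_ok = all(state["fields"].get(k) == v for k, v in goal.items())
    return fields_ok and state["submitted"]


def run_agent(goal, page, max_iterations=10):
    """Repeat observe -> plan -> execute -> check until the goal is complete.

    Returns the number of loop iterations used. The iteration limit is an
    assumption of this sketch, added so the loop cannot run forever.
    """
    for iteration in range(1, max_iterations + 1):
        state = observe(page)
        actions = plan(goal, state)
        execute(page, actions)
        if goal_completed(goal, observe(page)):
            return iteration
    raise RuntimeError("goal not completed within the iteration limit")


if __name__ == "__main__":
    page = ToyPage()
    used = run_agent({"name": "Ada", "email": "ada@example.com"}, page)
    print(f"goal completed after {used} iterations")
```

In this toy run, the first pass fills the fields and finds the goal incomplete because the form is not yet submitted, so the loop observes again, plans the submit action, and then finishes. Re-observing before each plan is what lets the loop recover if the page changes between steps.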

Key features and value proposition

The key features of UI Agents are:

  • They use large action models (LAMs) that have a deep understanding of website navigation, which gives them high reliability and accuracy.
  • They are integrated into the same automation editors, which shortens the learning curve and makes it easy to build complex automations from the same interface.
  • They have embedded governance and guardrails so that you can run them securely.
  • Each action model is vetted for reliability, resiliency, and accuracy before it is made available to you.

The key outcome is faster time-to-value: you can build automations quickly and unlock use cases and scenarios that were previously very difficult to automate.

Prerequisites

Ensure your system meets the following requirements.
  • Licensing: Bot Creator, Citizen Developer, Attended Bot Runner, or Unattended Bot Runner
  • Browser: Google Chrome (latest version)
  • Operating system: Windows or macOS
  • Large action model configuration: See Set up Narada for UI Agents
