How to use all three actions (Start Session, Run, and End Session) in the UI Agents package to build a complete UI Agent.

Create a new Taskbot and use the actions from the UI Agents package. Three actions from the package are needed to complete the UI Agent build.

Procedure

Start Session: This must be the first action when building a UI Agent. It creates and launches the browser session in which the agent runs.

  1. Choose your preferred Action Model.
  2. Provide the credentials for the chosen action model in Key.
  3. Optionally, use More options > Proxy to launch the browser session via a proxy.
  4. Provide a unique Session name for the browser where the UI Agent is running.

Run: Configure the task in this action.

  1. Enter your unique Session name from the Start Session action.
  2. Optionally, enter the website that the agent needs to open as the Starting URL.
  3. Enter the agent Goal. See How to prompt UI Agents.
  4. Optionally, configure Secure variables to obfuscate sensitive values from the action model.
  5. Optionally, specify the data types for the agent output in Output format JSON. By default, the agent output is a string.
  6. Enter a Timeout for the Run action. The task is terminated when the timeout is reached. Adjust this value based on the complexity of the task.
  7. You must provide the Output file path and name where the agent will log metadata, action trace, and output (if applicable).
    The file path must be valid. Contents of this file are overwritten every time a Run action is executed.
    Note: If the file does not exist, the agent will create it. Valid file extensions are JSON and TXT (example: C:\UIAgentLog.json or C:\UIAgentLog.txt). JSON is recommended for readability.
  8. Optionally, save the context of the output by selecting a variable for Assign output to variable.
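
Because the output file is overwritten on every Run action, it can be useful to copy the log aside before the next run and then read it back. The following Python sketch is one way to do that outside the bot. The helper names `archive_agent_log` and `load_agent_log`, the timestamped naming scheme, and the UTF-8 encoding are assumptions made for illustration and are not part of the UI Agents package. The internal structure of the log is not described here, so the sketch only parses a JSON log or returns the raw text of a TXT log.

```python
import json
import shutil
from datetime import datetime
from pathlib import Path

# Extensions the Run action accepts for the output file.
ALLOWED_SUFFIXES = {".json", ".txt"}


def archive_agent_log(log_path, archive_dir):
    """Copy the log into archive_dir under a timestamped name; return the copy's path."""
    src = Path(log_path)
    if src.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported log extension: {src.suffix!r}")
    dest_dir = Path(archive_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: <original stem>-<timestamp><original extension>
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    dest = dest_dir / f"{src.stem}-{stamp}{src.suffix}"
    shutil.copy2(src, dest)
    return dest


def load_agent_log(log_path):
    """Return parsed JSON for a .json log, or the raw text for a .txt log."""
    src = Path(log_path)
    text = src.read_text(encoding="utf-8")  # assumption: the log is UTF-8 encoded
    if src.suffix.lower() == ".json":
        return json.loads(text)
    return text
```

A typical use would be to call `archive_agent_log(r"C:\UIAgentLog.json", r"C:\UIAgentLogArchive")` after each run (the archive folder name is hypothetical) and pass the returned path to `load_agent_log` when the results need to be inspected.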

End Session: Close the UI Agent browser session.

  1. Enter the Session name to be closed.
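
The three actions are tied together by the Session name: Run and End Session must refer to a session that a Start Session action has opened. The following Python sketch models that rule as a simple check over an ordered list of actions. It is only an illustration of the ordering logic. The `Action` class and `check_session_flow` function are hypothetical and do not correspond to any API in the UI Agents package, and treating a session that is never ended as a problem is an assumption based on the three-action flow described above.

```python
from dataclasses import dataclass


@dataclass
class Action:
    kind: str          # "start", "run", or "end" (hypothetical labels)
    session_name: str  # the Session name entered in the action


def check_session_flow(actions):
    """Return a list of problems found in an ordered list of actions."""
    problems = []
    open_sessions = set()
    for position, action in enumerate(actions, start=1):
        name = action.session_name
        if action.kind == "start":
            # Session names must be unique among sessions that are still open.
            if name in open_sessions:
                problems.append(f"step {position}: session {name!r} is already started")
            open_sessions.add(name)
        elif action.kind in ("run", "end"):
            # Run and End Session must reuse a name from an earlier Start Session.
            if name not in open_sessions:
                problems.append(f"step {position}: {action.kind} uses unknown session {name!r}")
            elif action.kind == "end":
                open_sessions.remove(name)
        else:
            problems.append(f"step {position}: unknown action kind {action.kind!r}")
    # Assumption: every started session should be closed by End Session.
    for name in sorted(open_sessions):
        problems.append(f"session {name!r} is never ended")
    return problems


if __name__ == "__main__":
    flow = [Action("start", "invoice-portal"),
            Action("run", "invoice-portal"),
            Action("end", "invoice-portal")]
    print(check_session_flow(flow))
```

An empty list means that every Run and End Session action refers to a started session and that every session is closed.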