How to prompt UI Agents
- Updated: 2026/03/16
Suggestions for the wording to use when describing your goal in a prompt for UI Agents.
The simplest way to use UI Agents to complete an end-to-end task is to clearly specify the goal, the rules to follow, and any information the agent needs. Remember that the agent must take many sequential steps to reach the goal, and an unclear goal can misalign both the workflow and the results.
UI Agents work most reliably when a goal can be accomplished in fewer than 10 steps. If a goal requires more steps, break it down into multiple smaller goals and send each one to the agent separately. See Chaining multiple tasks.
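As a minimal sketch of sending smaller goals one after another, the Python below loops over a hand-written list of sub-goals. Everything in it is an assumption for illustration: `send_goal` is a hypothetical stand-in for however you submit a goal to the agent, and passing the previous result forward as context is one possible design, not a documented feature.

```python
from typing import Callable, List


def run_chain(goals: List[str], send_goal: Callable[[str], str]) -> List[str]:
    """Send each sub-goal in order and collect the results.

    `send_goal` is a hypothetical callable that submits one goal to the
    agent and returns its result as text.
    """
    results: List[str] = []
    previous = ""
    for goal in goals:
        # Assumption: the previous result is appended so the next goal has context.
        prompt = goal if not previous else f"{goal}\n\nResult of the previous task:\n{previous}"
        previous = send_goal(prompt)
        results.append(previous)
    return results
```

Each entry in `goals` should itself be small enough to finish in fewer than 10 steps.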
Specificity matters for any UI Agent task where repeatability is key. The more specific the goal, the more reliably the agent performs the task. For best results, describe in your goal any nuance needed to perform the task.
- State the objective up front: One sentence on what “done” looks like (e.g. “Your goal is to…”).
- Be direct about what to do and what not to do: Include boundaries (e.g. don’t submit payments, don’t delete records).
- Provide complete inputs and definitions: Specify exact names/IDs, date ranges, and what ambiguous terms mean (e.g. top customers = top 20 by ARR).
- Break big goals into ordered steps: Keep each step testable; avoid bundling research + action + communications in one prompt.
- Specify outputs and format: Free-text vs. JSON, required fields, and how to label missing data.
- Add rules: Specific business rules (e.g. if asked for contact, use email address). Tell it when to pause (missing data, paywalls, MFA, captcha, errors).
- Include fallback paths: What to do if a UI element is not found (use search, alternate nav, second attempt, then stop).
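The checklist above can be sketched as a small prompt builder. This is an illustrative Python sketch, not part of any UI Agents API: `GoalSpec`, `build_prompt`, the section titles, and the sample values are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GoalSpec:
    """Hypothetical container for the parts of a goal listed above."""
    objective: str                                        # what "done" looks like
    boundaries: List[str] = field(default_factory=list)   # what not to do
    inputs: Dict[str, str] = field(default_factory=dict)  # exact names, IDs, definitions
    steps: List[str] = field(default_factory=list)        # ordered, testable steps
    output_format: str = ""                               # e.g. JSON with required fields
    rules: List[str] = field(default_factory=list)        # business rules, pause conditions
    fallbacks: List[str] = field(default_factory=list)    # what to try if an element is missing


def _bullets(title: str, items: List[str]) -> str:
    """Render a titled bullet list."""
    return title + "\n" + "\n".join(f"- {item}" for item in items)


def build_prompt(spec: GoalSpec) -> str:
    """Assemble the goal text, skipping any section that was left empty."""
    parts = [f"Your goal is to {spec.objective}."]
    if spec.boundaries:
        parts.append(_bullets("Do not:", spec.boundaries))
    if spec.inputs:
        parts.append(_bullets("Inputs and definitions:",
                              [f"{k}: {v}" for k, v in spec.inputs.items()]))
    if spec.steps:
        numbered = "\n".join(f"{n}. {s}" for n, s in enumerate(spec.steps, 1))
        parts.append("Steps:\n" + numbered)
    if spec.output_format:
        parts.append("Output: " + spec.output_format)
    if spec.rules:
        parts.append(_bullets("Rules:", spec.rules))
    if spec.fallbacks:
        parts.append(_bullets("If a UI element is not found:", spec.fallbacks))
    return "\n\n".join(parts)


# Sample values are made up for illustration.
example = GoalSpec(
    objective="list the top customers for the last quarter",
    boundaries=["submit payments", "delete records"],
    inputs={"top customers": "top 20 by ARR"},
    output_format='JSON with fields "name" and "arr"; use null for missing data',
    rules=["If asked for contact, use email address",
           "Pause on missing data, paywalls, MFA, captcha, or errors"],
    fallbacks=["use search", "try alternate navigation", "retry once, then stop"],
)
print(build_prompt(example))
```

Keeping each part in its own field makes it easy to spot a goal that is missing boundaries, an output format, or a fallback path before it is sent to the agent.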
Contrasting prompt examples
| Do | Do | Do not | Do not |
|---|---|---|---|