Run AI Evaluations manually
- Updated: 2025/11/17
For more flexibility, users can manually assess each output.
Procedure
- Evaluations can be started from two entry points.
- From the AI Evaluations page select Run Evaluation.
- From the AI Skill page, select Evaluate > Run Evaluation. This entry point automatically populates the skill selection in the next step.
You are navigated to a new page to configure the evaluation.
- Select the skill to evaluate.
- Click Next.
- Select the method Evaluate manually to enter your judgment for each output.
- Add your data set.
- Select Upload file to enter a Name and choose the file to use in the evaluation. The file must be in CSV format with a maximum size of 100 KB.
- Select Use existing data to pick a data set that has previously been uploaded.
- Select Enter data manually to enter a Name and type the variables and expected outputs for this configuration.
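The data-set file pairs input variables with expected outputs. A minimal sketch of preparing such a file is below; the column names (`question`, `expected_output`) are illustrative assumptions, since the exact headers depend on the variables your skill expects. The size check reflects the 100 KB upload limit noted above.

```python
import csv
import os

# Hypothetical columns for illustration; actual headers depend on
# the variables defined for the skill being evaluated.
rows = [
    {"question": "What was total revenue in 2024?",
     "expected_output": "1.2M USD"},
    {"question": "How many active users are there?",
     "expected_output": "48,500"},
]

path = "evaluation_data.csv"
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["question", "expected_output"])
    writer.writeheader()
    writer.writerows(rows)

# The upload limit is 100 KB; verify before uploading.
size_kb = os.path.getsize(path) / 1024
assert size_kb <= 100, f"File is {size_kb:.1f} KB; trim rows before uploading"
print(f"{path}: {size_kb:.2f} KB, {len(rows)} rows")
```

If the file exceeds the limit, split the rows across several evaluations rather than truncating individual values.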
- Click Run evaluation.
The evaluation saves your data and begins running. Processing can take some time, depending on the size of the data set. When the evaluation is complete, you receive a notification that includes a link to it.
- Navigate to the evaluation through the Evaluation tab or by clicking the link in the notification.
- For each output, click thumbs-up or thumbs-down to judge the performance.
The evaluation tracks these judgments to provide an overall score, available after completion.
- After completing your assessment, view the results by opening the evaluation from the Evaluation page.
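The procedure does not specify how the overall score is computed from your judgments. A plausible reading, sketched below as an assumption rather than the product's documented formula, is the share of outputs that received a thumbs-up:

```python
# Assumption: the overall score is the fraction of thumbs-up judgments.
# The actual metric used by the product may differ.
judgments = [True, True, False, True]  # True = thumbs-up, False = thumbs-down

score = sum(judgments) / len(judgments)
print(f"Overall score: {score:.0%}")  # 3 of 4 outputs approved -> 75%
```

Under this reading, the score only becomes meaningful once every output has been judged, which matches the note above that it is available after completion.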