The fastest way to run an evaluation is to perform it automatically. This method uses an LLM as a judge to compare data, rather than relying on human effort.
Procedure
You can start an evaluation from either of two entry points:
- From the AI Evaluations page, select Run Evaluation.
- From the AI Skill page, select Evaluate > Run Evaluation. This option automatically populates the next step.
A new page opens where you configure the evaluation.
Select the skill to evaluate.
Click Next.
Select the Evaluate automatically method to use an LLM as a judge and NLP metrics in the evaluation.
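To make the "NLP metrics" part of automatic evaluation concrete, the sketch below computes token-level F1 between an expected and an actual output. This is a generic illustration of the kind of metric such an evaluation might use, not the product's exact implementation.

```python
from collections import Counter

def token_f1(expected: str, actual: str) -> float:
    """Token-level F1 between an expected and an actual output.

    A generic illustration of an NLP metric; not the product's
    exact scoring implementation.
    """
    exp_tokens = Counter(expected.lower().split())
    act_tokens = Counter(actual.lower().split())
    # Count tokens that appear in both strings (multiset intersection).
    overlap = sum((exp_tokens & act_tokens).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(act_tokens.values())
    recall = overlap / sum(exp_tokens.values())
    return 2 * precision * recall / (precision + recall)

print(token_f1("the order has shipped", "your order has shipped"))  # 0.75
```

An LLM-as-a-judge score would complement a surface metric like this by grading semantic quality rather than word overlap.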
Add your data set using one of the following options:
- Select Upload file to enter a name and choose the file to use in the evaluation. The file must be in CSV format with a maximum size of 100 KB.
- Select Use existing data to pick a data set that was previously uploaded.
- Select Enter data manually to provide a name and create a data set by entering input variables and optional expected outputs.
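Whichever option you choose, the data set boils down to rows of input variables with optional expected outputs. The sketch below builds such a CSV and checks it against the 100 KB upload limit; the column names `input` and `expected_output` are illustrative, since the actual columns depend on your skill's input variables.

```python
import csv
import os

# Illustrative rows and column names; the real schema depends on the
# input variables your skill expects.
rows = [
    {"input": "What is the return policy?",
     "expected_output": "Items can be returned within 30 days."},
    {"input": "How do I reset my password?",
     "expected_output": "Use the 'Forgot password' link on the sign-in page."},
]

path = "evaluation_data.csv"
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["input", "expected_output"])
    writer.writeheader()
    writer.writerows(rows)

# Uploaded files must be CSV and at most 100 KB.
size = os.path.getsize(path)
assert size <= 100 * 1024, f"File is {size} bytes; exceeds the 100 KB limit"
```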
Click Run evaluation.
The evaluation saves your data and begins running. Processing can take some time, depending on the size of the data in the evaluation. Upon completion, you receive a notification that includes a link to the evaluation.
Navigate to the results through the Evaluation tab or by clicking the link in the notification.