Using the Train Advanced Classifier action
- Updated: 2025/03/03
Use the Train Advanced Classifier action to create a model file. The Classify Document, Classify Pages, and Split Document actions use this model file to sort input documents into the required categories.
Prerequisites
Note: TIFF files are not supported when you train your models using ABBYY OCR. Including TIFF files in your training folder might lead to unintended outcomes, such as data loss or missing files in your training folder. We recommend that you convert TIFF files to PDF before you train your model.
Before building the bot, collect example documents and categorize them into folders.
Ensure the set of example documents meets the following requirements:
- Has at least two categories.
- Has at least 15 files per category. We recommend 20 files per category.
- Has no more than about 150 categories per model file, for optimal performance. There is no hard limit on the number of categories, but classification performance can decline as the training data set and the corresponding model size increase.
- The supported file formats are as follows:
  - .tiff
  - .bitmap
  - .jpeg
  - .png
  - .txt
- We recommend that you provide images with a resolution of 300 dpi (dots per inch). The minimum acceptable resolution is 200 dpi.
Note: If these minimum requirements are not met, an error message is displayed at bot runtime.
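To catch problems before the bot runs, you can check the training folder against the requirements above with a short script. The following Python sketch is illustrative only and is not part of the product. It assumes a layout of one subfolder per category under a single training folder, and it counts only files whose extensions appear in the supported formats list. The function name check_training_folder and the constant names are hypothetical. The script cannot check image resolution.

```python
from pathlib import Path

# Values taken from the requirements listed above; adjust if yours differ.
SUPPORTED_EXTENSIONS = {".tiff", ".bitmap", ".jpeg", ".png", ".txt"}
MIN_CATEGORIES = 2
MIN_FILES_PER_CATEGORY = 15
RECOMMENDED_FILES_PER_CATEGORY = 20
RECOMMENDED_MAX_CATEGORIES = 150


def check_training_folder(root, extensions=SUPPORTED_EXTENSIONS):
    """Return (errors, warnings) for a training folder.

    Assumption: every direct subfolder of `root` is one category.
    """
    root = Path(root)
    errors, warnings = [], []

    categories = sorted(p for p in root.iterdir() if p.is_dir())
    if len(categories) < MIN_CATEGORIES:
        errors.append(
            f"Found {len(categories)} categories; at least {MIN_CATEGORIES} are required."
        )
    if len(categories) > RECOMMENDED_MAX_CATEGORIES:
        warnings.append(
            f"Found {len(categories)} categories; more than "
            f"{RECOMMENDED_MAX_CATEGORIES} can reduce classification performance."
        )

    for category in categories:
        files = [f for f in category.iterdir() if f.is_file()]
        usable = [f for f in files if f.suffix.lower() in extensions]

        # Files in formats outside the list are reported but not counted.
        for f in files:
            if f not in usable:
                warnings.append(f"{f} is not in a listed file format.")

        if len(usable) < MIN_FILES_PER_CATEGORY:
            errors.append(
                f"{category.name}: {len(usable)} files; "
                f"at least {MIN_FILES_PER_CATEGORY} are required."
            )
        elif len(usable) < RECOMMENDED_FILES_PER_CATEGORY:
            warnings.append(
                f"{category.name}: {len(usable)} files; "
                f"{RECOMMENDED_FILES_PER_CATEGORY} are recommended."
            )

        # TIFF files are a problem only when training with ABBYY OCR.
        if any(f.suffix.lower() == ".tiff" for f in usable):
            warnings.append(
                f"{category.name}: contains TIFF files, which are not "
                "supported when training with ABBYY OCR."
            )

    return errors, warnings
```

Call check_training_folder with the path to your training folder and review both returned lists. An empty errors list means that the folder meets the minimum category and file counts checked by this sketch.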
Procedure
Next steps
After creating the model, build a bot to classify input documents. For more information, see Using the Classify Document action.