Advanced Classifier package
- Updated: 2024/07/29
The Advanced Classifier package enables you to classify documents into a folder structure. You can upload the documents from the folders to the respective learning instance for content extraction in Document Automation.
- A valid license from Skilja (a third-party provider) is required to use the Advanced Classifier package and its actions to process documents. For more information, see Configuring bring your own key (BYOK) for Advanced Classifier package.
- You can contact contact@scalehub.com to obtain the license details.
- Download the Advanced Classifier package from the A-People Downloads page (login required). To get this package, click a specific Automation 360 IQ Bot version > Installation Setup and download the package. For example, bot-command-advanced-classifier-<version>.jar.
- You must upload this package to the Control Room. For more information, see Add packages to the Control Room.
- When you use the Advanced Classifier package actions, ensure that the input and output paths do not include the following:
  - Special characters in a sequence. For example, C:\Documents and Settings\user1\My Documents\AdvanceClassifier-_@#!^&()=+-~`][12.
  - Any of these characters in a folder name: comma (,), apostrophe ('), or hash (#).
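The path restrictions above can be checked before a bot runs. The following Python sketch is illustrative only and is not part of the Advanced Classifier package. The function name `path_problems` is hypothetical, and the definition of a "special character" (anything other than a letter, digit, whitespace, or period) is an assumption, because the documentation does not list the exact set. The sketch also checks every component of the path, not only folder names, which is stricter than the rule requires.

```python
import re

# Assumption: a "special character" is any character that is not a letter,
# digit, whitespace, or period. Underscore is treated as special because it
# appears in the documented example sequence.
SPECIAL_RUN = re.compile(r"(?:[^\w\s.]|_){2,}")

# Characters that must not appear in folder names, per the list above.
FORBIDDEN = set(",'#")


def path_problems(path: str) -> list:
    """Return a list of human-readable problems found in a path (hypothetical helper)."""
    problems = []
    # Split on backslashes or forward slashes and drop empty components.
    parts = [p for p in re.split(r"[\\/]+", path) if p]
    for part in parts:
        if SPECIAL_RUN.search(part):
            problems.append(f"sequence of special characters in {part!r}")
        bad = sorted(FORBIDDEN & set(part))
        if bad:
            problems.append(f"forbidden character(s) {' '.join(bad)} in {part!r}")
    return problems


if __name__ == "__main__":
    candidates = [
        r"C:\Input\Invoices",
        r"C:\Documents and Settings\user1\My Documents\AdvanceClassifier-_@#!^&()=+-~`][12",
    ]
    for candidate in candidates:
        print(candidate, "->", path_problems(candidate) or "OK")
```

An empty list means the path passed both checks under these assumptions. A path that passes this sketch might still be rejected by the actions, so treat it as a pre-check rather than a guarantee.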
Actions in the Advanced Classifier package
The package contains the Train Advanced Classifier, Classify Document, Classify Pages, and Split Document actions. You can use these actions to create a model file, and use the file to classify uploaded documents into different folders. These actions work as a precursor to document processing.
Actions | Description |
---|---|
Train Advanced Classifier | Use the Train Advanced Classifier action to create a model file that is used by the Classify Document, Classify Pages, or Split Document actions to sort the documents into required categories for input. For more information, see Using the Train Advanced Classifier action. |
Classify Document | The Classify Document action groups the input documents based on the first page of each document, using the selected model file that is created with the Train Advanced Classifier action. For more information, see Using the Classify Document action. |
Classify Pages | The Classify Pages action groups the pages of an input document based on the model file that was created using the Train Advanced Classifier action and filters out the pages that do not fit the model. For more information, see Using the Classify Pages action. |
Split Document | Use the Split Document action to separate the input document into multiple documents based on the selected classification model. For more information, see Using the Split Document action. |
The Advanced Classifier package leverages Tesseract OCR for image-based classification. For an extensive list of languages supported by Tesseract OCR, see Tesseract OCR Supported Languages.