Advanced Classifier package

The Advanced Classifier package enables you to classify documents into a folder structure. You can then upload the documents from these folders to the respective learning instances for content extraction in Document Automation.

Note:
  • A valid Skilja (third-party provider) license is required to use the Advanced Classifier package and its actions to process documents. For more information, see Configure BYOK (Bring Your Own Key) for the Advanced Classifier package.
  • You can contact contact@scalehub.com to obtain the license details.
  • You can download the Advanced Classifier package from the IQ Bot downloads link.
  • You must upload this package to the Control Room. For more information, see Add packages to the Control Room.
  • When you use the Advanced Classifier package actions, ensure that the input and output paths do not include the following:
    • Special characters in a sequence. For example, C:\Documents and Settings\user1\My Documents\AdvanceClassifier-_@#!^&()=+-~`][12.
    • Any of the following characters in folder names: , ' #
    If you use such paths in the Advanced Classifier package actions, an error occurs when the bot runs.
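The path restrictions above can be checked before a bot runs. The following is a minimal Python sketch, not part of the Advanced Classifier package. The function name find_path_problems is hypothetical, and the reading of "special characters in a sequence" as two or more consecutive characters other than letters, digits, spaces, or periods is an assumption; the package may apply a different rule.

```python
import re
from pathlib import PureWindowsPath

# Characters that the note above says folder names must not contain.
FORBIDDEN_FOLDER_CHARS = {",", "'", "#"}

# Assumption: a "sequence of special characters" is two or more consecutive
# characters that are not letters, digits, spaces, or periods.
# Underscores are counted as special here.
SPECIAL_RUN = re.compile(r"(?:[^\w .]|_){2,}")


def find_path_problems(path: str) -> list[str]:
    """Return a list of problems found in a Windows-style path (hypothetical helper)."""
    parsed = PureWindowsPath(path)
    # Skip the drive/root component (for example "C:\\") when one is present.
    names = parsed.parts[1:] if parsed.anchor else parsed.parts

    problems = []
    for name in names:
        # Every component is checked, including the last one, which may be a file.
        bad = sorted(FORBIDDEN_FOLDER_CHARS & set(name))
        if bad:
            problems.append(f"{name!r} contains forbidden character(s): {' '.join(bad)}")
        if SPECIAL_RUN.search(name):
            problems.append(f"{name!r} contains a sequence of special characters")
    return problems


if __name__ == "__main__":
    for problem in find_path_problems(r"C:\Invoices\Bob's files"):
        print(problem)
```

An empty list means that the path passed both checks under the assumptions stated above. It does not guarantee that the package actions accept the path.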

Actions in the Advanced Classifier package

The package contains the Train Advanced Classifier, Classify Document, Classify Pages, and Split Document actions. You can use these actions to create a model file, and use the file to classify uploaded documents into different folders. These actions work as a precursor to document processing.

  • Train Advanced Classifier: Use the Train Advanced Classifier action to create a model file that the Classify Document, Classify Pages, or Split Document action uses to sort documents into the required categories for input. For more information, see Using the Train Advanced Classifier action.
  • Classify Document: The Classify Document action groups the input documents based on the first page of each document, using the selected model file that was created with the Train Advanced Classifier action. For more information, see Using the Classify Document action.
  • Classify Pages: The Classify Pages action groups the pages of an input document based on the model file that was created with the Train Advanced Classifier action, and filters out the pages that do not fit the model. For more information, see Using the Classify Pages action.
  • Split Document: Use the Split Document action to separate the input document into multiple documents based on the selected classification model. For more information, see Using the Split Document action.

Note: The Advanced Classifier package uses Tesseract OCR for image-based classification. For the full list of languages that Tesseract OCR supports, see Tesseract OCR Supported Languages.