Using the Classify Pages action

The Classify Pages action groups the pages of an input document based on the model file that was created using the Train Advanced Classifier action and filters out the pages that do not fit the model.

Prerequisites

  • If not done already, use the Train Advanced Classifier action to create a model file.
  • Ensure the input files are in the required format.

Build a bot with the Classify Pages action within a Loop action to iteratively classify each file in the selected folder.

Procedure

  1. In the Actions palette, double-click or drag the Classify Pages action from the Advanced Classifier package.
  2. In the Input file field, specify the file path of the incoming file to classify, using one of the following options:
    • Control Room file
    • Desktop file
    • Variable
  3. In the Classifier field, specify the file path of the model file, using one of the following options. You can either select the .zip file directly or extract the .clsproj3 file from it and select that file.
    • Control Room file
    • Desktop file
    • Variable
  4. Use the Output folder path option to specify where the classification output documents are saved. The pages of each classified document are saved in subfolders that correspond to the categories created in the model file.
    • Desktop folder
    • Variable
  5. In the License field, provide a license credential.
  6. If you select the Credential option, click Pick to get a license from the license locker.
  7. Optional: Configure the following:
    Save classification output variable: Save the classification results as a list of dictionaries with the following keys:
    • fileName: Name of the file being processed, with the page index appended. For example, <<file name_pageIndex>>
    • index: Page number of the page within the document, used when the document has multiple pages.
    • category: The category to which the page is assigned after classification. For example, all HR-related documents are placed in one category.
    • confidence: The percentage value that indicates how closely the page matches the assigned category, based on the training data.
    Note:
    • You can select the type of classification in the Advanced Classifier:
      • Image-based classification
      • Text-based classification
      • Both image-based and text-based classification
  8. Click Save and Run.
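
The classification output variable described in step 7 is a list of dictionaries, so it can be post-processed like any other list. The following Python sketch is illustrative only and is not part of the Advanced Classifier package: it assumes the list has been handed to a script with the four keys listed above, that confidence arrives as a numeric string, and that a cutoff of 75 is appropriate. The function name, sample values, and cutoff are all hypothetical.

```python
from collections import defaultdict


def group_pages(results, min_confidence=0.0):
    """Group classified pages by category and set aside low-confidence pages."""
    grouped = defaultdict(list)
    needs_review = []
    for entry in results:
        # Assumption: confidence may arrive as a string such as "87.5".
        confidence = float(entry["confidence"])
        if confidence < min_confidence:
            needs_review.append(entry["fileName"])
        else:
            grouped[entry["category"]].append(entry["fileName"])
    return dict(grouped), needs_review


# Hypothetical sample data using the keys described in step 7.
sample_results = [
    {"fileName": "scan_0", "index": "0", "category": "Invoice", "confidence": "92.4"},
    {"fileName": "scan_1", "index": "1", "category": "Invoice", "confidence": "61.0"},
    {"fileName": "scan_2", "index": "2", "category": "HR", "confidence": "88.1"},
]

grouped, needs_review = group_pages(sample_results, min_confidence=75)
print(grouped)
print(needs_review)
```

Separating low-confidence pages this way gives a list that can be routed to manual review instead of being trusted automatically.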

Next steps

You can use each subfolder of similar documents to create and train a learning instance to extract data from the documents.
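
Before training a learning instance, it can help to check how many pages landed in each category subfolder. The following Python sketch is one possible way to do that outside the bot. It assumes only that the output folder contains one subfolder per category, as described in step 4. The function name, folder names, and the .pdf extension used in the demo are hypothetical.

```python
import tempfile
from pathlib import Path


def count_pages_per_category(output_folder):
    """Return {subfolder name: number of files} for each category subfolder."""
    counts = {}
    for sub in sorted(Path(output_folder).iterdir()):
        if sub.is_dir():
            counts[sub.name] = sum(1 for f in sub.iterdir() if f.is_file())
    return counts


# Demo against a throwaway folder; category and file names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    for category, pages in {"Invoice": 2, "HR": 1}.items():
        folder = Path(tmp) / category
        folder.mkdir()
        for i in range(pages):
            (folder / f"scan_{i}.pdf").touch()
    counts = count_pages_per_category(tmp)

print(counts)
```

A subfolder with very few pages may indicate a category that needs more samples before it is used for training.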