Advanced Classifier overview

The Advanced Classifier package enables you to classify documents into a folder structure. From those folders, you can upload the documents to the corresponding learning instance for extraction in Document Automation.

Use this package to perform the following tasks:

  • Group pages from a document file into different folders based on the layout, content, or both.
  • Group document files into different folders based on the layout, content, or both, of the first page of each document.
  • Split pages from a document file into different folders based on the selected classification model.

This package is optional. You need it only when you have different types of documents that you must group at the document level or page level, or when you must split sets of pages into separate folders.

Note: This package uses Skilja, a third-party classification service, and requires a separate license from Skilja to operate.

Capabilities

The Advanced Classifier package provides the following capabilities:

Train model
Enables you to create a model and train it to classify documents and pages, and to split pages from documents.
Classify documents
Enables you to classify document files into separate folders based on the layout, content, or both, of the first page of each document.
Classify pages
Enables you to separate pages from a document into separate folders and filter out pages based on the layout, content, or both.
Split document
Enables you to split pages from a document into different folders based on the selected classification model. The model is trained on samples of different document patterns and learns where to split by analyzing the first, middle, and last pages of the sample documents.
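
To make the role of first, middle, and last pages more concrete, the following minimal Python sketch shows one way a sequence of per-page labels could be turned into document boundaries. It is an illustration only and does not represent how the Advanced Classifier package or the Skilja service is implemented. The function name split_pages and the label values "first", "middle", and "last" are assumptions made for this example; in practice the labels would come from a trained model rather than a hardcoded list.

```python
from typing import List


def split_pages(page_roles: List[str]) -> List[List[int]]:
    """Group page indexes into documents using per-page role labels.

    Hypothetical helper: `page_roles` holds one assumed label per page
    ("first", "middle", or "last"), in page order.
    """
    documents: List[List[int]] = []
    current: List[int] = []
    for index, role in enumerate(page_roles):
        # A "first" page opens a new document, so close any open one.
        if role == "first" and current:
            documents.append(current)
            current = []
        current.append(index)
        # A "last" page closes the document it belongs to.
        if role == "last":
            documents.append(current)
            current = []
    # Keep trailing pages that never reached a "last" label.
    if current:
        documents.append(current)
    return documents


roles = ["first", "middle", "last", "first", "last", "first", "middle"]
print(split_pages(roles))  # [[0, 1, 2], [3, 4], [5, 6]]
```

Each inner list holds the page indexes of one detected document, which could then be written to its own folder.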

Splitting documents and classifying documents and pages using Advanced Classifier

When is Advanced Classifier necessary beyond Document Classifier?

Use the Advanced Classifier package in the following scenarios:

  • When you need to classify documents or pages based on predefined rules. For example, a healthcare organization needs to classify a document that includes both invoices and purchase orders. The Advanced Classifier applies predefined rules such as keywords: keywords such as invoice, bill, or receipt identify invoices, and keywords such as purchase order, PO, or order identify purchase orders.
  • When you need to split pages from a document based on specific patterns. For example, if a financial firm wants to classify a document that includes invoices, bank statements, tax forms, and receipts, the Advanced Classifier uses patterns such as layout and content to classify the document and separate out the pages of each type.
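
The keyword example in the first scenario can be sketched in a few lines of Python. This is a simplified illustration of rule-based classification in general, not the logic that the Advanced Classifier package uses internally. The names KEYWORD_RULES and classify_text, the label strings, and the "unclassified" fallback are all assumptions introduced for this example; only the keywords themselves come from the scenario above.

```python
import re
from typing import Dict, List

# Hypothetical rule table built from the keywords in the example scenario.
KEYWORD_RULES: Dict[str, List[str]] = {
    "invoice": ["invoice", "bill", "receipt"],
    "purchase_order": ["purchase order", "PO", "order"],
}


def classify_text(text: str, rules: Dict[str, List[str]] = KEYWORD_RULES) -> str:
    """Return the label whose keywords occur most often in `text`."""
    scores: Dict[str, int] = {}
    for label, keywords in rules.items():
        # Count whole-word, case-insensitive matches for each keyword.
        scores[label] = sum(
            len(re.findall(r"\b" + re.escape(keyword) + r"\b", text, flags=re.IGNORECASE))
            for keyword in keywords
        )
    if not scores:
        return "unclassified"
    # On a tie, the label listed first in `rules` wins.
    best = max(scores, key=lambda label: scores[label])
    return best if scores[best] > 0 else "unclassified"


print(classify_text("Invoice 1001: please pay this bill within 30 days"))
print(classify_text("Purchase order PO-7731 for 40 chairs"))
```

Counting matches rather than stopping at the first hit gives a simple way to choose between classes when a page mentions keywords from more than one rule, for example an invoice that refers to an order number.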

To understand the differences between Advanced Classifier and Document Classifier, see Comparing Advanced Classifier and Document Classifier.

A valid Skilja (third-party provider) license is mandatory to use the Advanced Classifier package and its actions to process documents. Contact contact@scalehub.com to obtain the license details. For more information, see Connect your Advanced Classifier license.

You must download the Advanced Classifier package and upload it to the Control Room.