Document Classifier overview

The Document Classifier package groups or classifies documents into appropriate learning instances for content extraction.

Use this package to perform the following tasks:

  • Group pages from a document file into different folders based on the layout, content, or both.
  • Group document files into different folders based on the layout, content, or both, of the first page of each document.

This package is optional. You need it only when you have different types of documents that must be grouped into separate folders at the document level or the page level.

Note: A Classifier (Number of pages) license is required to process documents using this package.

The Document Classifier package provides the following capabilities:

Train model
Enables you to create a model and train the model to classify documents and pages.
Classify documents
Enables you to classify document files into separate folders based on the layout, content, or both, of the first page of each document.
Classify pages
Enables you to split the pages of a document into separate folders and to filter out pages based on the layout, content, or both.
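
The difference between document-level and page-level classification can be illustrated with a short Python sketch. This is a conceptual illustration only, not the package's interface: the function names, the folder names, and the keyword rule in toy_label are all hypothetical, and the keyword rule merely stands in for a trained model.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# A document is modeled as a list of page texts. This is an illustrative
# stand-in, not the package's real data model.
Document = List[str]


def toy_label(page: str) -> str:
    """Hypothetical content rule standing in for a trained model."""
    text = page.lower()
    if "invoice" in text:
        return "invoices"
    if "purchase order" in text:
        return "purchase_orders"
    return "unclassified"


def classify_documents(
    docs: Dict[str, Document], label: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Document level: the first page decides the folder for the whole file."""
    folders: Dict[str, List[str]] = defaultdict(list)
    for name, pages in docs.items():
        if not pages:  # skip empty files; there is no first page to inspect
            continue
        folders[label(pages[0])].append(name)
    return dict(folders)


def classify_pages(
    name: str, pages: Document, label: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Page level: every page is labeled and routed on its own."""
    folders: Dict[str, List[str]] = defaultdict(list)
    for number, page in enumerate(pages, start=1):
        folders[label(page)].append(f"{name}#page{number}")
    return dict(folders)


# Made-up sample input for illustration.
docs = {
    "scan_a.pdf": ["Invoice 001", "Purchase order 77", "cover sheet"],
    "scan_b.pdf": ["Purchase order 12", "Invoice 9"],
}
by_document = classify_documents(docs, toy_label)
by_page = classify_pages("scan_a.pdf", docs["scan_a.pdf"], toy_label)
```

In this sketch, scan_a.pdf lands in a single folder at the document level because only its first page is inspected, while at the page level its three pages are routed to three different folders. Dropping the "unclassified" entry from the page-level result would correspond to filtering out unwanted pages.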

To understand the differences between Advanced Classifier and Document Classifier, see Comparing Advanced Classifier and Document Classifier.