Document pre-processing overview

Our pre-processor package enables you to optimize documents before processing them for data extraction.

This package serves as an initial step in the document processing workflow to prepare documents for effective handling in Document Automation.

The package extracts content from documents, such as barcodes, page count, and page text, or processes image files before Document Automation consumes them. Pre-processing improves the efficiency and accuracy of document processing, which in turn improves data extraction.

Note: Using this package is optional. Use it only when you need to improve the quality of the documents to be processed.

The pre-processor package provides the following capabilities:

Image processing
  • Concatenate images: Combines two images into a single file.
  • Convert images to PDF: Converts an image file to a text-enabled PDF file.
  • Edit image: Crops or resizes an image file.
  • Enhance image: Adds effects, such as grayscale, blur, and sharpen, to an image file.
  • Orient image: Flips or rotates an image file.
Content extraction
  • Get bar codes: Detects and extracts all barcodes in a document.
  • Get document information: Retrieves document information such as filepath, extension, and number of pages.
  • Page content: Extracts text from a specific page in a document.
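
To make the image processing operations listed above more concrete, the following is a minimal conceptual sketch in plain Python. It is not the package's API or implementation. It assumes an image is modeled as a list of rows of (r, g, b) tuples, and all function names are hypothetical. Whether the package concatenates images vertically or horizontally is not stated here, so the vertical stacking shown is also an assumption.

```python
from typing import List, Tuple

# Hypothetical stand-in types: an image is a list of rows of RGB pixels.
Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def concatenate_vertical(top: Image, bottom: Image) -> Image:
    """Stack two images of equal width into one image (assumed layout)."""
    if top and bottom and len(top[0]) != len(bottom[0]):
        raise ValueError("images must have the same width")
    return [row[:] for row in top] + [row[:] for row in bottom]


def crop(img: Image, left: int, upper: int, right: int, lower: int) -> Image:
    """Keep only the pixels inside the box [left, right) x [upper, lower)."""
    return [row[left:right] for row in img[upper:lower]]


def to_grayscale(img: Image) -> Image:
    """Replace each pixel with its luma value (ITU-R BT.601 weights)."""
    result: Image = []
    for row in img:
        new_row: List[Pixel] = []
        for r, g, b in row:
            y = round(0.299 * r + 0.587 * g + 0.114 * b)
            new_row.append((y, y, y))
        result.append(new_row)
    return result


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate_90_clockwise(img: Image) -> Image:
    """Rotate the image a quarter turn clockwise."""
    return [list(col) for col in zip(*img[::-1])]


if __name__ == "__main__":
    RED, BLUE = (255, 0, 0), (0, 0, 255)
    image = [[RED, BLUE]]  # one row, two pixels

    stacked = concatenate_vertical(image, image)
    print(len(stacked), "rows after concatenation")
    print("flipped:", flip_horizontal(image))
    print("rotated:", rotate_90_clockwise(image))
```

In Document Automation itself, these operations are configured as package actions rather than written as code. The sketch only illustrates what each transformation does to the pixel data.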