Using pre-trained document types

A pre-trained document type is a model that has already been trained on a large dataset of similar documents, such as invoices, arrival notices, and bills of lading.

Overview

The pre-training is done either in-house or by third-party providers, so customers do not need to do it themselves. These document types are designed to extract key-value pairs, data from tables, and unstructured information from documents of the same or similar types. Pre-trained document types, or pre-trained models, come with a set of predefined fields that users can select and customize when creating a learning instance.

Use pre-trained document types to achieve the following:

Rapid deployment
Implement document extraction quickly, without spending time creating, training, and deploying custom models.
Improved accuracy
Because these document types are trained on large document sets, they provide higher accuracy than custom document types.

Pre-trained document types are supplied by extraction providers. An extraction provider is a service that specializes in processing specific document types and extracting data from documents based on predefined rules or models.

Automation Anywhere
This extraction service is developed in-house and trained to extract data from documents such as invoices, arrival notices, bills of lading, and documents of similar types. These document types can optionally connect to generative AI services such as Azure OpenAI or Anthropic to further improve data extraction.
Google Document AI
This extraction service is developed by Google and offers pre-trained parsers to extract data from documents such as invoices, receipts, and utility bills. Integrating pre-trained document parsers from Google Document AI in Document Automation allows users to leverage advanced, ready-to-use document processing capabilities.

Support matrix

The following table lists the pre-trained document types supported in Document Automation, the extraction provider for each, and whether a generative AI provider is supported.

Document type     Extraction provider     Generative AI provider
Invoices          Automation Anywhere     Yes
Invoices          Google Document AI      No
Arrival Notice    Automation Anywhere     Yes*
Bill of Lading    Automation Anywhere     Yes*
Packing List      Automation Anywhere     Yes*
Receipts          Google Document AI      No
Utility Bill      Google Document AI      No
Waybill           Automation Anywhere     Yes*

*The generative AI provider option is enabled by default and cannot be disabled for this document type.
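
For scripts that need to check provider support before a learning instance is created, the support matrix can be encoded as a small lookup table. The following is a minimal sketch in Python that mirrors the table above. The names ProviderSupport, SUPPORT_MATRIX, and providers_for are hypothetical, chosen for illustration only, and are not part of any Document Automation API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ProviderSupport:
    """One row of the support matrix (hypothetical structure, for illustration)."""
    provider: str
    generative_ai: bool         # True for "Yes" and "Yes*" in the table
    generative_ai_locked: bool  # True for "Yes*": enabled by default, cannot be disabled


AA = "Automation Anywhere"
GOOGLE = "Google Document AI"

# Data copied from the support matrix in this section.
SUPPORT_MATRIX: Dict[str, List[ProviderSupport]] = {
    "Invoices": [
        ProviderSupport(AA, generative_ai=True, generative_ai_locked=False),
        ProviderSupport(GOOGLE, generative_ai=False, generative_ai_locked=False),
    ],
    "Arrival Notice": [ProviderSupport(AA, True, True)],
    "Bill of Lading": [ProviderSupport(AA, True, True)],
    "Packing List": [ProviderSupport(AA, True, True)],
    "Receipts": [ProviderSupport(GOOGLE, False, False)],
    "Utility Bill": [ProviderSupport(GOOGLE, False, False)],
    "Waybill": [ProviderSupport(AA, True, True)],
}


def providers_for(document_type: str) -> List[ProviderSupport]:
    """Return the providers listed for a document type.

    An empty list means the type is not pre-trained, in which case a
    User-defined document type is the fallback described in this section.
    """
    return SUPPORT_MATRIX.get(document_type, [])


if __name__ == "__main__":
    for row in providers_for("Invoices"):
        print(row.provider, row.generative_ai, row.generative_ai_locked)
```

A lookup that returns an empty list, for example providers_for("Purchase Order"), corresponds to a document type that is not in the matrix.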

Note: If you do not find the document type that you want to use, you can use the User-defined document type to support your use case. See Document types: support matrix.