Create a learning instance in Document Automation

Begin processing documents by creating a learning instance to extract data from invoices, utility bills, or receipts. A learning instance is a structure that holds information such as document type, language, and the fields to be extracted.

Prerequisites

  • To create a learning instance, you must be a Learning instance creator user. See Document Automation users.
  • The default OCR engine is ABBYY FineReader Engine. Alternatively, Cloud Control Room users can create a learning instance that processes documents with Google Vision OCR.

Procedure

  1. From the Control Room home page, navigate to Manage > Learning Instances > Create Learning Instance.
  2. Enter a name and description for the learning instance.
    Document Automation does not allow duplicate learning instance names, so the name you provide must be unique.
  3. Select the document type: Invoice, User-defined, Utility Bill, or Receipt.
    Use the User-defined document type to process documents that are visually similar to invoices, such as purchase orders and sales orders, which contain key-value pairs and a table structure. For this document type, you create and configure all of the form and table fields yourself.
  4. Select the language.
    Document Automation supports English, Dutch, French, German, Italian, Portuguese (Brazilian), and Spanish. For more details, see Languages supported in Document Automation.
    Note: Extraction for German-language documents is currently in preview. Extraction results will improve in future releases.

    If the document type you selected in step 3 was used when configuring a parser, the language chosen during parser configuration is auto-selected. In addition, the locale list displays language options based on the auto-selected language.

  5. If you selected Invoice, select the provider.
    If you selected English in step 4, Automation Anywhere (Pre-trained) is auto-selected.

    If the document type you selected in step 3 was used when configuring a parser, the configured third-party parser is auto-selected as the provider.

  6. Optional: Select the OCR provider. By default, Document Automation processes documents with ABBYY FineReader Engine.
    Users with a Cloud Control Room can choose to process documents with Google Vision OCR instead.
  7. Optional: Select the Improve accuracy using validation option to send feedback to the system and improve extraction results. For more information, see Improving extraction accuracy through validation.
    Note: The Improve accuracy using validation option is not available for the Utility Bill and Receipt document types.
  8. Optional: Select the Generative AI-driven data extraction option to use the generative AI capabilities for extraction. For more information, see Document Automation data extraction using Generative AI.
  9. Click Next.
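The choices made in steps 1 through 9 follow a few rules that can be summarized as a small data model. The following Python sketch is only an illustration of those rules: the class, field, and method names are hypothetical and are not part of any Automation Anywhere API, and the exact-match name comparison is an assumption.

```python
from dataclasses import dataclass

# Values taken from the procedure above. All identifiers are hypothetical
# and exist only for illustration; they are not a product API.
DOCUMENT_TYPES = {"Invoice", "User-defined", "Utility Bill", "Receipt"}
LANGUAGES = {
    "English", "Dutch", "French", "German", "Italian",
    "Portuguese (Brazilian)", "Spanish",
}
OCR_PROVIDERS = {"ABBYY FineReader Engine", "Google Vision OCR"}


@dataclass
class LearningInstanceDraft:
    name: str
    document_type: str
    language: str
    description: str = ""
    ocr_provider: str = "ABBYY FineReader Engine"  # default named in step 6
    improve_accuracy_with_validation: bool = False
    cloud_control_room: bool = False

    def problems(self, existing_names=()):
        """Return rule violations; an empty list means the draft is consistent."""
        issues = []
        if not self.name.strip():
            issues.append("name is required")
        # Assumption: duplicate detection is an exact string match.
        if self.name in existing_names:
            issues.append("name must be unique")
        if self.document_type not in DOCUMENT_TYPES:
            issues.append("unknown document type")
        if self.language not in LANGUAGES:
            issues.append("unsupported language")
        if self.ocr_provider not in OCR_PROVIDERS:
            issues.append("unknown OCR provider")
        if self.ocr_provider == "Google Vision OCR" and not self.cloud_control_room:
            issues.append("Google Vision OCR requires a Cloud Control Room")
        if (self.improve_accuracy_with_validation
                and self.document_type in {"Utility Bill", "Receipt"}):
            issues.append("validation feedback is not available for this document type")
        return issues


draft = LearningInstanceDraft(
    name="Invoices EU", document_type="Invoice", language="English"
)
print(draft.problems())
```

Listing every violation at once, rather than stopping at the first, mirrors how you would review all of the selections on the page before clicking Next.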

We recommend that you open a sample document side by side with the Control Room window as you configure the form and table fields.

Note:
  • A form field is a type of field that occurs only one time in a document.
  • A table field is a type of field that reoccurs throughout a document, typically in the form of a table.
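To make the distinction between the two field types concrete, the following Python sketch models one extracted document. It assumes a simple in-memory representation; the class name, the `column` helper, and the sample field names and values are hypothetical and do not reflect the product's actual output format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ExtractedDocument:
    # Form fields occur once per document, so each name maps to a single value.
    form_fields: Dict[str, str] = field(default_factory=dict)
    # Table fields recur, so each table row is its own name-to-value mapping.
    table_rows: List[Dict[str, str]] = field(default_factory=list)

    def column(self, name: str) -> List[str]:
        """Collect every value of one table field, in row order."""
        return [row[name] for row in self.table_rows if name in row]


# Hypothetical sample data for an invoice-like document.
doc = ExtractedDocument(
    form_fields={"invoice_number": "INV-001", "invoice_date": "2024-01-15"},
    table_rows=[
        {"item_description": "Widget", "quantity": "2"},
        {"item_description": "Gadget", "quantity": "5"},
    ],
)

print(doc.form_fields["invoice_number"])
print(doc.column("quantity"))
```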

  1. Configure the form and table fields for extraction. For more details, see View and search fields.
    1. Click a field to open the fields editor. For more details, see Guidelines to edit the fields and create custom aliases.
    2. Hover over the menu icon to the right of a field to access the up/down arrows.
    3. Use the arrows to rearrange the order of the fields for a more efficient manual validation.
      The order of the fields does not impact extraction.
    To learn more about the other field attributes, see Considerations for form and table fields.
  2. Click Add a field and specify the field details, such as the field name, field label, confidence, data type, and date or number format. For more details, see Considerations for form and table fields.
    The following images show form and table fields configured in a learning instance:
    Form fields of a learning instance

    Table fields of a learning instance
  3. Click Create.
When a new learning instance is created, the Control Room creates a folder with the same name as the learning instance in the Automation > Document Workspace folder. The folder contains two bots (extraction and download), a process, and a form. For more details, see Bots output file and folder structure.

Next steps

Upload documents to the learning instance, fix validation errors, and verify the extracted data: Process documents in Document Automation