Create a learning instance in Community Edition

Create a learning instance to begin processing documents. In Community Edition, you can extract data for supported document types and languages using the ABBYY OCR provider.

Procedure

  1. From the Control Room home page, navigate to AI > Document Automation, and click Create Learning Instance.
  2. Enter a name and description for the learning instance.
    Document Automation does not allow duplicate learning instance names, so the name you provide must be unique.
  3. Select an appropriate document type.
    Note: Use the User-defined document type to process documents that are visually similar to invoices, such as purchase orders and sales orders, which contain key-value pairs and a table structure. In this document type, you create and configure all of the form and table fields.
  4. Select the language.
  5. Optional: Select the Improve accuracy using validation option to send feedback to the system and improve extraction results. For more information, see Improving extraction accuracy through validation.
  6. Optional: Select the Generative AI-driven data extraction option to use the generative AI capabilities for extraction. For more information, see Document Automation - Data extraction using generative AI.

    Generative AI providers offer the following advantages:

    • Efficient processing of large, unstructured documents
    • Support for documents in English and other languages
    Note:
    • When you update from a previous release to v.38 or later, Open AI is set as the default data extraction provider.
    • If you process documents using OpenAI and then switch to Anthropic, only the documents processed after the switch use Anthropic for data extraction. Data for previously processed documents remains as extracted by Azure OpenAI.
    Select one of the following generative AI providers:
    • Open AI: The Azure OpenAI model is used for data extraction. This provider is available through an embedded license (no additional licenses required) and through bring your own license (BYOL).

      If you are using BYOL, ensure that you configure the additional settings for OpenAI in the extraction bot to use this provider. See Extract data action.

    • Anthropic: The Anthropic generative AI models available through AWS and GCP are used for data extraction in Document Automation. This option lets you select the generative AI model based on the cloud provider that your company has certified.

      If you are using BYOL, you must configure the Anthropic Claude model on the Google Vertex AI or Amazon Bedrock service, and then configure the additional settings in the extraction bot to use this provider. See Extract data action.

  7. Click Next.
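
The note about switching providers in step 6 can be pictured with a minimal sketch. The `LearningInstance` class, its methods, and the file names below are hypothetical and exist only for illustration; they are not part of any Document Automation API. The sketch assumes that the provider is recorded once, at the time each document is processed.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LearningInstance:
    """Hypothetical stand-in for a learning instance (not a product API)."""
    name: str
    provider: str = "OpenAI"
    # Maps each processed document to the provider that extracted its data.
    processed: Dict[str, str] = field(default_factory=dict)

    def process(self, document: str) -> None:
        # The provider is recorded at processing time and is not rewritten later.
        self.processed[document] = self.provider

    def switch_provider(self, provider: str) -> None:
        # Switching affects only documents processed from this point on.
        self.provider = provider


instance = LearningInstance(name="invoices-demo")
instance.process("invoice-001.pdf")        # extracted with the original provider
instance.switch_provider("Anthropic")
instance.process("invoice-002.pdf")        # extracted with the new provider

for document, provider in instance.processed.items():
    print(f"{document}: extracted with {provider}")
```

In this model, the first document keeps the data extracted by the earlier provider, and only the second document uses Anthropic.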

We recommend that you open a sample document side by side with the Control Room window as you configure the form and table fields.

Note:
  • A form field is a type of field that occurs only one time in a document.
  • A table field is a type of field that reoccurs throughout a document, typically in the form of a table.
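
As an illustration of the difference between form fields and table fields, the following sketch models the extracted data for one document. The `ExtractedDocument` class, the field names, and the sample values are assumptions made up for this example; they do not reflect the product's actual data format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ExtractedDocument:
    """Hypothetical container for the data extracted from one document."""
    # Form fields occur once per document, so each name maps to a single value.
    form_fields: Dict[str, str] = field(default_factory=dict)
    # Table fields recur, so each row carries its own value for every column.
    table_rows: List[Dict[str, str]] = field(default_factory=list)


# Made-up sample data for an invoice-like document.
invoice = ExtractedDocument(
    form_fields={"invoice_number": "INV-1001", "invoice_date": "2024-03-15"},
    table_rows=[
        {"item_description": "Widget", "quantity": "2", "unit_price": "4.50"},
        {"item_description": "Gasket", "quantity": "10", "unit_price": "0.75"},
    ],
)

print(invoice.form_fields["invoice_number"])
for row in invoice.table_rows:
    print(row["item_description"], row["quantity"])
```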

  1. Configure the form and table fields for extraction. For more details, see View and search fields.
    1. Click a field to open the fields editor. For more details, see Guidelines to edit the fields and create custom aliases.
    2. Hover over the menu icon to the right of a field to access the up/down arrows.
    3. Use the arrows to rearrange the order of the fields for more efficient manual validation.
      The order of the fields does not impact extraction.
    To learn more about the other field attributes, see Considerations for form and table fields.
  2. Click Add a field and specify the field details, such as field name, field label, confidence, data type, date/number format, and so on. For more details, see Considerations for form and table fields.
    Note: If you have selected the Generative AI-driven data extraction option, we recommend that you add clear, specific prompts for the fields when you create the learning instance to get the expected results. See Document Automation - Data extraction using generative AI.
    The following images show form and table fields configured in a learning instance:
    Form fields of a learning instance

    Table fields of a learning instance and adding custom table at learning instance level
    Note: The Add a field option is not available for the Receipts document type.
  3. Click Create.
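
The field details and the reordering described in the steps above can be sketched as follows. The `FieldDefinition` class, the `move_up` helper, and the sample values are hypothetical; in particular, treating confidence as a percentage threshold is an assumption made for this example. The sketch shows that moving a field changes only its position in the list, not the set of fields that are defined.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FieldDefinition:
    """Hypothetical field definition (not a product API)."""
    name: str
    label: str
    data_type: str   # assumed values such as "text", "number", or "date"
    confidence: int  # assumed to be a percentage threshold


def move_up(fields: List[FieldDefinition], name: str) -> List[FieldDefinition]:
    """Return a new list with the named field moved one position up.

    Raises StopIteration if no field has the given name.
    """
    reordered = list(fields)
    index = next(i for i, f in enumerate(reordered) if f.name == name)
    if index > 0:
        reordered[index - 1], reordered[index] = reordered[index], reordered[index - 1]
    return reordered


fields = [
    FieldDefinition("invoice_number", "Invoice No.", "text", 80),
    FieldDefinition("invoice_date", "Invoice Date", "date", 80),
    FieldDefinition("total", "Total", "number", 80),
]

reordered = move_up(fields, "total")
print([f.name for f in reordered])

# The same definitions are present before and after; only the order differs.
print(set(reordered) == set(fields))
```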

Next steps

Upload documents to the learning instance, fix validation errors, and verify the extracted data. See Process documents in Community Edition.