Create learning instance with generative AI for unstructured documents

Use this topic as a guide to create a learning instance that uses the generative AI (GenAI) capability to extract data from unstructured documents such as contracts, agreements, reports, letters, and emails.

With generative AI, you can create a learning instance that extracts data from unstructured documents without first training the learning instance, which speeds up document processing. The following steps show how to create a learning instance with the generative AI capability for data extraction from unstructured documents.

Prerequisites

A Professional Developer performs the following tasks:
  • Create, edit, and delete learning instances
  • Upload documents for processing and testing
  • Check in and check out learning instances between private and public folders

License requirement: A Bot Creator license is required to perform the above tasks.

Assigned roles and permissions:
  • AAE_IQBot Services or AAE_IQBot Admin
  • AAE_Basic

Procedure

  1. Log in to the Control Room, navigate to Manage > Learning Instances, and click Create Learning Instance.
  2. Enter a unique learning instance name so that you can identify it easily in the Learning Instances list, then set the remaining options as follows:
    Figure: Create a learning instance for unstructured documents with the generative AI capability
    1. Description (optional): Add a meaningful description that summarizes the purpose of the learning instance.
    2. Document Type: Unstructured document
      Selecting this option enables generative AI-driven data extraction, which is on by default for unstructured document types.
    3. Language: English
      Currently, only English is supported.
    4. Locale: Select the locale that matches your documents.
      Choose the locale based on the selected language and the country where the document originates.
    5. Provider: Automation Anywhere (User-defined)
      This value is selected by default because it is currently the only option available.
    6. OCR Provider: Google Vision OCR or ABBYY OCR
      Support for ABBYY OCR is new from release v32.
  3. Click Next to begin creating form and table fields for the learning instance.
    For details on creating form and table fields, see: Create a learning instance in Document Automation, steps 10-12.
  4. Click Add a field to add a field for each data point or entity that you want to extract from your documents.
  5. Enter a Field name that is specific to the data point you want to extract and a Field label, which is used to create the default search query, then select a Data type to define the data structure of the field value.
    In the Data type drop-down, you can select Text, Number, Date, or Address.
  6. Set the field to Required or Optional. The Confidence field is grayed out when you use the generative AI capability.
  7. In the Search query for generative AI model section, keep the system-generated query or enter a custom query.
    For example, for a total cost field, the default generative AI query is ‘What is the total cost?’. You can customize the query to ‘What is the total cost? Extract the number without the currency’, which extracts the total cost without the currency information.
    Figure: Create Table fields for a learning instance in Document Automation with generative AI capability
  8. Define the Field Rules and Document Rules for the form and table fields.
    For details on creating table fields and adding Field Rules and Document Rules, see Create a learning instance in Document Automation and References for creating a learning instance in Document Automation.
  9. From v32, you can define multiple tables when you define Table fields. To define an additional table for your use case, click the Add a table icon next to the table field drop-down.
  10. Click Create to complete creating the learning instance.
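
The field settings from steps 5 to 7 can be pictured as a small data model. The Python sketch below is illustrative only: the FieldDefinition class, the default_query helper, and all attribute names are hypothetical and are not part of any Automation Anywhere API. The default query wording is assumed from the ‘What is the total cost?’ example above, so the actual system-generated query may differ.

```python
from dataclasses import dataclass
from typing import Optional

# Data types offered in the Data type drop-down, per the procedure above.
ALLOWED_DATA_TYPES = {"Text", "Number", "Date", "Address"}


def default_query(field_label: str) -> str:
    """Build a default question from a field label (assumed pattern)."""
    return f"What is the {field_label.strip().lower()}?"


@dataclass
class FieldDefinition:
    """Hypothetical stand-in for one form field of a learning instance."""
    name: str                            # specific to the data point to extract
    label: str                           # used to build the default search query
    data_type: str = "Text"
    required: bool = False               # Required (True) or Optional (False)
    custom_query: Optional[str] = None   # overrides the default query when set

    def __post_init__(self) -> None:
        if self.data_type not in ALLOWED_DATA_TYPES:
            raise ValueError(f"Unsupported data type: {self.data_type}")

    @property
    def search_query(self) -> str:
        # A custom query, when given, replaces the default one.
        return self.custom_query or default_query(self.label)


total_cost = FieldDefinition(
    name="total_cost", label="Total cost", data_type="Number", required=True
)
print(total_cost.search_query)  # What is the total cost?
```

Setting custom_query to ‘What is the total cost? Extract the number without the currency’ on the same field would mirror the customization described in step 7.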

Next steps

  1. Publish the learning instance to the public repository so that it can be used in public mode to extract data from real documents and validators can manually validate documents. See Publish the learning instance to production.
  2. On the Manage > Learning Instances page, locate the learning instance you created and published, and click Process to upload documents for processing and data extraction. See Process documents in Document Automation.
  3. Open the CSV file that contains the extracted data and compare it with the processed document to confirm that the fields using generative AI search queries were extracted accurately.
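
For more than a handful of documents, the comparison in step 3 can be scripted. The Python sketch below is one possible approach under stated assumptions: it assumes the exported CSV has a header row, a column named document that identifies each file, and one column per field. The column names, file names, and values are made-up sample data, and the real export layout may differ.

```python
import csv
import io


def field_accuracy(csv_text: str, expected: dict) -> float:
    """Return the share of expected field values that match the CSV exactly.

    expected maps a document name to a dict of {column name: expected value}.
    """
    rows = {row["document"]: row for row in csv.DictReader(io.StringIO(csv_text))}
    total = 0
    matches = 0
    for doc, fields in expected.items():
        row = rows.get(doc, {})
        for column, value in fields.items():
            total += 1
            # Missing documents or columns count as mismatches.
            if (row.get(column) or "").strip() == value:
                matches += 1
    return matches / total if total else 0.0


# Made-up sample standing in for the exported CSV (assumed layout).
sample_csv = (
    "document,total_cost,effective_date\n"
    "contract_a.pdf,1200.50,2024-01-15\n"
    "contract_b.pdf,980,2024-02-01\n"
)

# Values read manually from the source documents (also made up).
expected = {
    "contract_a.pdf": {"total_cost": "1200.50", "effective_date": "2024-01-15"},
    "contract_b.pdf": {"total_cost": "980.00", "effective_date": "2024-02-01"},
}

print(field_accuracy(sample_csv, expected))  # 0.75
```

The exact string comparison is deliberately strict: 980 and 980.00 count as a mismatch here, which flags fields where a more specific custom search query or a normalization step may be worth adding.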