Create a learning instance using Google CDE

A learning instance is a structure that holds information such as the document type, language, and the fields to be extracted. After creating a custom extraction processor, you must create a learning instance to extract data from documents.

Prerequisites

  • Ensure you have successfully created and trained a Google Custom Document Extractor (CDE) processor.
  • Ensure your Control Room has the Document Workspace (Number of pages) product license.
  • Ensure you have configured bring your own key (BYOK). For more information, see Configure bring your own key BYOK for Google CDE.

To use a new Google Document AI processor, you must create a learning instance with the provider set to Google Document AI (User-defined). With this option, you define form and table fields whose names match the field names in the processor.
Note:
  • Currently, Google Document AI supports single table extraction.
  • The check box feature (in preview mode) might extract check box fields inconsistently. If the system cannot accurately extract a check box field value, the value is labeled as Not Found.
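
Because field names in the learning instance must match the names in the processor, it can help to compare the two lists before creating the fields. The following is a minimal sketch, not part of the product: the function name and the sample field names are hypothetical, and it assumes the comparison is exact (including letter case), which you should confirm for your environment.

```python
def compare_field_names(instance_fields, processor_labels):
    """Report names present on one side but not on the other.

    Assumption: names must match exactly, including letter case.
    """
    instance = set(instance_fields)
    processor = set(processor_labels)
    return {
        "missing_in_instance": sorted(processor - instance),
        "missing_in_processor": sorted(instance - processor),
    }


# Hypothetical names, for illustration only.
processor_labels = ["invoice_number", "invoice_date", "total_amount"]
instance_fields = ["invoice_number", "Invoice_Date", "total_amount"]

report = compare_field_names(instance_fields, processor_labels)
print(report)
```

In this example, invoice_date and Invoice_Date are reported as mismatched because they differ in letter case. Two empty lists mean that every name lines up on both sides.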

Procedure

  1. From the Control Room home page, navigate to Manage > Learning Instances > Create Learning Instance.
    The Create Learning Instance window opens in a new tab.
  2. Enter a name for the new learning instance.
  3. From the Document Type drop-down menu, select User-defined.
  4. From the Provider menu, select Google Document AI (User-defined).
  5. Select the type of field to add: Table or Form.
  6. Create new fields with the same names as those used in the Google CDE processor.
    Note: When creating new fields, ensure that their names match the schema labels used in the Google processor. This applies to both form fields and table fields.
  7. Click Create.

    When a new learning instance is created, the Control Room creates a folder with the same name as the learning instance in the Automation > Document Workspace folder.

  8. Update the extraction bot of the learning instance with the Service Account and Processor Endpoint URL.
    1. Open the bot for the learning instance from Bots > IQ Bot Processes > {LI name} > {LI name}_extractionbot.
    2. Select a Credential Vault locker and key. For more information, see Configure bring your own key BYOK for Google CDE.
    3. Copy the prediction endpoint URL from the Google CDE processor.
      Prediction endpoint in Google Document AI
    4. Paste the copied URL in the Document AI endpoint URL for document processor field.

      Document AI endpoint URL for document processor
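
Before pasting the prediction endpoint URL into the extraction bot, you might want to check that it was copied in full. The sketch below is only an illustration: the pattern reflects the commonly seen shape of a Document AI prediction endpoint (a documentai.googleapis.com host with a path ending in :process) and is an assumption, as are the function name and the placeholder URL. Always treat the value shown for your processor in Google Document AI as authoritative.

```python
import re

# Assumed URL shape; adjust if your processor shows a different format.
ENDPOINT_PATTERN = re.compile(
    r"^https://[a-z0-9-]*documentai\.googleapis\.com/"
    r"v1(?:beta\d+)?/projects/(?P<project>[^/]+)/"
    r"locations/(?P<location>[^/]+)/"
    r"processors/(?P<processor>[^/:]+)"
    r"(?:/processorVersions/(?P<version>[^/:]+))?"
    r":process$"
)


def check_endpoint(url):
    """Return the URL parts if the URL looks complete, otherwise None."""
    match = ENDPOINT_PATTERN.match(url.strip())
    return match.groupdict() if match else None


# Placeholder URL, for illustration only.
sample = (
    "https://us-documentai.googleapis.com/v1/projects/my-project"
    "/locations/us/processors/abc123:process"
)
parts = check_endpoint(sample)
print(parts)
```

A result of None suggests that the URL was truncated or copied from the wrong place, for example if the trailing :process is missing.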

Next steps

Upload documents to the learning instance, fix validation errors, and verify the extracted data. For more information, see Process documents in Document Automation.