Using the Extract action for Google Document AI

Configure the Extract action to enable your bot to send documents to Google Document AI for data extraction and retrieve the output in JSON format.

Prerequisites

Log in to your Google Cloud account and go to the Processors page to retrieve your custom endpoint. See Use your processor endpoint.

Your custom endpoint should follow this format: https://LOCATION-documentai.googleapis.com/API_VERSION/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID. You will need the PROJECT_ID, PROCESSOR_ID, and LOCATION values to configure this action.
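
If you prefer not to copy the values by hand, the endpoint format above can be split programmatically. The following Python sketch is illustrative only and is not part of the product: the function name parse_endpoint and the sample endpoint values are assumptions, and the pattern assumes the endpoint matches the format shown above (optionally followed by a suffix such as :process).

```python
import re

# Pattern derived from the endpoint format shown above.
# Assumption: the processor ID may be followed by a method suffix such as ":process".
ENDPOINT_PATTERN = re.compile(
    r"^https://(?P<host_location>[a-z0-9-]+)-documentai\.googleapis\.com/"
    r"(?P<api_version>[^/]+)/projects/(?P<project_id>[^/]+)/"
    r"locations/(?P<location>[^/]+)/processors/(?P<processor_id>[^/:]+)"
)


def parse_endpoint(endpoint: str) -> dict:
    """Return the three values the Extract action asks for (hypothetical helper)."""
    match = ENDPOINT_PATTERN.match(endpoint.strip())
    if match is None:
        raise ValueError("Endpoint does not match the expected format")
    return {
        "project_id": match["project_id"],
        "processor_id": match["processor_id"],
        "location": match["location"],
    }


# Made-up endpoint used only for illustration.
example = (
    "https://us-documentai.googleapis.com/v1/projects/my-project"
    "/locations/us/processors/abc123:process"
)
print(parse_endpoint(example))
```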

Procedure

  1. Double-click or drag the Google Document AI > Extract action.
  2. In the Document file path field, provide the file path to the document you want to process.
    Note: If you use this action within a Loop action to process all the documents in a folder, include a period between the variable holding the file name and the variable holding the extension. For example, C:\Documents\$dictFile(name)$.$dictFile(extension)$.
  3. Provide the following information, which is found in your custom endpoint.
    • Project id
    • Processor id
    • Location
  4. In the Session name field, enter the name of the session you used to connect to the Google service account in the Connect action.
  5. Optional: Select or create a string variable to hold the output.
    The action returns data in JSON format.
  6. Click Save.
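
The note in step 2 about the period between the file name and the extension can be illustrated outside the bot. The Python sketch below is only an analogy for how the two variables combine into one path. The function name build_document_path and the sample values are assumptions, not product features.

```python
from pathlib import PureWindowsPath


def build_document_path(folder: str, name: str, extension: str) -> str:
    """Join folder, file name, and extension (hypothetical helper).

    Mirrors C:\\Documents\\$dictFile(name)$.$dictFile(extension)$:
    the period between name and extension must be added explicitly.
    """
    return str(PureWindowsPath(folder) / f"{name}.{extension}")


# Sample values chosen for illustration only.
print(build_document_path(r"C:\Documents", "invoice", "pdf"))
```

Omitting the period would yield a file name such as invoicepdf, which does not point to the intended document.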

Next steps

You can use the actions in the JSON package to parse the data and extract values from specific nodes. For an overview of how to do this, refer to the following steps:
  1. Initiate the JSON session with the Start session action. In the JSON text field, insert the string variable holding the output of the Extract action.
  2. Use the Get node value action to parse the output of the Google Document AI > Extract action and assign the node values to a list variable.

    You can insert a Loop action after the Get node value action to iterate through each list item to perform an operation on each node value.

  3. Terminate the JSON session with the End session action.
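
To make the parsing steps above concrete, the following Python sketch mimics them outside the bot: load the JSON text, collect the values of one node into a list, and loop over the list. It is a minimal sketch under stated assumptions. The sample JSON is hypothetical and only loosely modeled on a Document AI response (a document object with an entities list); the actual output of the Extract action may be structured differently, so inspect your own output before choosing node names. The function name get_node_values is also an assumption.

```python
import json

# Hypothetical sample standing in for the string variable that holds
# the Extract action output. Real output may have a different shape.
SAMPLE_OUTPUT = """
{
  "document": {
    "text": "Invoice 1001 Total 42.50",
    "entities": [
      {"type": "invoice_id", "mentionText": "1001"},
      {"type": "total_amount", "mentionText": "42.50"}
    ]
  }
}
"""


def get_node_values(json_text: str, node: str = "mentionText") -> list:
    """Collect the value of `node` from every entity (hypothetical helper)."""
    data = json.loads(json_text)  # comparable to starting the JSON session
    entities = data.get("document", {}).get("entities", [])
    # Comparable to Get node value: assign the node values to a list.
    return [entity[node] for entity in entities if node in entity]


# Comparable to a Loop action that iterates through each list item.
for value in get_node_values(SAMPLE_OUTPUT):
    print(value)
```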