Extract data action

Task Bots use the Extract data action to process documents uploaded to Document Automation.

When you create a learning instance in Document Automation, the Control Room automatically creates the extractionbot. To edit this bot, navigate to Automation > Document Workspace Processes, select the folder with the same name as the learning instance, and open extractionbot.

The following table describes the action fields.
Note:
  • We do not recommend changing the variables in these fields as that might break the process.
  • A bot fails with an error at run time if it contains both of the following actions:
    1. The Classify Document action in the Advanced Classifier package, used to classify a document.
    2. The Extract data action in the Document Extraction package, used with a learning instance of the Unstructured document type to extract data from a document.

    Therefore, we recommend that you do not use the Classify Document action and an Extract data action that uses a learning instance of the Unstructured document type in the same bot.

  • When you use the Extract data action along with the IQ Bot Pre-processor, Document Classifier, or OCR actions in a single bot, the bot fails. However, you can combine the IQ Bot Pre-processor, Document Classifier, and OCR actions with one another in a single bot.

    Workaround: Place the Extract data action of the Document Extraction package in a separate bot from any IQ Bot Pre-processor, Document Classifier, or OCR actions. If you need to run these bots in sequence, include them in an Automation Co-Pilot process.

Field Description
Document to extract File path to the uploaded document.
Learning instance name Name of the learning instance associated with this bot.
Output results Specify where to store the Document Automation data. Based on your use case, you can either upload the data to the Document Automation server, or save it to your local folder.
  • Upload to server: Data generated during extraction is uploaded to the server for further processing (such as validation) and later downloaded by a bot running the Download data action.
  • Save to a local folder: Data generated by Document Automation is not sent to the server, but is saved to the specified folder path.
    Note: If you select this option, Document Automation sends files for validation and increments the validation queue. However, you cannot view the document in the Automation Co-Pilot validator because there is no associated Automation Co-Pilot request. You can also remove the bot that runs the Download data action from the process, because this option makes that step of the process redundant.
Additional settings See Additional settings.
(Optional) Save responses as record Select one of the following tabs for the destination record variable:
  • Multiple variables: To store the output in multiple variables, provide each key and the variable to which the key is mapped. The variable can be of a type such as String, Number, Datetime, or Boolean. For example, if your source record variable contains two entries, name and contact number, you can store the output as follows:
    Key             Map to variable
    Name            StrName
    Contact number  MobileNo

    The variables StrName and MobileNo are String and Number type variables respectively.

  • Record: To store the output in a record-type variable, click the drop-down menu and select an existing variable or create one.
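
The key-to-variable mapping in the Multiple variables tab can be pictured with a short Python analogy. This is only a sketch to illustrate the idea: the real mapping is configured in the bot editor, and the function name, the sample values, and the assumption that extracted values arrive as text are all hypothetical.

```python
# Illustrative analogy only: the actual mapping is set up in the bot editor, not in code.

def map_keys_to_variables(record, key_map):
    """Copy each mapped key's value from the record into the named variable."""
    variables = {}
    for key, variable_name in key_map.items():
        if key in record:  # skip keys the record does not contain
            variables[variable_name] = record[key]
    return variables


# Hypothetical sample record; the values are made up for illustration.
source_record = {"Name": "Jane Doe", "Contact number": "5550100"}

# Key-to-variable mapping taken from the example table above.
key_map = {"Name": "StrName", "Contact number": "MobileNo"}

variables = map_keys_to_variables(source_record, key_map)

# MobileNo is a Number variable in the example, so the text value is converted
# (assumption: the extracted value arrives as a string).
variables["MobileNo"] = int(variables["MobileNo"])

print(variables)
```

Each key that has no mapping is simply left out, which mirrors the fact that only the keys you list in the tab are stored.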

Additional settings

The following table lists the additional settings available in the Extract data action: None (the default), Google Document AI, Microsoft OpenAI, Anthropic, and IQ Bot. For a setting that uses an external service, you must enter the license credentials for that service.

Additional settings Description
None This is the default option for your extractionbot. When you do not want to use external connections, you can select the None option.
Google DocAI
  • Service account: Use the Credential, Variable, or Insecure string option to enter the license credentials that contain your Google Document AI security token. If you do not want to use your own credentials, select the None option.
  • Endpoint URL for document processor: Provide the URL for your service account.
  • Cloud Storage bucket name (optional): Provide the Cloud Storage bucket name. If a document contains more than 10 pages, the input file and extraction results are stored temporarily in this bucket.
MS OpenAI
  • Service account for GPT: Use the Credential, Variable, or Insecure string option to enter the license credentials that contain your AI security token for queries. If you do not want to use your own credentials, select the None option.
  • Endpoint URL for GPT model: Provide a URL to a document processing endpoint. For example: https://{your-resource-name}.openai.azure.com/openai/deployments/{deployment-id}/chat/completions?api-version={api-version}
  • Service account for embeddings: Use the Credential, Variable, or Insecure string option to enter the license credentials that contain your Microsoft OpenAI security token for embeddings. You can also use your own credentials.
  • Endpoint URL for ADA model: Provide a URL to a document processing endpoint. For example: https://{your-resource-name}.openai.azure.com/openai/deployments/{deployment-id}/embeddings?api-version={api-version}
Anthropic See Extract data using Anthropic models.
IQ Bot (Optional)
  • Group Label (optional): If the learning instance was created in Automation 360 IQ Bot and connected to Document Automation, this field is auto-filled (variable) with the applicable document group name.
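
The two MS OpenAI endpoint URLs in the table share one template and differ only in the final path segment (chat/completions for the GPT model, embeddings for the ADA model). The following Python sketch fills in that template so you can check a URL before pasting it into the action. The function name and all sample values (resource name, deployment IDs, API version) are hypothetical placeholders; substitute the values from your own Azure OpenAI resource.

```python
# Sketch: fill in the endpoint URL template shown in the MS OpenAI rows.
# The helper name and the sample values below are placeholders, not real resources.

URL_TEMPLATE = (
    "https://{resource_name}.openai.azure.com/openai/deployments/"
    "{deployment_id}/{operation}?api-version={api_version}"
)

ALLOWED_OPERATIONS = ("chat/completions", "embeddings")


def build_endpoint_url(resource_name, deployment_id, api_version, operation):
    """Return the endpoint URL for the given deployment and operation."""
    if operation not in ALLOWED_OPERATIONS:
        raise ValueError(f"operation must be one of {ALLOWED_OPERATIONS}")
    parts = {
        "resource_name": resource_name,
        "deployment_id": deployment_id,
        "api_version": api_version,
    }
    for name, value in parts.items():
        # Reject empty values and unreplaced placeholders such as "{deployment-id}".
        if not value or "{" in value or "}" in value:
            raise ValueError(f"{name} is empty or still contains a placeholder")
    return URL_TEMPLATE.format(operation=operation, **parts)


# Hypothetical sample values for illustration only.
RESOURCE_NAME = "my-resource"
API_VERSION = "2024-02-01"

gpt_url = build_endpoint_url(
    RESOURCE_NAME, "my-gpt-deployment", API_VERSION, "chat/completions"
)
ada_url = build_endpoint_url(
    RESOURCE_NAME, "my-embeddings-deployment", API_VERSION, "embeddings"
)

print(gpt_url)
print(ada_url)
```

The placeholder check is there because a common mistake is to paste the template with one of the {…} parts left unreplaced.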