Automation 360

Add Document Extraction Task to a process automation


Updated: 2025/12/12

As a Citizen Developer or Pro Developer, you can add a Document Extraction task to any process to extract data from documents. Extraction can be done using Task bots (via bot runners) or the Automation Anywhere Cloud Extraction service.

You can configure the task using Process Composer to customize and control your Document Automation workflows. See Automation Co-Pilot for Business Users process in Document Automation for details.

Procedure

Drag the Document Extraction task from the Element panel into your process.
In the Document Extraction panel, configure:
1. Element ID (for example, DocumentExtraction).
2. Task name (for example, $input[InputFileName]$), which is displayed as a reference in the UI.
3. Choose a Source selection. You have two options to process documents:
  - Option A: Task bot (default) to extract documents using bot runners.
  - Option B: Cloud extraction to extract documents using the Automation Anywhere Cloud Extraction service.
  Option A: Process Document using Task bot
  1. Select Task bot.
    Note: The Task bot must have the Extract Data action from the Document Extraction package.
  2. (Optional) Click Preview bot before implementation.
  3. Set the Queue timeout (1 minute to 24 hours). If the bot does not start within the specified time frame, the timeout prevents it from delaying or stalling the process.
  4. Check Input values and assign variables. The input fields are determined by the chosen Task bot. If you selected the pre-built Document Automation extraction bot, the following input fields are expected (this list can vary depending on the version in which the learning instance was created):
    - InputFilePath: $inputFile (A desktop path or, as recommended, a File object that is passed through the Create Request action.)
    - LearningInstanceName: $input[LearningInstancename]$ (In this example, it is the name of the learning instance used when Document Automation generates a bot automatically.)
    - Version: $input[Version]$ (Optional variable; in this example, it is used to pass a learning instance version used in Test mode.)
    - ReferenceID: $CopilotRefId$ (Optional variable; in this example, it is the ID used to track the document extraction results across versions used when Document Automation generates a bot automatically.)
  5. As a Citizen Developer or Pro Developer, you can select how the Task bot executes: locally on the request creator's desktop or remotely.
    From the Bot Task execution mode drop-down, you can select the following modes:
    
    Remote execution (default): Local bot runs remotely based on your Global/Process Scheduler settings and generates a corresponding entry in the Audit log. With remote execution, automations can run independently while users work on other tasks and are notified in Automation Co-Pilot when the automations have completed.
    
    Local execution (main window): Local bot executes on the main window of the request creator's device and generates a corresponding entry in the Audit log. The request must be created by a user who has an attended license and has selected a default device; otherwise, the Bot Task execution mode defaults to Remote execution. Local execution enables sensitive data to remain on-premises and incurs no queue times on local desktops.
    
    Local execution (child window): Local bot executes on a child window of the request creator's device and generates a corresponding entry in the Audit log. The request must be created by a user who has an attended license and has selected a default device; otherwise, the Bot Task execution mode defaults to Remote execution. Local execution enables sensitive data to remain on-premises and incurs no queue times on local desktops. This mode also lets the user keep working on the main desktop during execution.
  Option B: Process Document using Automation Anywhere Cloud Extraction Service
  1. Select Cloud extraction to extract documents using the Automation Anywhere Cloud Extraction service. See Automation Anywhere Cloud Service for more details.
  2. Enter the Input file. We recommend that you enter a File object reference ($InputFile$) that is passed through a Create Request action ($ProcessRequest{input}{InputFile}$).
  3. Enter the Learning Instance name. This is the name of the learning instance in Document Automation, for example, $ProcessRequest{input}{LearningInstanceName}$.
  4. (Optional) Enter the Learning Instance version. This field is usually filled in automatically by Document Automation, and we do not recommend changing it. If you are not sure, leave it empty.
  5. (Optional) Enter the Document Extraction package version. No setup is needed to use the latest version. To process data with an older version, enter that version.
    
    Note: If you leave these fields empty, the newest Test Mode version and the newest Document Extraction package version are used automatically.
In the Data privacy tag field (optional), add a string or variable to tag sensitive output as hidden.

Click Save to finish.
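
The Queue timeout range and the execution-mode fallback described in the procedure can be summarized in a short Python sketch. It is illustrative only and is not part of Automation 360: the names ExecutionMode, validate_queue_timeout, and resolve_execution_mode are hypothetical, and the sketch assumes that the 1 minute and 24 hour bounds are inclusive.

```python
from datetime import timedelta
from enum import Enum


class ExecutionMode(Enum):
    """Bot Task execution modes listed in the procedure."""
    REMOTE = "Remote execution"
    LOCAL_MAIN = "Local execution (main window)"
    LOCAL_CHILD = "Local execution (child window)"


# Assumption: both bounds of the documented range are inclusive.
MIN_QUEUE_TIMEOUT = timedelta(minutes=1)
MAX_QUEUE_TIMEOUT = timedelta(hours=24)


def validate_queue_timeout(timeout: timedelta) -> timedelta:
    """Return the timeout unchanged if it lies within 1 minute to 24 hours."""
    if not (MIN_QUEUE_TIMEOUT <= timeout <= MAX_QUEUE_TIMEOUT):
        raise ValueError("Queue timeout must be between 1 minute and 24 hours")
    return timeout


def resolve_execution_mode(
    requested: ExecutionMode,
    has_attended_license: bool,
    has_default_device: bool,
) -> ExecutionMode:
    """Fall back to Remote execution when a local mode's prerequisites are not met."""
    if requested is ExecutionMode.REMOTE:
        return requested
    if has_attended_license and has_default_device:
        return requested
    return ExecutionMode.REMOTE


if __name__ == "__main__":
    print(validate_queue_timeout(timedelta(minutes=30)))
    # No default device selected, so the local mode falls back to remote.
    print(resolve_execution_mode(ExecutionMode.LOCAL_CHILD, True, False).value)
```

In this sketch, a local mode is honored only when the request creator has both an attended license and a default device, which mirrors the fallback rule stated for both local execution modes.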

You have now configured the Document Extraction task in your process automation. After the task is done:

If you use an auto-generated process and Task Bots, the output variable is set to ExtractionBotOutput; in such cases, you can add the reference $DocumentExtraction{output}{ExtractionBotOutput}{Status}$ in the Status field.

If you used Cloud extraction for document extraction, the task closes when the event is finished. To reference the output from the Document Extraction task, use the output name ExtractionBotOutput, which is the same name that the auto-generated Task Bot uses. Therefore, the complete reference is

$DocumentExtraction{output}{ExtractionBotOutput}{Status}$

provided the task name is DocumentExtraction. The Document Extraction task includes these output fields:


Output Field	Description	Possible Values
`DocumentID`	Unique ID for the processed document	N/A
`Status`	Current status of the document	`DW_EXTRACT_SUCCESS`: Document extraction task is completed; `DW_EXTRACT_FAILURE`: Document extraction task has failed; `DW_EXTRACT_VALIDATION`: Document extraction task completed successfully, but the document contains validation errors.
`StatusCode`	Status result after execution	N/A
`StatusMessage`	Explanation of the status code	N/A
`ErrorMessage`	Description of failure reason	N/A
`ErrorModule`	Indicates the provider for which the error occurred	Possible options are: Native V8 DocAI Classic(IQBot) StandardForm
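
As a rough illustration of how the output reference and the three Status values might be used downstream, the following Python sketch builds the reference string and maps each status to a next step. It is not an Automation 360 API: the function names and the step names (continue_process, route_to_validation, handle_failure) are hypothetical, and the sketch assumes that references are delimited by $ like the other variables on this page.

```python
# Status values documented in the output fields table, mapped to
# hypothetical next steps in a process.
STATUS_ROUTES = {
    "DW_EXTRACT_SUCCESS": "continue_process",
    "DW_EXTRACT_VALIDATION": "route_to_validation",
    "DW_EXTRACT_FAILURE": "handle_failure",
}


def build_reference(
    element_id: str,
    field: str,
    output_name: str = "ExtractionBotOutput",
) -> str:
    """Build a reference such as $DocumentExtraction{output}{ExtractionBotOutput}{Status}$.

    Assumption: the reference is wrapped in $ delimiters.
    """
    return f"${element_id}{{output}}{{{output_name}}}{{{field}}}$"


def route_for_status(status: str) -> str:
    """Return the hypothetical next step for a documented Status value."""
    try:
        return STATUS_ROUTES[status]
    except KeyError:
        raise ValueError(f"Unexpected extraction status: {status!r}") from None


if __name__ == "__main__":
    print(build_reference("DocumentExtraction", "Status"))
    print(route_for_status("DW_EXTRACT_VALIDATION"))
```

Rejecting unknown status values explicitly, instead of treating them as success, keeps a document from silently continuing through the process if a new status is introduced.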

Customer Use Case: Invoice Processing Automation with Cloud Document Extraction

This use case is for the Acme Manufacturing Corporation.

Challenge: Manual invoice data entry from hundreds of daily vendor submissions was error-prone, slow, and resource-intensive.
Business Goal: Automate the extraction of invoice data using Automation Anywhere's Cloud Extraction service, improving accuracy and reducing turnaround time.

Solution Overview and Workflow:

Acme implemented a document automation workflow in Automation 360 using the Cloud Extraction service to extract and process invoice data in real time.

Key components:

Vendor invoices submitted via email or upload portal.
Cloud-based extraction of structured/unstructured data.
Seamless integration into the company’s ERP system (SAP).

Document extraction use case

Trigger: Vendor uploads an invoice (PDF) to a secure portal.
Bot Initiation: Process automation as defined in Process Composer is triggered.
Cloud Extraction Task:
- The automation includes a Document Extraction task using Cloud extraction.
- The input file is passed as a File object ($InputFile$).
- The system references the trained Learning Instance called Invoice_AI_Model and
Cloud Processing:
- The document is uploaded to the Automation Anywhere Cloud.
- AI extracts invoice fields: Invoice Number, Vendor Name, PO Number, Line Items, Amount, and Due Date.
Validation (Optional):
If configured, extracted data is routed to a human validator for low-confidence entries.
Integration: Upon validation or auto-approval, structured data is pushed to SAP using an API integration.
Audit and Notification:
- Output variables like DocumentID, Status, and StatusMessage are logged.
- Finance team receives an Automation Co-Pilot notification with summary and exceptions (if any).
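
The optional validation step in this workflow can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: this page does not describe how confidence is exposed, so the ExtractedField structure, the 0.0 to 1.0 confidence scale, the 0.8 threshold, and the action names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExtractedField:
    """Hypothetical shape of one extracted invoice field."""
    name: str
    value: str
    confidence: float  # assumed score between 0.0 and 1.0


def fields_needing_review(
    fields: List[ExtractedField], threshold: float = 0.8
) -> List[str]:
    """Return the names of fields whose confidence is below the threshold."""
    return [f.name for f in fields if f.confidence < threshold]


def next_action(fields: List[ExtractedField], threshold: float = 0.8) -> str:
    """Route to a human validator if any field is low-confidence, else auto-approve."""
    if fields_needing_review(fields, threshold):
        return "human_validation"
    return "push_to_erp"


if __name__ == "__main__":
    invoice = [
        ExtractedField("Invoice Number", "INV-1001", 0.97),
        ExtractedField("Due Date", "2025-01-31", 0.62),
    ]
    print(fields_needing_review(invoice))
    print(next_action(invoice))
```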

The following table shows the Business Impact and Metrics for Acme Manufacturing Corp.:


Metric	Before Automation	After Cloud Extraction
Average Invoice Handling Time	10 minutes	1.5 minutes
Data Entry Errors	~5%	<0.2%
Monthly Cost	$8,000 (manual labor)	$1,200 (Bot+Cloud cost)
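
To put the table in relative terms, the following Python sketch computes the percentage reduction for each metric from the before and after values. The helper name percent_reduction is hypothetical. Because the error rates in the table are approximate (~5% and <0.2%), the error figure is a rough estimate, not an exact result.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the reduction from before to after as a percentage, to one decimal place."""
    if before <= 0:
        raise ValueError("before must be positive")
    return round((before - after) / before * 100, 1)


# (before, after) pairs taken from the table above.
metrics = {
    "Average invoice handling time (minutes)": (10, 1.5),
    "Data entry errors (%)": (5, 0.2),
    "Monthly cost (USD)": (8000, 1200),
}

if __name__ == "__main__":
    for name, (before, after) in metrics.items():
        print(f"{name}: {percent_reduction(before, after)}% reduction")
```

With these inputs, handling time and monthly cost each drop by 85%, and the error rate drops by roughly 96%.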

Key Benefits

No infrastructure needed: Extraction handled securely in the cloud.
AI-powered accuracy: Consistent extraction across varied invoice formats.
Scalable & flexible: Handles peak loads (such as the end-of-month rush).
Audit-ready: Full tracking of status, messages, and errors.