The Cloud Extraction Service architecture outlines the document processing data flow, detailing the stages of uploading, extracting, and downloading data using various components and third-party services.

The architecture described below provides a high-level overview of the phases involved in the document processing data flow.

Note:
  • Bring your own key (BYOK) is not supported in Cloud Extraction Service.
  • For information about how documents are stored in Document Automation, see Document Automation security FAQ.

The following image shows an overview of the different components used in Cloud Extraction Service:

Diagram of document processing workflow when using the Cloud Extraction Service feature
Customer network
The customer network is where the data extraction process is initiated.
  • Input: Supplies the documents that need to be processed for data extraction.
  • Upload bot: Uploads the documents to the Control Room Cloud storage services.
  • Download bot: Downloads the information extracted from the documents.
  • Output: Stores the extracted information.
Automation Anywhere Cloud
  • Control Room (Cloud only): Orchestrates the data extraction process by acknowledging requests from the Bot Runner device.
  • Cloud Extraction Service: Extracts information from documents by sending requests to OCR and third-party Cloud services. All data extraction requests are sent and received via Cloud Extraction Service.
Third-party Cloud services
  • Google Vision OCR: Converts documents into a machine-readable format. The documents are processed for OCR on Google Cloud.
  • LLM providers: Extract data using third-party generative AI models.
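
As a reading aid, the component overview above can be written down as a small lookup table. The sketch below is illustrative only: the zone names and component labels are copied from this page, while `ZONES`, `zone_of`, and `crosses_boundary` are hypothetical helper names and not part of any Automation Anywhere API.

```python
# Illustrative model of the component overview; not a product API.
ZONES = {
    "Customer network": ["Input", "Upload bot", "Download bot", "Output"],
    "Automation Anywhere Cloud": ["Control Room", "Cloud Extraction Service"],
    "Third-party Cloud services": ["Google Vision OCR", "LLM providers"],
}


def zone_of(component: str) -> str:
    """Return the zone that hosts the given component."""
    for zone, components in ZONES.items():
        if component in components:
            return zone
    raise KeyError(f"Unknown component: {component}")


def crosses_boundary(source: str, target: str) -> bool:
    """Return True when a hop between two components leaves its zone."""
    return zone_of(source) != zone_of(target)


if __name__ == "__main__":
    print(crosses_boundary("Upload bot", "Control Room"))  # True
    print(crosses_boundary("Control Room", "Cloud Extraction Service"))  # False
```

A helper like `crosses_boundary` can make it easier to see which hops move a document out of the customer network or out of Automation Anywhere Cloud.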

Data extraction using generative AI providers

The following image shows the end-to-end data flow through different components for generative AI providers:

Data flow diagram for generative AI providers when using the Cloud Extraction Service feature

The following sections describe the stages of the data flow when using generative AI providers:

Stage 1: Uploading files to the Control Room

Data flow diagram showing file upload to Control Room

Either the user uploads files to the Control Room, or a scheduler bot uploads the files from a shared location. The files are stored temporarily in the Control Room storage services.
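
The scheduler-bot path can be pictured with a minimal sketch. It assumes a local folder stands in for the shared location and a second folder stands in for the temporary Control Room storage. In the product the upload is performed by bots, so `stage_files` is a hypothetical helper for illustration only.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def stage_files(shared_location: Path, staging_dir: Path) -> list[Path]:
    """Copy every file found in the shared location into the staging area."""
    # Assumption: the staging directory plays the role of temporary storage.
    staging_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for source in sorted(shared_location.iterdir()):
        if source.is_file():
            # copy2 returns the destination path of the copied file.
            staged.append(Path(shutil.copy2(source, staging_dir / source.name)))
    return staged
```

Calling `stage_files(Path("shared"), Path("staging"))` would return the list of staged copies, leaving the originals in the shared location untouched.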

Stage 2: Document extraction process

Data flow diagram of document extraction process

The Control Room initiates the data extraction process using either the Automation Anywhere pre-trained models or third-party Cloud extraction services.

  • Automation Anywhere pre-trained models: Data extraction is processed using Cloud Extraction Service.
  • Document Automation subscriptions: Data extraction requests are sent to, and responses are received from, the third-party Cloud extraction services via the Automation Anywhere proxy gateway.
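
The two options above differ only in where the request travels. The following sketch models that choice under stated assumptions: `ExtractionOption` and `extraction_path` are hypothetical names, and the hop lists are a simplification of the description on this page rather than a specification of the service.

```python
from __future__ import annotations

from enum import Enum


class ExtractionOption(Enum):
    PRETRAINED_MODELS = "Automation Anywhere pre-trained models"
    SUBSCRIPTION = "Document Automation subscriptions"


def extraction_path(option: ExtractionOption) -> list[str]:
    """Return the ordered hops a request takes for the given option."""
    if option is ExtractionOption.PRETRAINED_MODELS:
        # Pre-trained models are processed by Cloud Extraction Service.
        return ["Control Room", "Cloud Extraction Service"]
    # Subscriptions reach third-party services through the proxy gateway.
    return [
        "Control Room",
        "Automation Anywhere proxy gateway",
        "Third-party Cloud extraction service",
    ]
```

For example, `extraction_path(ExtractionOption.SUBSCRIPTION)` returns a path whose last hop lies outside Automation Anywhere Cloud, while the pre-trained path does not.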

Stage 3: Downloading output

Data flow diagram of downloading output

The data extraction results are downloaded as CSV or JSON to the network path defined by the user. Customers typically create bots to upload this information to downstream applications or systems of record.
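
A downstream step might begin by reading the downloaded files. The sketch below is one possible approach, not the product's method. It assumes the results carry `.json` and `.csv` extensions, are UTF-8 encoded, and that each CSV file has a header row. None of these details is stated on this page, and `load_results` is a hypothetical helper.

```python
from __future__ import annotations

import csv
import json
from pathlib import Path


def load_results(output_dir: Path) -> dict[str, object]:
    """Load every .json and .csv file in output_dir, keyed by file name."""
    results: dict[str, object] = {}
    for path in sorted(output_dir.iterdir()):
        suffix = path.suffix.lower()
        if suffix == ".json":
            with path.open(encoding="utf-8") as handle:
                results[path.name] = json.load(handle)
        elif suffix == ".csv":
            # Assumption: the first CSV row holds the column names.
            with path.open(newline="", encoding="utf-8") as handle:
                results[path.name] = list(csv.DictReader(handle))
    return results
```

Files with other extensions are skipped, so the helper can point at a folder that also holds unrelated content.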

Data extraction using Microsoft Azure AI Document Intelligence

The following image shows the end-to-end data flow through different components for Microsoft Azure AI Document Intelligence:

Data flow diagram for Microsoft Azure AI Document Intelligence when using the Cloud Extraction Service feature

Stage 1: Uploading files and fetching configuration details

Data flow diagram showing file upload to the Control Room

Either the user uploads files to the Control Room, or a scheduler bot uploads the files from a shared location. The files are stored temporarily in the Control Room storage services.

Stage 2: Data extraction process

Data flow diagram of document extraction process using Microsoft Azure AI Document Intelligence services

The Control Room initiates the OCR and data extraction process using the Cloud Extraction Service. For Document Automation subscriptions, data extraction requests are sent directly to Microsoft Azure AI Document Intelligence services, and the responses are received directly from them. The data extraction results are then sent to the Control Room.

Stage 3: Downloading output

Data flow diagram of downloading output

The data extraction results are downloaded as CSV or JSON to the network path defined by the user. Customers typically create bots to upload this information to downstream applications or systems of record.