Cloud Extraction Service architecture
- Updated: 2025/11/17
The Cloud Extraction Service architecture gives a high-level overview of the document processing data flow: how documents are uploaded, how data is extracted from them, and how the results are downloaded, along with the components and third-party services involved in each phase.
- Bring your own key (BYOK) is not supported in Cloud Extraction Service.
- For information about how documents are stored in Document Automation, see Document Automation security FAQ.
The following image shows an overview of the different components used in Cloud Extraction Service:

- Customer network: The customer network is where the data extraction process is initiated.
  - Input: The documents that need to be processed for data extraction are uploaded.
  - Upload bot: Uploads the documents to the Control Room Cloud storage services.
  - Download bot: Downloads the information extracted from the documents.
  - Output: Stores the extracted information.
- Automation Anywhere Cloud
  - Control Room (Cloud only): Orchestrates the data extraction process by acknowledging requests from the Bot Runner device.
- Third-party Cloud services
  - Google Vision OCR: Converts documents into a machine-readable format. The OCR processing takes place on Google Cloud.
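The grouping above can be written down as a small lookup table, which is handy when reasoning about which components run outside the customer network. This is an illustrative sketch only: `COMPONENTS_BY_ZONE`, `zone_of`, and `runs_outside_customer_network` are hypothetical names and not part of any Automation Anywhere API.

```python
# Hypothetical model of the component list above; the zone and component
# names are copied from this page, everything else is an assumption.
COMPONENTS_BY_ZONE = {
    "Customer network": ["Input", "Upload bot", "Download bot", "Output"],
    "Automation Anywhere Cloud": ["Control Room"],
    "Third-party Cloud services": ["Google Vision OCR"],
}


def zone_of(component: str) -> str:
    """Return the zone that hosts the given component."""
    for zone, components in COMPONENTS_BY_ZONE.items():
        if component in components:
            return zone
    raise KeyError(f"unknown component: {component}")


def runs_outside_customer_network(component: str) -> bool:
    """True when the component is hosted outside the customer network."""
    return zone_of(component) != "Customer network"
```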
Data extraction using generative AI providers
The following image shows the end-to-end data flow through different components for generative AI providers:
The following sections represent different stages of data flow through different components when using generative AI providers:
- Stage 1: Uploading files to the Control Room
The user uploads files to the Control Room, or a scheduler bot uploads the files from a shared location. The files are stored temporarily in the Control Room storage services.
- Stage 2: Document extraction process
The Control Room initiates the data extraction process using either the Automation Anywhere pre-trained models or third-party Cloud extraction services.
- Automation Anywhere pre-trained models: Data extraction is performed by Cloud Extraction Service.
- Document Automation subscriptions: Data extraction requests are sent to, and responses are received from, the third-party Cloud extraction services through the Automation Anywhere proxy gateway.
- Stage 3: Downloading output
The data extraction results are downloaded as CSV or JSON to the network path defined by the user. Customers typically create bots to upload this information to downstream applications or systems of record.
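The three stages above can be sketched as a small pipeline. This is a simplified model under stated assumptions, not the Control Room implementation: `ExtractionRoute`, `Document`, `choose_route`, and `run_pipeline` are hypothetical names, and each stage is reduced to a log entry so that only the routing decision in Stage 2 is shown.

```python
from dataclasses import dataclass, field
from enum import Enum


class ExtractionRoute(Enum):
    """Where Stage 2 sends a document (labels are illustrative)."""
    PRETRAINED_MODELS = "Cloud Extraction Service"
    THIRD_PARTY_VIA_PROXY = "third-party service via proxy gateway"


@dataclass
class Document:
    name: str
    stages: list[str] = field(default_factory=list)  # trace of completed stages


def choose_route(uses_pretrained_models: bool) -> ExtractionRoute:
    """Pick the Stage 2 route described in the list above."""
    if uses_pretrained_models:
        return ExtractionRoute.PRETRAINED_MODELS
    return ExtractionRoute.THIRD_PARTY_VIA_PROXY


def run_pipeline(doc: Document, uses_pretrained_models: bool) -> Document:
    # Stage 1: the file is placed in Control Room storage.
    doc.stages.append("uploaded to Control Room storage")
    # Stage 2: the Control Room starts extraction on the chosen route.
    route = choose_route(uses_pretrained_models)
    doc.stages.append(f"extracted via {route.value}")
    # Stage 3: results are written to the user-defined network path.
    doc.stages.append("results downloaded to user-defined path")
    return doc
```

For example, `run_pipeline(Document("invoice.pdf"), uses_pretrained_models=False)` records an extraction step that goes through the proxy gateway.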
Data extraction using Microsoft Azure AI Document Intelligence
The following image shows the end-to-end data flow through different components for Microsoft Azure AI Document Intelligence:
- Stage 1: Uploading files and fetching configuration details
The user uploads files to the Control Room, or a scheduler bot uploads the files from a shared location. The files are stored temporarily in the Control Room storage services.
- Stage 2: Data extraction process
The Control Room initiates the OCR and data extraction process using the Cloud Extraction Service. For Document Automation subscriptions, data extraction requests are sent directly to, and responses are received directly from, Microsoft Azure AI Document Intelligence services. The data extraction results are then sent to the Control Room.
- Stage 3: Downloading output
The data extraction results are downloaded as CSV or JSON to the network path defined by the user. Customers typically create bots to upload this information to downstream applications or systems of record.
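As a rough illustration of the output step, the following sketch writes one set of extracted fields to a chosen folder as either CSV or JSON. It uses only the Python standard library. The function name `write_results`, the flat field-to-value layout, and the sample field names are assumptions made for this example; the actual output schema produced by Document Automation is not described on this page.

```python
import csv
import json
from pathlib import Path


def write_results(fields: dict[str, str], out_dir: Path, stem: str,
                  fmt: str = "json") -> Path:
    """Write extracted fields to out_dir as <stem>.json or <stem>.csv."""
    out_dir.mkdir(parents=True, exist_ok=True)
    if fmt == "json":
        path = out_dir / f"{stem}.json"
        path.write_text(json.dumps(fields, indent=2), encoding="utf-8")
    elif fmt == "csv":
        path = out_dir / f"{stem}.csv"
        # newline="" lets the csv module control line endings itself.
        with path.open("w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=list(fields))
            writer.writeheader()   # one header row with the field names
            writer.writerow(fields)  # one data row with the extracted values
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return path
```

A downstream bot could then read the written file from the same path and push the values to the target application.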