Architecture and security guidelines for Document Automation with Generative AI capability
- Updated: 2024/10/16
The following questions and answers cover privacy, security, and typical use cases to consider when using Document Automation with Generative AI capability.
Functionality
- How does Document Automation process unstructured documents and shipment documents differently from invoices?
- Automation Anywhere incorporates individually modeled large language models (LLMs) in the Document Automation product to assist with processing unstructured documents and shipment documents.
- What document types can be processed using Generative AI?
- Any unstructured or semi-structured document, including pre-trained document types such as invoices, bills of lading, waybills, arrival notices, and packing lists.
- Are there any limitations on the field types supported by Document Automation?
- No. Both form fields and table fields are supported with Generative AI capability.
- What are the supported languages?
- English is the only officially supported language. Other languages may also work, but they are not officially supported.
- What is the pricing structure for Document Automation with Generative AI capabilities?
- Automation Anywhere charges per page for Document Automation. The OpenAI and Anthropic costs are included in the price.
- Could a customer call their own endpoint for LLMs?
- Yes. Bring your own license (BYOL) is supported for Microsoft Azure OpenAI and Anthropic Claude, and customer-defined LLM endpoints are supported.
- Is the Generative AI with Document Automation feature available On-Premises in a customer’s private cloud?
- Yes, integrated generative AI is now available for both the On-Premises and Cloud versions of Document Automation.
- What OCR engine can be used for unstructured document types?
- Currently we support Google Vision OCR and ABBYY OCR.
- What OCR engine can be used for shipment documents?
- We recommend using ABBYY OCR and Google Vision OCR for shipment documents.
Security
For security FAQs, see Document Automation security FAQ.
Architecture diagram
- Data extraction using ABBYY OCR and generative AI providers
The extraction process consists of several steps and is applicable to Cloud and On-Premises deployments:
- The extraction process starts by running the Document Extraction package on a Bot Runner device. The configuration of the extraction process is defined in a learning instance.
- Documents are processed using ABBYY OCR, which is deployed on the Bot Runner device.
- To process large documents, the package creates embeddings for different chunks of the document.
- The package uses the embeddings to identify the chunk of the document that is most relevant to the provided search query, and sends that chunk along with a prompt to the model through the proxy gateway.
- Finally, the package receives responses from the model and converts them into document extraction results.
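The chunking and retrieval steps above can be illustrated with a minimal Python sketch. This is not the Document Extraction package's actual implementation: the function names, chunk sizes, and prompt wording are hypothetical, and a toy bag-of-words vector stands in for whatever embedding model the product really uses.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    words = text.split()
    step = chunk_size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, stop, step)]


def embed(text):
    """Toy embedding: term counts. A stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_relevant_chunk(document, query, chunk_size=40, overlap=10):
    """Return the chunk whose embedding is closest to the query embedding."""
    query_vec = embed(query)
    chunks = chunk_text(document, chunk_size, overlap)
    return max(chunks, key=lambda c: cosine(embed(c), query_vec))


def build_prompt(chunk, query):
    """Combine the selected chunk and the search query (hypothetical wording)."""
    return f"Field to extract: {query}\n\nDocument text:\n{chunk}"
```

Only the selected chunk and the prompt would be sent on to the model, which is why large documents can be processed without sending the full text in a single request.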
- Data extraction using Google Vision OCR and generative AI providers
The extraction process consists of several steps and is applicable to Cloud and On-Premises deployments:
- The extraction process starts by running the Document Extraction package on a Bot Runner device. The configuration of the extraction process is defined in a learning instance.
- The Document Extraction package sends documents to the proxy gateway.
- The proxy gateway forwards this request to the Google Vision API endpoint for OCR and sends the results back to the package.
- To process large documents, the package creates embeddings for different chunks of the document.
- The package uses the embeddings to identify the chunk of the document that is most relevant to the provided search query, and sends that chunk along with a prompt to the model through the proxy gateway.
- Finally, the package receives responses from the model and converts them into document extraction results.
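The final step, converting a model response into extraction results, might look roughly like the sketch below. The source does not describe the response format, so the JSON shape, the field names, and the `parse_model_response` helper are all assumptions made for illustration only.

```python
import json


def parse_model_response(response_text, expected_fields):
    """Map a model reply onto the expected fields.

    Assumes (hypothetically) that the model replies with a JSON object of
    field name -> value. Fields that are missing, or a reply that is not a
    JSON object, yield None values instead of raising an error.
    """
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {field: data.get(field) for field in expected_fields}
```

Returning `None` for missing fields, rather than raising, keeps a single malformed reply from stopping the rest of an extraction run; a real implementation may well handle this differently.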
For Automation Anywhere Cloud Generative AI data security information, see: Data security for generative AI - FAQ