Document Automation support for Google Custom Document Extractor (CDE)
- Updated: 2024/05/28
In Document Automation, you can create a user-trained learning instance and extract data using a Google Custom Document Extractor (CDE) processor.
This capability lets you train a model with Google CDE for any document type, covering 50 languages. Once the model is deployed, the processor URL can be embedded within the Document Automation extraction process.
To use Google CDE, you must have:
- A Google subscription to Google Document AI Workbench.
- The Document AI Editor role assigned to you for creating processors, and a service account created on Google Cloud Platform. See Create service accounts and IAM roles for Document AI.
- A license for .
Note: When working with the API URL trusted list for Google CDE, you must add all APIs to the trusted list on the Bot Agent machine. The list of allowed APIs for Google CDE is as follows:
- Google accounts
- Google OAuth
- Google APIs
- Processor endpoint (add only the host to the trusted list). For example, for the following processor URL, the host is eu-documentai.googleapis.com:
https://eu-documentai.googleapis.com/v1/projects/<<Project ID>>/locations/eu/processors/<<Processor ID>>:process
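Because only the host of the processor endpoint goes on the trusted list, it can help to extract it from the full processor URL programmatically. The following is a minimal sketch using Python's standard library; the function name `trusted_list_host` and the project and processor IDs in the usage line are hypothetical placeholders, not part of Document Automation or Google Document AI.

```python
from urllib.parse import urlsplit


def trusted_list_host(processor_url: str) -> str:
    """Return only the host portion of a processor endpoint URL.

    Hypothetical helper for illustration: it does not contact Google
    or validate that the processor exists.
    """
    host = urlsplit(processor_url).hostname
    if host is None:
        raise ValueError(f"No host found in URL: {processor_url!r}")
    return host


# Usage with made-up project and processor IDs (assumptions for the example).
url = (
    "https://eu-documentai.googleapis.com/v1/projects/my-project"
    "/locations/eu/processors/abc123:process"
)
print(trusted_list_host(url))  # prints: eu-documentai.googleapis.com
```

The path, including the `:process` suffix, is discarded, so the same host entry covers every processor hosted in that region.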
Usage of Google CDE
The effort involved in creating and maintaining models with Google CDE is justified by various scenarios, including:
- Extended language support: When working with documents that require support for additional languages, and existing pre-trained models do not offer that capability, Google CDE becomes essential. For supported languages, see Language support for Google CDE.
- Unsupported document formats: Google CDE is beneficial when dealing with document types that lack compatible parsers.
- Addressing accuracy and performance challenges: For some document formats, achieving the desired accuracy can be difficult even with pre-trained models. Google CDE, trained specifically on those documents, can provide better accuracy.
- Custom or non-standard field extraction: Google CDE can be used in scenarios where specific fields need to be extracted from documents that have custom or non-standard formats.
- Extraction based on specific training when labels do not exist: Google CDE is beneficial when there is a need to extract information from fields where pre-defined labels do not exist.