Data extraction in Document Automation

  • Updated: 2022/06/23
    • Automation 360 v.x
    • IQ Bot
    • Digitize

Understand how documents are processed in Document Automation.

Improving extraction accuracy through validation

When creating a learning instance, the user can enable feedback from validation, which sends the changes that users make in the Validator back to the learning instance. In Automation 360 IQ Bot, a learning instance required extensive training and testing before it could run autonomously in production mode. In Document Automation, by contrast, learning instances running in production mode continuously "learn" whenever a user resizes or relocates the extraction region in the Validator.
Note: This feature is available only for Automation Anywhere models.

The following graphic provides a visual overview of the process by which learning instances continuously receive feedback from validation:

Process of "teaching" learning instances through validation feedback

  1. An uploaded document passes through the extraction engine.
  2. If the learning instance successfully extracts the data, the document is added to the straight-through processing (STP) count and the extracted values are downloaded to a file in the Success folder.

    If the learning instance cannot extract the data, the system evaluates whether the document contains an unfamiliar layout.

  3. If the learning instance does not recognize the document layout (new layout), the document is sent for manual validation where the user "teaches" the learning instance how to extract the data by setting the extraction region.
  4. The extracted values are downloaded to a file in the Success folder and the changes are collected in a feedback file, which is sent to the feedback database.
    • Feedback is collected only when the user changes the extraction region. If the user manually inputs text, the system does not collect feedback.
    • The feedback file contains only field-location data, which is used to improve extraction accuracy for subsequent documents.

    If the learning instance recognizes the layout, it retrieves previous feedback for that layout from the feedback database and uses it to extract the data.
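
The routing logic in the steps above can be sketched in a few lines of Python. This is an illustrative model only, not Document Automation code: `FeedbackStore`, `route_document`, `record_validation`, and the layout identifiers are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# (x, y, width, height) of an extraction region; the shape is an assumption.
Region = Tuple[float, float, float, float]


@dataclass
class FeedbackStore:
    """Hypothetical stand-in for the feedback database.

    Maps a layout identifier to the field regions learned from validation.
    """
    regions: Dict[str, Dict[str, Region]] = field(default_factory=dict)


def route_document(layout_id: str, extracted_ok: bool, store: FeedbackStore) -> str:
    """Decide what happens to a document after the extraction engine runs."""
    if extracted_ok:
        # Step 2: counted as straight-through processing (STP).
        return "stp"
    if layout_id in store.regions:
        # Known layout: reuse previously collected feedback to extract.
        return "reuse_feedback"
    # Step 3: unfamiliar layout, so a user must validate manually.
    return "manual_validation"


def record_validation(store: FeedbackStore, layout_id: str,
                      changed_regions: Dict[str, Region]) -> bool:
    """Step 4: store feedback only if the user changed an extraction region.

    Manually typed values are never passed here, mirroring the rule that
    typed text does not produce feedback. Only field locations are stored.
    """
    if not changed_regions:
        return False
    store.regions.setdefault(layout_id, {}).update(changed_regions)
    return True


store = FeedbackStore()
first = route_document("layout-A", extracted_ok=False, store=store)
record_validation(store, "layout-A", {"invoice_total": (0.7, 0.9, 0.2, 0.03)})
second = route_document("layout-A", extracted_ok=False, store=store)
print(first, second)
```

In this sketch the first failed document with an unseen layout goes to manual validation, and a later failed document with the same layout is routed to reuse the stored region.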

How Document Automation identifies new layouts

Document Automation extraction is based on object detection. During document processing, the extraction engine identifies objects, that is, key-value pairs consisting of a field and its associated value. The engine creates a "fingerprint" of the document, which stores the sequence of the objects and each object's location in the document.

When a document is processed, if the engine recognizes the keys and their locations, the document is classified and extracted based on that existing fingerprint. Otherwise, the engine saves a new fingerprint of the keys and their locations.

Process by which the engine either recognizes the existing fingerprint in a document or creates a new fingerprint
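
As a rough illustration of the recognize-or-create behavior described above, the following Python sketch models a fingerprint as the ordered sequence of keys with their positions. The representation, the position tolerance `tol`, and the function names are assumptions made for the example; the source does not describe how the engine actually compares fingerprints.

```python
from typing import List, Tuple

# One detected object: (key text, x position, y position), with positions
# normalized to the page size. This shape is an assumption.
DetectedObject = Tuple[str, float, float]
Fingerprint = Tuple[DetectedObject, ...]


def make_fingerprint(objects: List[DetectedObject]) -> Fingerprint:
    """Keep the sequence of keys and each key's location."""
    return tuple(objects)


def matches(a: Fingerprint, b: Fingerprint, tol: float = 0.05) -> bool:
    """Same keys in the same order, each within `tol` of the known position.

    The tolerance-based comparison is a hypothetical matching rule.
    """
    if len(a) != len(b):
        return False
    return all(
        ka == kb and abs(xa - xb) <= tol and abs(ya - yb) <= tol
        for (ka, xa, ya), (kb, xb, yb) in zip(a, b)
    )


def classify(fp: Fingerprint, known: List[Fingerprint],
             tol: float = 0.05) -> Tuple[int, bool]:
    """Return (layout index, is_new). Saves the fingerprint if unrecognized."""
    for index, existing in enumerate(known):
        if matches(fp, existing, tol):
            return index, False
    known.append(fp)
    return len(known) - 1, True


known_layouts: List[Fingerprint] = []
doc_a = make_fingerprint([("Invoice No", 0.10, 0.05), ("Total", 0.70, 0.90)])
doc_b = make_fingerprint([("Invoice No", 0.11, 0.05), ("Total", 0.70, 0.91)])
doc_c = make_fingerprint([("Total", 0.10, 0.05), ("Invoice No", 0.70, 0.90)])

result_a = classify(doc_a, known_layouts)  # first document: new fingerprint
result_b = classify(doc_b, known_layouts)  # slightly shifted: same layout
result_c = classify(doc_c, known_layouts)  # different key order: new layout
```

Here `doc_b` is treated as the same layout as `doc_a` because its keys appear in the same order at nearly the same positions, while `doc_c` is saved as a new fingerprint.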
