Process documents in Community Edition

Upload sample invoices to train the learning instance, verify the extracted data, and fix validation errors.

Prerequisites

Upload sample invoices to a learning instance to test its data extraction capabilities. If a document requires manual validation, the system sends it to the Validator, where you must manually enter the correct data.

Procedure

  1. Upload documents to the learning instance:
    Note: Community Edition can process a maximum of five documents at a time. You must wait for a document to finish processing (and validation, if necessary) before you can upload another document.
    1. Click Process documents.
    2. In the Process Documents window, click Browse to select the files to upload.
    3. In the Download data to field, enter the path where the extracted data is saved as a CSV file.
    4. Click Process documents.
      The Bot Runner window appears. The window disappears when the documents are done processing. Refresh the Learning instances table to see the updated metrics.
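
If you have more sample invoices than the five-document limit allows, it can help to plan the uploads in groups before you start. The following is a minimal sketch, not part of the product: the function names, the folder argument, and the `*.pdf` pattern are assumptions for illustration. It only groups local file paths. You still upload each group through the Process Documents window.

```python
from pathlib import Path
from typing import Iterator, List

# Limit taken from the note above: at most five documents at a time.
MAX_DOCUMENTS = 5


def batches(files: List[Path], size: int = MAX_DOCUMENTS) -> Iterator[List[Path]]:
    """Yield consecutive groups of at most `size` files."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(files), size):
        yield files[start:start + size]


def plan_uploads(folder: str, pattern: str = "*.pdf") -> List[List[Path]]:
    """Group the matching files in `folder` (hypothetical path) into upload batches."""
    return list(batches(sorted(Path(folder).glob(pattern))))
```

For example, a folder of twelve matching files is split into groups of five, five, and two.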

If a value appears next to the Validate documents link, you must manually validate the fields with errors. Otherwise, skip to the Verify the output results step to review the extracted data.

  1. Fix the validation errors:
    1. Click Validate documents.
      The Automation Co-Pilot Task Manager opens in a new tab, with the first failed document in queue. For an introduction to the Validator user interface, see Using the Automation Co-Pilot Task Manager Validator for Document Automation.
    2. Review each field to verify the data type and extracted value.
      Document Automation supports the following data types: text, number, date, address, and check box.
      Alternatively, from the drop-down list on the right panel, you can select Show fields that need validation.
      Note: If you edit the learning instance while documents are awaiting validation, click Reprocess to reattempt extraction.

      Reprocessing documents does not affect the uploaded documents metric.

    3. Update the fields with errors.
      Click the field or draw a box around the values that you want to extract.
      For Automation Anywhere pre-trained models, you can configure the learning instance to extract specific values in a field and ignore others. For more information, see Use validation feedback to extract specific values in a table.
      • To skip a document without correcting errors, click Skip to proceed to the next document in the validation queue.
      • To remove a document that cannot be processed, click Mark as Invalid.
    4. After you make the necessary corrections, click Submit so that the document can finish processing.
      The next document in queue appears. When all the documents are corrected, the system displays a message stating that no more tasks are available.
    5. Close the tab to return to the Learning Instances page.
  2. Verify the output results:
    1. Open the file in the Success folder that contains the extracted data and review the results to ensure that they match your use case.
      The Microsoft engine returns the extracted values (OCR data) in JSON format, in a file such as GUID_0-MSFormTableResult.json. Along with the extracted document data in the <<GUID>>_FileName CSV file, the Success folder also contains the extracted table data in separate CSV files, one for each table in the document. For example, <<GUID_PAGE_NUMBER-Table_FILENAME_PAGENUMBER_TABLENUMBER.

      Because the table data is stored separately, you can compare the extracted data with the Microsoft engine data in the GUID_0-MSFormTableResult.json file.

    2. Optional: Review the Learning Instance dashboard.
      The dashboard displays the total number of uploaded documents and the number of documents pending validation.
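
If the Success folder contains many files, the review of the output in this step can be partly scripted. The following is a minimal sketch under stated assumptions: the function name and folder argument are hypothetical, the CSV files are assumed to be UTF-8 encoded with a header row, and nothing is assumed about the internal structure of the JSON file. It lists each CSV file with its column names and row count, and reports the top-level type of each JSON file.

```python
import csv
import json
from pathlib import Path
from typing import Dict


def summarize_success_folder(success_dir: str) -> Dict[str, dict]:
    """Summarize every CSV and JSON file found directly in `success_dir`."""
    summary: Dict[str, dict] = {}
    for path in sorted(Path(success_dir).iterdir()):
        suffix = path.suffix.lower()
        if suffix == ".csv":
            # utf-8-sig also reads plain UTF-8 and drops a byte order mark if present.
            with path.open(newline="", encoding="utf-8-sig") as handle:
                rows = list(csv.reader(handle))
            summary[path.name] = {
                "kind": "csv",
                # Assumption: the first row holds the column names.
                "columns": rows[0] if rows else [],
                "data_rows": max(len(rows) - 1, 0),
            }
        elif suffix == ".json":
            with path.open(encoding="utf-8") as handle:
                payload = json.load(handle)
            summary[path.name] = {
                "kind": "json",
                "top_level_type": type(payload).__name__,
            }
    return summary
```

Calling `summarize_success_folder` with the path you entered in the Download data to field, extended to the Success folder, returns a dictionary keyed by file name that you can print or compare against the fields you expect.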

Next steps

Congrats! You have now successfully processed your first documents in the Community Edition version of Document Automation.