Document Extraction package updates
- Updated: 2024/04/18
Review the updates in released versions of the Document Extraction package, such as new and enhanced features, fixes, and limitations. The page also lists the release date of each version and the compatible Control Room and Bot Agent versions.
Versions summary
The following table lists the versions of the Document Extraction package released either with an Automation 360 release or as a package-only release, in descending order of release date. Click the version link for information about updates in that package version.
Version | Release date | Release type | Bot Agent version | Control Room build |
---|---|---|---|---|
3.32.26 | 18 April 2024 | Package-only; post Automation 360 v.32 release | 21.252 or later | 19223 or later |
3.32.23 | 5 April 2024 | With Automation 360 v.32 (On-Premises) release | 21.252 or later | 19223 or later |
3.32.22 | 21 March 2024 | With Automation 360 v.32 (Sandbox) release | 21.252 or later | 19223 or later |
3.31.22 | 26 January 2024 | Package-only; post Automation 360 v.31 release | 21.252 or later | 19223 or later |
3.31.17 | 22 December 2023 | Package-only; post Automation 360 v.31 (Sandbox) release | 21.252 or later | 19223 or later |
3.31.16 | 6 December 2023 | With Automation 360 v.31 (Sandbox) release | 21.252 or later | 19223 or later |
3.31.15 | 28 November 2023 | With Automation 360 v.30 release | 21.252 or later | 19223 or later |
3.31.13 | 16 November 2023 | Package-only; post Automation 360 v.30 release | 21.252 or later | 19223 or later |
3.30.24 | 21 September 2023 | Package-only; post Automation 360 v.30 (Sandbox) release | 21.252 or later | 19223 or later |
3.30.22 | 6 September 2023 | With Automation 360 v.30 (Sandbox) release | 21.252 or later | 19223 or later |
3.30.21 | 21 August 2023 | Package-only; post Automation 360 v.29 | 21.98 or later | 15345 or later |
3.30.19 | 16 August 2023 | Package-only; post Automation 360 v.29 | 21.98 or later | 15345 or later |
3.29.17 | 17 July 2023 | Package-only; post Automation 360 v.29 release | 21.98 or later | 15345 or later |
3.29.14 | 6 June 2023 | With Automation 360 v.29 (Sandbox) release | 21.98 or later | 15345 or later |
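The minimum versions in the table can be checked programmatically. The following is a minimal sketch, assuming that Bot Agent versions compare component by component as integers (so 21.252 is later than 21.98, which matches the ordering in the table). The names `MIN_REQUIREMENTS`, `parse_version`, and `is_compatible` are hypothetical helpers for illustration, and only two rows of the table are included.

```python
from __future__ import annotations

# Hypothetical lookup built from two rows of the table above:
# package version -> (minimum Bot Agent version, minimum Control Room build)
MIN_REQUIREMENTS = {
    "3.32.26": ("21.252", 19223),
    "3.30.21": ("21.98", 15345),
}


def parse_version(text: str) -> tuple[int, ...]:
    """Split a dotted version into integers so that 21.252 sorts after 21.98."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(package_version: str, bot_agent_version: str, control_room_build: int) -> bool:
    """Return True when both the Bot Agent and the Control Room meet the minimums."""
    min_agent, min_build = MIN_REQUIREMENTS[package_version]
    agent_ok = parse_version(bot_agent_version) >= parse_version(min_agent)
    build_ok = control_room_build >= min_build
    return agent_ok and build_ok
```

Comparing the versions as plain strings or as floating-point numbers would rank 21.98 above 21.252, which is why the sketch converts each component to an integer first.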
- To download an individual package (updated in an Automation 360 release where you want only the package), use this URL:
https://aai-artifacts.my.automationanywhere.digital/packages/<package-file-name>-<version.number>.jar
- For the Document Extraction package, the naming convention is:
bot-command-iqbot-extraction360-<version-number>-full.jar
For example,
bot-command-iqbot-extraction360-3.31.22-full.jar
For detailed steps on downloading a package and manually adding it to the Control Room, see Add packages to the Control Room.
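The URL pattern and naming convention above can be combined in a short script. This is a minimal sketch, assuming that the file name from the naming convention is appended directly to the packages path and that version numbers have three numeric components. The function names are hypothetical and not part of any Automation 360 tooling.

```python
import re

# Base path taken from the download URL pattern above.
BASE_URL = "https://aai-artifacts.my.automationanywhere.digital/packages"


def document_extraction_jar_name(version: str) -> str:
    """Build the file name from the documented naming convention."""
    # Assumption: versions look like 3.31.22 (three dot-separated numbers).
    if not re.fullmatch(r"\d+\.\d+\.\d+", version):
        raise ValueError(f"unexpected version format: {version!r}")
    return f"bot-command-iqbot-extraction360-{version}-full.jar"


def package_download_url(version: str) -> str:
    """Join the base path and the file name into a download URL."""
    return f"{BASE_URL}/{document_extraction_jar_name(version)}"


if __name__ == "__main__":
    print(package_download_url("3.31.22"))
```

The script only constructs the URL. Downloading the file and adding it to the Control Room still follow the documented manual steps.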
3.32.26
- Compatible Bot Agent version: 21.252 or later
- Compatible Control Room version: 19223 or later
Fixes |
---|
When you process a document with Google Document AI, the extraction bot now executes successfully for the Portuguese language and sends the document to straight-through processing (STP) or the validator. |
When you process a document with handwriting or signature objects, these objects are now included in the final output JSON file. Previously, because of the high confidence threshold set for signatures, handwriting or signature objects were not included in the final output JSON file. |
When you process a document using Google Custom Document Extractor (CDE) with a bring your own key (BYOK) setup and the corresponding processor uses a foundational model, document processing no longer fails due to a transformational failure. |
With an improved table structure model, specifically for column detection in complex tables, you can now get more accurate extraction results. Service Cloud Case ID: 02110860 |
For learning instances bridged from IQ Bot to Document Automation, when validation feedback is enabled and applied and the user processes the next document, the data from all the pages is now extracted successfully without any merged rows. |
3.32.23
- Compatible Bot Agent version: 21.252 or later
- Compatible Control Room version: 19223 or later
Fixes |
---|
Fixed the vulnerabilities reported in the security scan. |
3.32.22
- Compatible Bot Agent version: 21.252 or later
- Compatible Control Room version: 19223 or later
Fixes |
---|
With an improved document table detection model that adds an End of Table indicator, you can now extract table data from all the pages for the selected language. This reduces missing-table and last-row extraction issues. Service Cloud Case ID: 02065073 |
With improved table extraction, unstructured tables no longer show junk values, and the table data is now extracted successfully. |
Users can now save the validation feedback in their Document Automation environment when the proxy is enabled on the Bot Agent machine. Service Cloud Case ID: 02092484 |
With Google Vision OCR and proxy enabled, document extraction no longer fails for unstructured documents and no longer shows an error message. Service Cloud Case ID: 02104409 |
3.31.22
- Compatible Bot Agent version: 21.252 or later
- Compatible Control Room version: 19223 or later
Fixes |
---|
After you add validation feedback to the learning instance, document extraction no longer fails with an error message. Previously, document extraction failed when the validation check box was selected. |
After you add validation feedback to the learning instance, the feedback is saved for all the tables across all the pages in the document, and data is extracted correctly from all the pages. Previously, the feedback was not saved for all the pages. Service Cloud Case ID: 01995135, 02093575, 02093389 |
After you add validation feedback, if the table IDs match, data from all the tables on every page is now extracted and shown in the validator. Previously, in such cases, some pages were skipped and data from some pages was not shown in the validator. |
When you apply the advanced training settings and swap columns, data is now extracted correctly into separate columns, provided that all the column values are mapped correctly. You can either re-map all column cells or remove all other incorrect cell rows while keeping the first two rows intact. The column must contain no incorrect cells, and all column cells must have the correct values. Previously, in such cases, the data from two columns was extracted into a single column. |
You can now extract the table field values in the correct order, and the multi-row extraction issue no longer persists. Also, when a table has only one row, you can use the End of table indicator feature to extract multi-line data after applying feedback. Note: For single-row tables, the best practice is to use the End of table indicator feature. Otherwise, extraction might be partial in specific scenarios. Service Cloud Case ID: 02091013 |
After a document is trained, when a user processes the same document with Google Vision OCR, the feedback is saved and the required data is extracted. Previously, in such cases, you could not process a specific type of document and had to validate the document manually each time. Service Cloud Case ID: 02098682 |
3.31.17
- Compatible Bot Agent version: 21.252 or later
- Compatible Control Room version: 19223 or later
Fixes |
---|
With Google Vision OCR, you can now process documents successfully without a Google Document AI license, and no error message is generated. Previously, a Google Document AI license was requested to process the documents and an error was generated during extraction. As a result, you could not extract documents with Google Vision OCR. Service Cloud Case ID: 02097428, 02096992, 02097798, 02097157, 02098378, 02098563, 02094573 |
3.31.16
- Compatible Bot Agent version: 21.252 or later
- Compatible Control Room version: 19223 or later
Fixes |
---|
When users create a learning instance with Google Document AI (BYOK) and an authenticated proxy, document extraction no longer fails for documents with more than 10 pages. Previously, in such cases, extraction failed with an error message and users were not able to process the documents. |
3.31.15
- Compatible Bot Agent version: 21.252 or later
- Compatible Control Room version: 19223 or later
Fixes |
---|
If Document rules contain multiple conditions using the AND operator with (or without) a group, an appropriate error message is now displayed. Also, the corresponding action is now applied on the fields. |
3.31.13
- Compatible Bot Agent version: 21.252 or later
- Compatible Control Room version: 19223 or later
What's Changed |
---|
With improved extraction of unstructured
documents in Document Automation, you can:
|
Fixes |
---|
With improved table extraction using the ABBYY OCR engine, heuristic feedback now works properly. Service Cloud Case ID: 01995901 |
When a user extracts table data from a PDF file in which a table spans multiple pages, the data from all the pages is extracted successfully after applying the heuristic feedback. Previously, users could not extract data from the second page of such a PDF file. Service Cloud Case ID: 01996536 |
When extraction starts from the first page for all the fields, heuristic feedback now works properly for capturing multi-line table data and generates the correct output. Previously, the multi-line table data was not extracted even after heuristic feedback was provided. As a result, the output was not generated properly. Service Cloud Case ID: 01944805, 01946809, 01952836, 01957090, 01975800, 01981088 |
For Microsoft Standard Forms, the table extraction no longer fails when cells are empty and users can extract the document successfully. |
When a user imports a learning instance and processes the documents, the extracted document shows the correct order of words for dates on all the pages. |
When a user imports a learning instance and processes the documents, all the values are displayed in the table after extraction. Previously, in such cases, the system-identified region (SIR) was highlighted but an empty value was shown in the table. |
When a user imports a .dw file with heuristic feedback and processes a document that contains a negative (-) value in the last row, the document is extracted correctly without skipping the negative value in the last row. Previously, in such cases, the last row was skipped, resulting in either data loss or incorrect processing. |
When a user processes a document that contains a table, the extraction finishes successfully without the DOCUMENT_PARTIALLY_FAILED or Extraction Timeout error message. Previously, in such cases, some documents were not extracted because multiple detections of the same table caused a table size (max () arg) issue. |
When a user imports a learning instance and processes the documents, all the rows are extracted separately from all pages. Previously, rows from the second page were merged into one row. |
Limitations |
---|
When a user uses Google Vision OCR, table detection or extraction does not work. Workaround: Use the ABBYY OCR engine. Service Cloud Case ID: 01995901 |
In specific cases where tables span multiple pages without headers on all the pages (headerless pages), users might observe that the data is not extracted from all the pages after applying the feedback. |
3.30.24
- Compatible Bot Agent version: 21.252 or later
- Compatible Control Room version: 19223 or later
Fixes |
---|
Users can now view the extracted data from the second row correctly by using heuristic feedback. |
For the Purchase Order document type, you can now extract the table field values correctly from all the pages. |
The generated feedback file no longer shows any error message and users can process documents successfully. |
3.30.22
- Compatible Bot Agent version: 21.252 or later
- Compatible Control Room version: 19223 or later
What's New |
---|
Document Automation provides improved extraction through the new Get document data and Update document data actions. You can use these actions to apply custom logic for data manipulation and validation to maximize straight-through processing (STP) and reduce manual verification efforts. |
3.30.21
- Compatible Bot Agent version: 21.98 or later
- Compatible Control Room version: 15345 or later
Fixes |
---|
This Document Extraction package release is a patch to fix the '501: DOCUMENT_PARTIALLY_FAILED' error that occurred while processing some documents. |
3.30.19
- Compatible Bot Agent version: 21.98 or later
- Compatible Control Room version: 15345 or later
Fixes |
---|
The Document Extraction package provides
improved extraction capability for complex table header columns.
Follow these steps to enable improved table header data
extraction:
|
3.29.17
- Compatible Bot Agent version: 21.98 or later
- Compatible Control Room version: 15345 or later
Fixes |
---|
The Document Extraction package has extraction improvement fixes for both form and table fields. |
3.29.14
- Compatible Bot Agent version: 21.98 or later
- Compatible Control Room version: 15345 or later
What's New |
---|
Document Automation provides improved extraction through heuristic feedback with a focus on complex scenarios, such as multitables. Additionally, there are extraction improvements for both form fields and out-of-the-box performance (specifically for table fields). |