Create a learning instance
- Updated: 2024/07/31
Create a learning instance and upload sample documents for training. In this step, you define the data elements for a single document type, such as an invoice or a purchase order, and the fields that you want to extract.
Prerequisites
- Each document is a separate file. For example, if you have downloaded an email and its attachments into a single PDF, you must separate the email body from the attachments. See Using the Split document action.
- The documents are in one of the following supported file types: JPG, JPEG, PNG, TIFF.
- Use documents with a resolution value of at least 300 dots per inch (dpi).
- In staging, you can upload a maximum of 150 documents per learning instance, each with a file size of up to 10 MB.
- In production, you can upload documents with a file size of up to 50 MB each. However, the maximum number of documents allowed per learning instance depends on the license.
- There is no limit on the number of pages per document with the PDFBox OCR.
- You can upload up to 60 pages per document with an image-based OCR.
- You can upload files of up to 12 MB in size. You can upload additional documents after creating the learning instance.
- The file names of the documents that you upload should not start with special characters, such as the hyphen (-).
- If the text that you want to extract starts with any of the following special characters, IQ Bot ignores those characters when capturing the text: ‘# : , \ ` ''
- The Tesseract4 OCR currently has a known limitation that restricts each document to fewer than 60 pages.
- Azure confidential computing enables organizations to upload encrypted data to secured storage, such as private folders on a virtual machine. If you upload documents from such secured folders to IQ Bot, they are moved to the Unclassified status because data extraction is not supported for such documents.
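Several of the prerequisites above can be checked locally before you upload anything. The following Python sketch is one possible way to do that and is not part of IQ Bot: the function names check_document and check_batch are hypothetical, the limits are the staging values listed above, and the script assumes that 1 MB means 1024 * 1024 bytes and that a "special character" is any character that is not a letter or a digit. The set of extensions mirrors the supported file types listed above; adjust it if your environment accepts other types.

```python
from pathlib import PurePath

# Values copied from the prerequisites above (staging environment).
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tiff"}
MAX_STAGING_FILE_BYTES = 10 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
MAX_STAGING_DOCUMENTS = 150


def check_document(file_name: str, size_bytes: int) -> list[str]:
    """Return a list of prerequisite problems for one candidate document."""
    problems = []
    name = PurePath(file_name).name
    suffix = PurePath(name).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_STAGING_FILE_BYTES:
        problems.append("file is larger than 10 MB")
    # Assumption: anything other than a letter or digit counts as "special".
    if name and not name[0].isalnum():
        problems.append("file name starts with a special character")
    return problems


def check_batch(documents: dict[str, int]) -> dict[str, list[str]]:
    """Check a mapping of file name to size in bytes; report only failures."""
    report = {name: check_document(name, size) for name, size in documents.items()}
    report = {name: issues for name, issues in report.items() if issues}
    if len(documents) > MAX_STAGING_DOCUMENTS:
        report["(batch)"] = [f"more than {MAX_STAGING_DOCUMENTS} documents"]
    return report
```

For example, check_batch({"invoice_001.png": 204800, "-invoice_002.png": 204800}) reports only the second file, because its name starts with a hyphen. Resolution and page-count checks are left out because they need an image library.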
When you start with a collection of documents to insert into a digital process, you will probably have a mix of document types, formats, and orientations. An invoice, for example, has a consistent set of data elements, whereas a purchase order contains a different set of data elements. You must create a different learning instance for each of these document types, using the following steps:
Procedure
Next steps
After the Classifier finishes sorting the documents, you are redirected to the Designer, where you train bots to extract data from each sample document. See Train a learning instance.