Review extraction service

  • Updated: 2022/01/27

    After you confirm that the documents you want to extract content from are standard forms, choose the standard forms extraction service that fits your requirements.

    The Artificial Intelligence (AI) services supported for content extraction from standard forms are:

    IQ Bot extraction service

    The extraction service available in IQ Bot is built on various OCR engines, such as Microsoft Azure, Google Vision API, ABBYY FineReader Engine, and so on. Use this extraction service to extract content from standard forms.

    Guidelines for using IQ Bot extraction service
    • Documents are of good quality (300 dpi)
    • Content in the documents is not dense
    • Documents do not contain handwritten content (limited support)
    • Documents do not contain signatures
    • Tables have a simple layout (contained within a single page) with clear headers, table boundaries, and so on
    • Tables and content do not contain checkboxes (limited support)
    • Documents do not contain repeated sections (limited support)
    Benefits of IQ Bot extraction service
    • An integrated and simple out-of-the-box setup
    • Various OCR engines to increase accuracy of extraction
    • Complex layouts (repeated sections, continuous tables, and so on) can be extracted in specific cases (needs testing)
    • Requires only an IQ Bot license

    Microsoft Azure Form Recognizer service

    Provides an Artificial Intelligence (AI) service that is well suited for extracting content from standard forms. You can create custom models by labeling documents and training the model on them.

    Guidelines for using Microsoft Azure Form Recognizer service

    • Input documents:
      • can be dense (contain a lot of details and information) and should be of reasonable quality (>200 dpi)
      • can contain checkboxes and radio buttons
      • can have handwritten content
      • can contain signatures
      • can contain tables

        Tables that fit within a single page are supported. However, if the standard forms contain tables that span multiple pages, content extraction can fail.

    • No sections in the input documents are repeated
    • Documents can contain transposed tables

    Benefits of Microsoft Azure Form Recognizer service

    • Diverse standard form type documents can be processed
    • The auto-detection feature can identify different types of tables, such as header-less tables, inverted tables, and so on
    • Good support for handwritten forms
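
    The guidelines above can be read as a checklist. The following is a minimal sketch in Python, not part of IQ Bot or the Azure SDK: the DocumentProfile class, the function names, and the order of preference (IQ Bot first, because it requires only an IQ Bot license) are assumptions made for illustration. Items marked "limited support" are treated as concerns to review, not as hard failures.

```python
from dataclasses import dataclass

IQ_BOT = "IQ Bot extraction service"
FORM_RECOGNIZER = "Microsoft Azure Form Recognizer service"


@dataclass
class DocumentProfile:
    """Hypothetical summary of a set of standard forms (not an IQ Bot type)."""
    dpi: int
    dense: bool = False
    handwritten: bool = False
    signatures: bool = False
    checkboxes: bool = False
    repeated_sections: bool = False
    multi_page_tables: bool = False


def iq_bot_concerns(doc: DocumentProfile) -> list[str]:
    """List the IQ Bot guidelines that the documents do not meet."""
    checks = [
        (doc.dpi < 300, "quality below 300 dpi"),
        (doc.dense, "dense content"),
        (doc.handwritten, "handwritten content (limited support)"),
        (doc.signatures, "signatures"),
        (doc.checkboxes, "checkboxes (limited support)"),
        (doc.repeated_sections, "repeated sections (limited support)"),
        (doc.multi_page_tables, "tables that span multiple pages"),
    ]
    return [reason for failed, reason in checks if failed]


def form_recognizer_concerns(doc: DocumentProfile) -> list[str]:
    """List the Form Recognizer guidelines that the documents do not meet."""
    checks = [
        (doc.dpi <= 200, "quality not above 200 dpi"),
        (doc.repeated_sections, "repeated sections"),
        (doc.multi_page_tables, "tables that span multiple pages"),
    ]
    return [reason for failed, reason in checks if failed]


def suggest_service(doc: DocumentProfile) -> str:
    """Suggest a service; the preference order is an assumption of this sketch."""
    if not iq_bot_concerns(doc):
        return IQ_BOT
    if not form_recognizer_concerns(doc):
        return FORM_RECOGNIZER
    return "Needs testing: neither set of guidelines is fully met"


if __name__ == "__main__":
    scanned_form = DocumentProfile(dpi=250, handwritten=True, checkboxes=True)
    print(suggest_service(scanned_form))
    print(iq_bot_concerns(scanned_form))
```

    A result of "Needs testing" does not mean extraction will fail. It means that at least one guideline for each service is not met, so run a trial with sample documents before you commit to a service.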