IQ Bot 11.x: List of languages in IQ Bot

There are up to 30 languages available in IQ Bot. You can also access up to 190 languages by using an OCR engine.

When you review the list of languages in IQ Bot, you will observe the following:
  • Some languages are listed multiple times as variants, for example, Norwegian, Norwegian (Bokmal), Norwegian (Nynorsk).
  • Of the languages written from right to left, such as Arabic, Aramaic, Azeri, Divehi, Fula, Hebrew, Kurdish, N'ko, Persian, Rohingya, Syriac, and Urdu, only Arabic is currently supported in IQ Bot.
  • For languages not in the IQ Bot UI by default:
    • These rely on ABBYY FineReader Engine 12.2 for text segmentation and OCR, then IQ Bot for classification, extraction, and autocorrection.
    • Contact your Cognitive Services or Sales Engineering representative to create IQ Bot custom domains to access these languages.
    • For 160 of the additional languages, IQ Bot requires language codes in the SQL database and .json file so that the languages appear in the UI, and culture codes so that numeric and date validation works.
Note:
  • With the ABBYY FineReader Engine and the Microsoft Azure Computer Vision OCR engine, IQ Bot uses the engine's own text segmentation and OCR.
  • With the Microsoft Azure Computer Vision OCR engine, users can select any language from the IQ Bot drop-down list, but the API attempts to auto-detect the language during processing and overrides the user's selection.
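To illustrate why a culture code matters for numeric validation, here is a minimal Python sketch. It is not IQ Bot's implementation and does not reflect the actual SQL or .json schema. The `SEPARATORS` table and the `parse_number` function are hypothetical names introduced only for this example. The sketch shows that the same extracted string is read as a different number depending on the culture it is validated against.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical lookup: culture code -> (thousands separator, decimal separator).
# These two entries follow the usual conventions for en-US and de-DE.
SEPARATORS = {
    "en-US": (",", "."),
    "de-DE": (".", ","),
}


def parse_number(text, culture):
    """Parse an extracted numeric string using the separators of a culture.

    Returns a Decimal, or None if the text is not a valid number after
    the culture's separators have been normalized.
    """
    thousands, decimal_sep = SEPARATORS[culture]
    # Drop grouping separators first, then normalize the decimal separator.
    normalized = text.strip().replace(thousands, "").replace(decimal_sep, ".")
    try:
        return Decimal(normalized)
    except InvalidOperation:
        return None


if __name__ == "__main__":
    # The same string yields different values under different cultures.
    print(parse_number("1.234,56", "de-DE"))
    print(parse_number("1.234,56", "en-US"))
```

Without the correct culture code, a value such as 1.234,56 would either fail validation or be read as a different number, which is why the codes are needed in addition to the language code.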

Supported OCR engines

The following table provides you with links to supported languages for all IQ Bot supported OCR engines except Tesseract4 OCR:
IQ Bot supported OCR engine | Supported languages by OCR
Tesseract4 OCR | See table below for list of supported languages.
ABBYY FineReader Engine | ABBYY FineReader Engine OCR supported languages
Microsoft Azure Computer Vision OCR engine | https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/language-support
Google Vision API | https://cloud.google.com/vision/docs/languages
Tegaki API | Japanese; Korean; Japanese - English; Korean - English
The following table lists OCR languages. An X marks the languages supported by Tesseract4 OCR and the languages available in IQ Bot when you select Other in the Document type field.
OCR language Tesseract4 IQ Bot (Other domain)
English X X
Abkhaz
Adyghe
Afrikaans X X
Agul
Albanian
Altaic
Armenian (Eastern)
Armenian (Grabar)
Armenian (Western)
Avar
Aymara
Bashkir
Basque
Belarusian
Bemba
Blackfoot
Breton
Bugotu
Bulgarian X X
Burmese (technical preview)
Buryat
Catalan X X
Chamorro
Chechen
Chinese (Simplified) X X
Chinese (Traditional) X X
Chukcha
Chuvash
Corsican
Crimean Tatar
Croatian
Crow
Czech X X
Danish X X
Dargwa
Dungan
Dutch
Dutch (Netherlands)
Dutch (Belgium) or Flemish X X
Eskimo (Cyrillic)
Eskimo (Latin)
Esperanto
Estonian
Even
Evenki
Faeroese
Fijian
Finnish
French X X
Frisian
Friulian
Scottish Gaelic
Gagauz
Galician
Ganda
German X X
German (new spelling)
German (Luxembourg)
Greek X X
Guarani
Hani
Hausa
Hawaiian
Hungarian X X
Icelandic
Ido
Indonesian X X
Interlingua
Irish
Italian X X
Japanese X X
Kabardian
Kalmyk
Karachay-Balkar
Karakalpak
Kasub
Kawa
Kazakh
Khakas
Khanty
Kikuyu
Kirghiz
Kongo
Korean X X
Korean (Hangul)
Koryak
Kpelle
Kumyk
Lak
Sami (Lappish)
Latin X X
Latvian
Latvian language written in Gothic script
Lezgin
Lithuanian
Luba
Macedonian
Malagasy
Malay X X
Malinke
Maltese
Mansi
Maori
Mari
Maya
Miao
Minangkabau
Russian and English
Mohawk
Mongol
Mordvin
Nahuatl
Nenets
Nivkh
Nogay
NorwegianNynorsk and NorwegianBokmal
Norwegian X X
Norwegian (Bokmal)
Norwegian (Nynorsk)
Nyanja
Occidental
Ojibway
Old English
Old French
Old German
Old Italian
Old Slavonic
Old Spanish
Ossetian
Papiamento
Tok Pisin
Polish X X
Portuguese X X
Portuguese (Brazil)
Portuguese (Portugal)
Provencal
Quechua
Rhaeto-Romanic
Romanian X X
Romanian (Moldavia)
Romany
Ruanda
Rundi
Russian (old spelling)
Russian X X
Russian (with accents marking stress position)
Samoan
Selkup
Serbian X X
Serbian (Cyrillic)
Serbian (Latin)
Shona
Sioux (Dakota)
Slovak X X
Slovenian
Somali
Sorbian
Sotho
Spanish X X
Sunda
Swahili
Swazi
Swedish X X
Tabassaran
Tagalog
Tahitian
Tajik
Tatar
Thai
Jingpo
Tongan
Tswana
Tun
Turkish X X
Turkmen
Turkmen (Latin)
Tuvan
Udmurt
Uighur (Cyrillic)
Uighur (Latin)
Ukrainian
Uzbek (Cyrillic)
Uzbek (Latin)
Vietnamese
Cebuano
Welsh
Wolof
Xhosa
Yakut
Yiddish
Zapotec
Zulu
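The table above can be turned into a simple lookup. The following Python sketch transcribes the rows marked X in both columns. The names `MARKED_LANGUAGES` and `is_marked` are hypothetical and are not part of IQ Bot or any of its APIs. The sketch assumes that a row without an X is not available through Tesseract4 OCR or the Other domain.

```python
# Rows marked "X X" in the table above (Tesseract4 and IQ Bot Other domain).
MARKED_LANGUAGES = frozenset({
    "English", "Afrikaans", "Bulgarian", "Catalan",
    "Chinese (Simplified)", "Chinese (Traditional)", "Czech", "Danish",
    "Dutch (Belgium) or Flemish", "French", "German", "Greek",
    "Hungarian", "Indonesian", "Italian", "Japanese", "Korean", "Latin",
    "Malay", "Norwegian", "Polish", "Portuguese", "Romanian", "Russian",
    "Serbian", "Slovak", "Spanish", "Swedish", "Turkish",
})

# Case-insensitive index so that "french" and "French" both match.
_NORMALIZED = frozenset(name.casefold() for name in MARKED_LANGUAGES)


def is_marked(language):
    """Return True if the language row is marked X in the table above."""
    return language.strip().casefold() in _NORMALIZED


if __name__ == "__main__":
    for name in ("French", "Welsh", "Norwegian (Bokmal)"):
        print(name, is_marked(name))
```

The match is on the exact row name, so variants such as Norwegian (Bokmal) are treated separately from Norwegian, as they are in the table.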
Tip: If you cannot see all languages in the IQ Bot UI, or if IQ Bot cannot extract data from multiple languages in a document, see the following troubleshooting article:

Unable to extract data from Multiple languages in a document (A-People login required)