IQ Bot 11.x: List of languages in IQ Bot
- Last updated: 2022/08/30
There are up to 30 languages available in IQ Bot. You can also access up to 190 languages by using an OCR engine.
When you review the list of languages in IQ Bot, you will observe the following:
- Some languages are listed multiple times as variants, for example, Norwegian, Norwegian (Bokmal), Norwegian (Nynorsk).
- Of the languages written from right to left (such as Arabic, Aramaic, Azeri, Divehi, Fula, Hebrew, Kurdish, N'ko, Persian, Rohingya, Syriac, and Urdu), Arabic is currently supported in IQ Bot.
- For languages not in the IQ Bot UI by default:
  - These languages rely on ABBYY FineReader Engine 12.2 for text segmentation and OCR, and then on IQ Bot for classification, extraction, and autocorrection.
  - Contact your Cognitive Services or Sales Engineering representative to create IQ Bot custom domains that give access to these languages.
  - In the SQL database and .json file, IQ Bot requires language codes for 160 of the additional languages to appear in the UI, and culture codes to allow numeric and date validation.
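To illustrate why a culture code matters for numeric and date validation, here is a minimal sketch in Python. It is not IQ Bot's implementation: the `CULTURES` table, its three entries, and the helper names `parse_number` and `parse_date` are all hypothetical and do not reflect the actual contents of the SQL database or .json file.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Hypothetical culture table for illustration only. These entries are
# assumptions and are not taken from IQ Bot's database or .json file.
CULTURES = {
    "en-US": {"decimal": ".", "group": ",", "date": "%m/%d/%Y"},
    "de-DE": {"decimal": ",", "group": ".", "date": "%d.%m.%Y"},
    "fr-FR": {"decimal": ",", "group": " ", "date": "%d/%m/%Y"},
}


def parse_number(text, culture):
    """Return a Decimal if text is a valid number for the culture, else None."""
    rules = CULTURES[culture]
    # Drop the grouping separator first, then normalize the decimal separator.
    normalized = text.strip().replace(rules["group"], "")
    normalized = normalized.replace(rules["decimal"], ".")
    try:
        return Decimal(normalized)
    except InvalidOperation:
        return None


def parse_date(text, culture):
    """Return a date if text matches the culture's date pattern, else None."""
    try:
        return datetime.strptime(text.strip(), CULTURES[culture]["date"]).date()
    except ValueError:
        return None


if __name__ == "__main__":
    print(parse_number("1.234,56", "de-DE"))
    print(parse_date("30.08.2022", "de-DE"))
    print(parse_date("30/08/2022", "en-US"))
```

The same extracted string can be valid under one culture and invalid (or a different value) under another, which is why a culture code has to accompany the language code before validation can be applied.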
Note:
- With the ABBYY FineReader Engine and the Microsoft Azure Computer Vision OCR engine, IQ Bot uses the engine's own text segmentation and OCR.
- With the Microsoft Azure Computer Vision OCR engine, users can select any language from the IQ Bot drop-down list, but the API attempts to auto-detect the language during processing and overrides the user's selection.
Supported OCR engines
The following table provides you with links to supported languages for all IQ Bot supported OCR engines except Tesseract4 OCR:
IQ Bot supported OCR engines | Supported languages by OCR |
---|---|
Tesseract4 OCR | See table below for list of supported languages. |
ABBYY FineReader Engine | ABBYY FineReader Engine OCR supported languages |
Microsoft Azure Computer Vision OCR engine | https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/language-support |
Google Vision API | https://cloud.google.com/vision/docs/languages |
Tegaki API | |
The following table lists the languages supported by Tesseract4 OCR and by IQ Bot when you select Other in the Document type field.
OCR language | Tesseract4 | IQ Bot Other domain |
---|---|---|
English | X | X |
Abkhaz | ||
Adyghe | ||
Afrikaans | X | X |
Agul | ||
Albanian | ||
Altaic | ||
Armenian (Eastern) | ||
Armenian (Grabar) | ||
Armenian (Western) | ||
Avar | ||
Aymara | ||
Bashkir | ||
Basque | ||
Belarusian | ||
Bemba | ||
Blackfoot | ||
Breton | ||
Bugotu | ||
Bulgarian | X | X |
Burmese (technical preview) | ||
Buryat | ||
Catalan | X | X |
Chamorro | ||
Chechen | ||
Chinese (Simplified) | X | X |
Chinese (Traditional) | X | X |
Chukcha | ||
Chuvash | ||
Corsican | ||
Crimean Tatar | ||
Croatian | ||
Crow | ||
Czech | X | X |
Danish | X | X |
Dargwa | ||
Dungan | ||
Dutch | ||
Dutch (Netherlands) | ||
Dutch (Belgium) or Flemish | X | X |
Eskimo (Cyrillic) | ||
Eskimo (Latin) | ||
Esperanto | ||
Estonian | ||
Even | ||
Evenki | ||
Faeroese | ||
Fijian | ||
Finnish | ||
French | X | X |
Frisian | ||
Friulian | ||
Scottish Gaelic | ||
Gagauz | ||
Galician | ||
Ganda | ||
German | X | X |
German (new spelling) | ||
German (Luxembourg) | ||
Greek | X | X |
Guarani | ||
Hani | ||
Hausa | ||
Hawaiian | ||
Hungarian | X | X |
Icelandic | ||
Ido | ||
Indonesian | X | X |
Interlingua | ||
Irish | ||
Italian | X | X |
Japanese | X | X |
Kabardian | ||
Kalmyk | ||
Karachay-Balkar | ||
Karakalpak | ||
Kasub | ||
Kawa | ||
Kazakh | ||
Khakas | ||
Khanty | ||
Kikuyu | ||
Kirghiz | ||
Kongo | ||
Korean | X | X |
Korean (Hangul) | ||
Koryak | ||
Kpelle | ||
Kumyk | ||
Lak | ||
Sami (Lappish) | ||
Latin | X | X |
Latvian | ||
Latvian language written in Gothic script | ||
Lezgin | ||
Lithuanian | ||
Luba | ||
Macedonian | ||
Malagasy | ||
Malay | X | X |
Malinke | ||
Maltese | ||
Mansi | ||
Maori | ||
Mari | ||
Maya | ||
Miao | ||
Minangkabau | ||
Russian and English | ||
Mohawk | ||
Mongol | ||
Mordvin | ||
Nahuatl | ||
Nenets | ||
Nivkh | ||
Nogay | ||
NorwegianNynorsk and NorwegianBokmal | ||
Norwegian | X | X |
Norwegian (Bokmal) | ||
Norwegian (Nynorsk) | ||
Nyanja | ||
Occidental | ||
Ojibway | ||
Old English | ||
Old French | ||
Old German | ||
Old Italian | ||
Old Slavonic | ||
Old Spanish | ||
Ossetian | ||
Papiamento | ||
Tok Pisin | ||
Polish | X | X |
Portuguese | X | X |
Portuguese (Brazil) | ||
Portuguese (Portugal) | ||
Provencal | ||
Quechua | ||
Rhaeto-Romanic | ||
Romanian | X | X |
Romanian (Moldavia) | ||
Romany | ||
Ruanda | ||
Rundi | ||
Russian (old spelling) | ||
Russian | X | X |
Russian (with accents marking stress position) | ||
Samoan | ||
Selkup | ||
Serbian | X | X |
Serbian (Cyrillic) | ||
Serbian (Latin) | ||
Shona | ||
Sioux (Dakota) | ||
Slovak | X | X |
Slovenian | ||
Somali | ||
Sorbian | ||
Sotho | ||
Spanish | X | X |
Sunda | ||
Swahili | ||
Swazi | ||
Swedish | X | X |
Tabassaran | ||
Tagalog | ||
Tahitian | ||
Tajik | ||
Tatar | ||
Thai | ||
Jingpo | ||
Tongan | ||
Tswana | ||
Tun | ||
Turkish | X | X |
Turkmen | ||
Turkmen (Latin) | ||
Tuvan | ||
Udmurt | ||
Uighur (Cyrillic) | ||
Uighur (Latin) | ||
Ukrainian | ||
Uzbek (Cyrillic) | ||
Uzbek (Latin) | ||
Vietnamese | ||
Cebuano | ||
Welsh | ||
Wolof | ||
Xhosa | ||
Yakut | ||
Yiddish | ||
Zapotec | ||
Zulu | ||
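If you keep a plain-text copy of the table above, the rows can be checked programmatically. The following is a minimal sketch, assuming the rows are pipe-delimited exactly as shown here and that an `X` marks support; the function name `parse_support_rows` is hypothetical and is not part of IQ Bot.

```python
def parse_support_rows(rows):
    """Map each language name to a (tesseract4, other_domain) pair of booleans."""
    support = {}
    for row in rows:
        cells = [cell.strip() for cell in row.split("|")]
        name = cells[0]
        # Skip blank lines and the "---|---|---|" separator row.
        if not name or set(name) <= {"-"}:
            continue
        # Pad so that rows with missing trailing cells still yield two marks.
        marks = cells[1:3] + ["", ""]
        support[name] = (marks[0] == "X", marks[1] == "X")
    return support


if __name__ == "__main__":
    # A few data rows copied from the table above (header row omitted).
    sample = [
        "---|---|---|",
        "English | X | X |",
        "Abkhaz | ||",
        "Norwegian | X | X |",
        "Norwegian (Bokmal) | ||",
    ]
    table = parse_support_rows(sample)
    print(sorted(name for name, (tess, _) in table.items() if tess))
```

Pass only the data rows, because the sketch does not distinguish the header row from a language entry.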
Tip: If you are unable to see all languages in the IQ Bot UI or if IQ Bot is unable to extract data from multiple languages in a document, troubleshoot the issue:
Unable to extract data from Multiple languages in a document (A-People login required)