Guidelines for extracting data from handwritten documents
- Updated: 2025/01/16
When you extract data from handwritten documents, expect lower data extraction accuracy than with typed or printed documents.
The lower data extraction accuracy is typically due to the following reasons:
- Inconsistent character shapes and sizes
- Variable spacing between words and letters
- Overlapping or connected characters
- Use of different types of inks and papers
- Use of abbreviations or slang
- Smudges and corrections
- Text placement that does not follow standard formatting
Before you extract data from handwritten documents, ensure that you adhere to the following guidelines:
- Ensure that you use Google Vision OCR or Standard Forms instead of ABBYY OCR.
- If you have enabled the Generative AI-driven data extraction option, ensure that you use vision-powered generative AI models. See Vision-powered generative AI data extraction.
- If possible, use the following recommended settings when scanning and saving documents:
  - Scan documents at a high resolution (for example, 300 DPI).
  - Scan documents in grayscale or color settings.
  - Avoid using aggressive compression when saving documents.
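
The scanning recommendations above can be checked before documents are uploaded. The following Python sketch is illustrative only and is not part of the product: the `ScanSettings` class, the `check_scan_settings` function, and the JPEG quality threshold of 75 are assumptions introduced for this example. Only the 300 DPI example value and the grayscale or color recommendation come from the guidelines.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_DPI = 300                            # example value from the guidelines
ACCEPTED_MODES = {"grayscale", "color"}  # modes recommended by the guidelines
MIN_JPEG_QUALITY = 75                    # assumption: illustrative threshold only


@dataclass
class ScanSettings:
    """Hypothetical container for the settings used to scan and save a file."""
    dpi: int
    color_mode: str                      # for example "grayscale", "color", or "bw"
    jpeg_quality: Optional[int] = None   # None means lossless or uncompressed


def check_scan_settings(settings: ScanSettings) -> List[str]:
    """Return a list of warnings; an empty list means no issue was found."""
    warnings = []
    if settings.dpi < MIN_DPI:
        warnings.append(f"DPI {settings.dpi} is below {MIN_DPI}")
    if settings.color_mode.lower() not in ACCEPTED_MODES:
        warnings.append(
            f"color mode '{settings.color_mode}' is not grayscale or color"
        )
    if (settings.jpeg_quality is not None
            and settings.jpeg_quality < MIN_JPEG_QUALITY):
        warnings.append(
            f"JPEG quality {settings.jpeg_quality} suggests aggressive compression"
        )
    return warnings


if __name__ == "__main__":
    for message in check_scan_settings(ScanSettings(150, "bw", 40)):
        print("Warning:", message)
```

A check like this could run as a pre-upload step so that low-resolution, black-and-white, or heavily compressed scans are rescanned before extraction. Adjust the thresholds to match your scanner and document types.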