Whoa, whoa... first things first: what's OCR?
OCR is an acronym for Optical Character Recognition, a term used to describe software that can extract usable text from images (usually scans, faxes or photographs).
Up until this morning I was not aware that Microsoft Office included a tool called Microsoft Office Document Imaging (MODI), which contains OCR features that allow you to extract text from TIFF or MDI files.
MODI has been a part of the Office suite since Office XP was released in 2001. If you are using Office XP, 2003 or 2007, you may already have MODI installed. Check in the "tools" subfolder of the "Microsoft Office" directory in your Start menu. (see pic)
If MODI is not installed, grab your Office installation disc and select the "add/remove features" tab when it loads. MODI is listed under "tools" and needs to be set to "run from my computer". It should only take a few minutes to install.
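Once MODI is installed, its OCR engine can also be driven from a script rather than by hand. The following is only a rough sketch, not an official recipe. It assumes Windows with the pywin32 package available, that MODI registers the COM ProgID "MODI.Document", and that the language constant for English is 9. The function name extract_text and the file name scan.tif are made up for illustration.

```python
# Assumed value of miLANG_ENGLISH in MODI's language enumeration.
MI_LANG_ENGLISH = 9


def extract_text(path, dispatch=None):
    """Run MODI OCR on a TIFF or MDI file and return the text of each page."""
    if dispatch is None:
        # pywin32 is imported lazily because it only exists on Windows.
        from win32com.client import Dispatch as dispatch

    doc = dispatch("MODI.Document")  # assumed ProgID for the MODI document object
    doc.Create(path)                 # load the image file into the document
    try:
        # Arguments: language, auto-orient the image, straighten (deskew) the image.
        doc.OCR(MI_LANG_ENGLISH, True, True)
        images = doc.Images
        # Each page exposes its recognized text through Layout.Text.
        return [images.Item(i).Layout.Text for i in range(images.Count)]
    finally:
        doc.Close(False)             # close without saving OCR data back to the file
```

Calling extract_text(r"C:\scans\scan.tif") would then return a list with one string per page. The optional dispatch argument is there only so the function can be tried out with a stand-in object on a machine without MODI.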
For more information on Microsoft Office Document Imaging and how to use the OCR features, check out the following resources:
- About Microsoft Office Document Imaging
- About indexing text in TIFF and MDI files