What is Optical Character Recognition (OCR)?

Optical character recognition, often abbreviated to OCR, is a method of data extraction used by document capture software. It allows a computer to “read” documents and convert their contents into indexable, searchable text.

How does PDF OCR work?

Once a document is scanned, it can be passed through an OCR software package and displayed to you in a searchable format. The software does this by using recognition algorithms to identify and index the letters and numbers on the page. Once this is done, you can search for specific information in the document in order to identify or retrieve it.

For example, you might have a series of documents that each contain a six-digit invoice number. You might want to index the invoice numbers and save each document with its invoice number as the filename. Once set up, optical character recognition technology can automate this for you.
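As a rough illustration of the invoice example, the Python sketch below picks a six-digit number out of text that an OCR engine has already produced and builds a new filename from it. The function names and the "first standalone six-digit number" rule are assumptions made for this example, not the behaviour of any particular capture product.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: an invoice number is exactly six digits and is not part of a
# longer run of digits (so a seven-digit reference is ignored).
INVOICE_PATTERN = re.compile(r"(?<!\d)\d{6}(?!\d)")


def find_invoice_number(ocr_text: str) -> Optional[str]:
    """Return the first standalone six-digit number in the text, or None."""
    match = INVOICE_PATTERN.search(ocr_text)
    return match.group(0) if match else None


def invoice_filename(original_name: str, ocr_text: str) -> str:
    """Build a filename from the invoice number, keeping the file extension."""
    number = find_invoice_number(ocr_text)
    if number is None:
        # No invoice number recognised: leave the filename unchanged.
        return original_name
    return number + Path(original_name).suffix
```

In practice a zone or a label such as "Invoice No" would usually be used to narrow the search, since a page can contain other six-digit numbers.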

Are there different types of PDF OCR?

Yes, there are differing levels of OCR capability, depending on the software you select. These days, most document scanners provide some basic level of optical character recognition within their bundled capture software. The main types are:

  • Barcode OCR – Ability to read numbers and information within barcodes
  • Zonal OCR – Ability to draw a box around a part of a document and read that section (useful for scanning documents of the same format)
  • Full Text OCR – Reads the whole page and indexes every character (very useful for archivists)

For example, Epson’s Document Capture Pro is supplied with all Epson WorkForce document scanners, whilst the Fujitsu ScanSnap range is bundled with ABBYY FineReader for ScanSnap. Document Capture Pro and ABBYY FineReader offer both zonal and barcode OCR.
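To make the idea of zonal OCR more concrete, here is a minimal Python sketch. It assumes an OCR engine has already returned each recognised word with a bounding box in pixels, and it keeps only the words whose centre falls inside a box drawn on the page. The `Word` and `Zone` classes and both functions are hypothetical names for this example and do not reflect how Document Capture Pro or ABBYY FineReader work internally.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One recognised word and its bounding box, in pixels."""
    text: str
    left: int
    top: int
    width: int
    height: int


@dataclass
class Zone:
    """The rectangle drawn on the page, in the same pixel coordinates."""
    left: int
    top: int
    right: int
    bottom: int


def words_in_zone(words: List[Word], zone: Zone) -> List[Word]:
    """Keep the words whose centre point lies inside the zone."""
    selected = []
    for word in words:
        centre_x = word.left + word.width / 2
        centre_y = word.top + word.height / 2
        if zone.left <= centre_x <= zone.right and zone.top <= centre_y <= zone.bottom:
            selected.append(word)
    return selected


def read_zone(words: List[Word], zone: Zone) -> str:
    """Return the text in the zone, ordered top to bottom, then left to right."""
    hits = words_in_zone(words, zone)
    # Simplification: assumes words on the same line share the same top value.
    hits.sort(key=lambda w: (w.top, w.left))
    return " ".join(w.text for w in hits)
```

Because the zone is defined once in page coordinates, the same box can be reused for every document that shares the layout, which is why zonal OCR suits batches of same-format documents.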

Full Text OCR

Full Text OCR is a premium scanning feature, so it is not usually provided as a free tool. Whilst it creates larger files, since the recognised text is stored alongside the scanned image, it can be a very powerful tool for certain businesses.
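The benefit of full text OCR comes from being able to search every word on every page. The Python sketch below shows one simple way this could work: an inverted index that maps each word to the documents containing it. It assumes the page text has already been recognised, and the function names and word-splitting rule are illustrative choices rather than a description of any particular product.

```python
import re
from collections import defaultdict
from typing import Dict, Set


def tokenize(text: str) -> list:
    """Split text into lower-case runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map every word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for token in tokenize(text):
            index[token].add(name)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return the documents that contain every word in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results
```

For example, after indexing two scanned plans, a query such as "mill lane" would return only the files in which both words were recognised.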

FileDirector offers a Full Text OCR module as an optional extra. This is ideal for libraries, archives and other organisations that need to store historical documents or records for years at a time. These often include architects and engineering firms, lawyers and solicitors, and medical practitioners.

If you would like to learn more about the FileDirector document management system and its capabilities, please feel free to contact us. We have a proven track record of improving document workflow management for businesses in varied sectors, saving them both time and money.