Optical Character Recognition (OCR) technology became popular in the early 1990s during the digitization of historical newspapers. Before that, the only way to digitize documents and extract data from them was to retype the text manually, a tedious, time-consuming, and error-prone process. OCR replaced this manual document processing and is now most commonly used to convert hard copy documents into an editable format.
What is Optical Character Recognition (OCR)?
OCR technology automates the extraction of printed or written text from a scanned document or image file. It converts that text into a machine-readable form that can be used for downstream data processing, such as editing or search.
For example, when you scan a form or receipt, your handheld device or computer saves the scan as an image file. Suppose you need to edit, search, or count the words in the image file. You cannot use a text editor to do that. However, you can use OCR technology to convert the image into a text document with its contents stored as text data.
OCR systems combine hardware and software to convert physical documents into machine-readable text. The hardware, such as an optical scanner or specialized circuit board, copies or reads the text, while the software handles the advanced processing.
Why does OCR fail?
Today, OCR technology has undergone several improvements and can deliver fairly accurate output. Many businesses depend on solutions built on OCR technology for document processing.
As a traditional tool that converts the data on a printed document or an image into a digitized format, OCR is a better alternative to manual processes. It works well at extracting text from documents like paper files, passports, invoices, business cards, printouts, letters, and images.
Powerful as it is, OCR is not perfect. Because data errors are likely to creep in, the output from OCR-based data extraction solutions may not always be usable by downstream enterprise business processes.
Even with the best-quality scanners, OCR-based solutions can struggle to deliver more than about 60% accuracy on complex documents. Business users then spend more time manually correcting the extracted data than OCR saved in extracting it.
OCR often fails because it…
- Can extract data, but not context
- Is unable to comprehend complex data, such as tables without borders or headers
- Cannot process documents in a variety of formats
- Sometimes ignores varying font sizes in the same line
- Cannot decipher black gaps, garbage values, and handwritten notes
- Cannot interpret checkboxes, groups of checkboxes, or radio buttons
- Cannot interpret tables, paragraphs, or sections
When the going gets tough, OCR does not get going
OCR-based automated document processing solutions cannot deliver accurate straight-through processing (STP) because they rely on templates. Documents must arrive in specific formats that conform to certain rules, or OCR cannot extract data from them. Now, imagine a complex organization that deals with a large volume and variety of documents every day. OCR-based solutions will fail to deliver in that situation.
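To make the template limitation concrete, here is a minimal Python sketch. It assumes that a "template" is simply a regular expression tuned to one invoice layout. The pattern, field names, and sample strings are hypothetical and stand in for whatever rules a real template-based product uses.

```python
import re

# Hypothetical template: a regex tuned to a single vendor's invoice layout.
TEMPLATE = re.compile(
    r"Invoice No:\s*(?P<number>\S+)\s+Total:\s*(?P<total>[\d.,]+)"
)


def extract_with_template(text):
    """Return the fields the template finds, or None if the layout does not match."""
    match = TEMPLATE.search(text)
    return match.groupdict() if match else None


# Same information, two layouts (both strings are made-up OCR output).
known_layout = "Invoice No: INV-001\nTotal: 1,250.00"
other_layout = "Amount due 1,250.00 for invoice #INV-001"

print(extract_with_template(known_layout))  # fields are found
print(extract_with_template(other_layout))  # layout differs, nothing is extracted
```

The second document carries the same invoice number and total, but because its layout does not follow the template's rules, nothing is extracted from it.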
Extracting data from semi-structured, unstructured, and handwritten documents is tough territory for pure OCR-based solutions, and this makes them unsuitable for enterprise-grade implementation and rapid scaling.
The most significant challenge for OCR-based document processing solutions is their inability to extract context from the content. For example, if a number extracted from a table does not contain a quantifying unit (such as currency), it fails to convey the true value of that data. Once again, business users might have to spend time looking for the missing pieces of information in the original document to add value to the extracted data.
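The missing-unit problem can be illustrated with a small Python sketch. The `ExtractedAmount` class and its fields are hypothetical, chosen only to show how a value without a currency ends up back in a manual review queue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedAmount:
    """A number read from a table cell (hypothetical structure for illustration)."""
    value: float
    currency: Optional[str] = None  # None means no unit was captured with the number

    def needs_review(self) -> bool:
        # Without a currency, the number alone does not convey its true value.
        return self.currency is None


# Two made-up cells: one extracted with its unit, one without.
cells = [ExtractedAmount(1250.0, "USD"), ExtractedAmount(1250.0)]
to_review = [cell for cell in cells if cell.needs_review()]

print(f"{len(to_review)} of {len(cells)} values need a manual look-up")
```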
The impact of OCR errors – Accounts Payable (AP)
- Average number of characters in an invoice: 2,500
- Average time an employee takes to find and fix a data error: 3 seconds
- With a 95 percent accurate OCR, characters that need manual re-checks per invoice: 125
- Time taken by an employee to manually fix an invoice: 6 minutes and 15 seconds
- The cost to manually correct a single invoice at $25 an hour: about $2.60
- Annual cost of manually correcting 10,000 scanned invoices: about $26,000
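The estimate above can be reproduced with a short Python sketch. It uses only the figures from the list. The function and variable names are illustrative. Seconds are converted to hours directly (6 minutes 15 seconds is 6.25 minutes), which gives roughly $2.60 per invoice.

```python
# Inputs taken from the accounts-payable list above.
CHARS_PER_INVOICE = 2_500
OCR_ACCURACY = 0.95
SECONDS_PER_FIX = 3
HOURLY_RATE_USD = 25
INVOICES_PER_YEAR = 10_000


def correction_cost(chars, accuracy, seconds_per_fix, hourly_rate):
    """Return (characters to re-check, seconds spent, cost in USD) for one invoice."""
    chars_to_check = round(chars * (1 - accuracy))
    seconds = chars_to_check * seconds_per_fix
    cost = seconds / 3600 * hourly_rate  # convert seconds to hours before pricing
    return chars_to_check, seconds, cost


chars_to_check, seconds, cost = correction_cost(
    CHARS_PER_INVOICE, OCR_ACCURACY, SECONDS_PER_FIX, HOURLY_RATE_USD
)
minutes, rem = divmod(seconds, 60)
annual_cost = cost * INVOICES_PER_YEAR

print(f"Characters to re-check per invoice: {chars_to_check}")
print(f"Time per invoice: {minutes} min {rem} s")
print(f"Cost per invoice: ${cost:.2f}")
print(f"Annual cost: ${annual_cost:,.0f}")
```

Changing `OCR_ACCURACY` or `SECONDS_PER_FIX` shows how sensitive the annual figure is to each assumption.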
Intelligent Document Processing can manage scale and complexity
Intelligent Document Processing (IDP) solutions fill in all the gaps left by OCR technology, and help businesses conquer challenges of scale and complexity in data extraction.
IDP solutions combine the power of advanced cognitive technologies including Artificial Intelligence, OCR, Machine Learning and Deep Learning to process a wide variety of documents. They not only recognize, learn, and capture the content, but also deliver valuable business context. These solutions convert data to a structured form that can easily be processed by integrated downstream business systems.
JIFFY.ai’s IDP solution runs on a hybrid processing engine with self-learning machine models. This makes the system capable of handling large and changing volumes of documents, vendors, and formats. It extracts data accurately and quickly from multiple OCRs, fields and values, checkboxes and images, different formats, complex tables, handwritten text, address fields, camera images, various ID cards, driving licenses, receipts, and much more. Enterprise teams can therefore use it to derive actionable business insights from the data faster and more efficiently.