Overcoming OCR Errors and Limitations with Intelligent Document Processing

Written by Sudhir Sen, VP of Products, JIFFY.ai | Updated on September 18, 2023

Optical Character Recognition (OCR) technology became popular in the early 1990s during the digitization of historical newspapers. Before that, the only way to digitize documents and extract data from them was to retype the text manually, a tedious, time-consuming, and error-prone process. OCR replaced this manual work and is now most often used to convert hard-copy documents into an editable format.

What is Optical Character Recognition (OCR)?

OCR technology automates the extraction of printed or handwritten text from a scanned document or image file and converts that text into a machine-readable form that downstream processes, such as editing or search, can use.

For example, when you scan a form or receipt, your handheld device or computer saves the scan as an image file. Suppose you need to edit, search, or count the words in the image file—you cannot use a text editor to do that. However, you can use OCR technology to convert the image into a text document with its contents stored as text data.

OCR systems combine hardware and software to convert physical documents into machine-readable text. The hardware includes an optical scanner or specialized circuit board that copies or reads the text, while the software handles the advanced processing.

Why does OCR fail?

Today, OCR technology has undergone several improvements and can deliver fairly accurate output. Many businesses depend on solutions built on OCR technology for document processing.

As a traditional tool that converts the data on a printed document or an image into a digitized format, OCR is a better alternative to manual processes. It works well on extracting text from documents like paper files, passports, invoices, business cards, printouts, letters, and images.

Powerful as it is, OCR is not perfect. Because data errors are likely to creep in, the output from OCR-based data extraction solutions may not always be usable for downstream enterprise business processes.

Even with the best-quality scanners, OCR-based solutions deliver a maximum accuracy of only 60%. Business users end up putting in more time to make manual corrections to the extracted data than the time OCR saved in extracting it.

OCR often fails because it…

Can extract data, but not context
Cannot comprehend complex data, such as tables without borders or headers
Cannot process documents in a variety of formats
Sometimes ignores varying font sizes on the same line
Cannot decipher black gaps, garbage values, or handwritten notes
Cannot interpret checkboxes, groups of checkboxes, or radio buttons
Cannot interpret tables, paragraphs, or sections

When the going gets tough, OCR does not get going

OCR-based automated document processing solutions cannot deliver straight-through processing (STP) with accuracy because they work based on templates. That means documents must be processed in specific formats conforming to certain rules or OCR cannot extract data from them. Now, imagine a complex organization that deals with a large volume and variety of documents every day. OCR-based solutions will fail to deliver in that situation.
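To make the template limitation concrete, here is a minimal Python sketch. It assumes a hypothetical template expressed as regular expressions for one vendor's invoice layout; the field names, patterns, and sample text are illustrative assumptions, not how any particular OCR product defines its templates.

```python
import re

# Hypothetical template for one invoice layout (illustrative assumption).
TEMPLATE = {
    "invoice_number": re.compile(r"^Invoice No:\s*(\S+)$", re.MULTILINE),
    "total": re.compile(r"^Total:\s*\$?([\d,]+\.\d{2})$", re.MULTILINE),
}

def extract_with_template(text):
    """Return each field's value, or None where the template does not match."""
    result = {}
    for field, pattern in TEMPLATE.items():
        match = pattern.search(text)
        result[field] = match.group(1) if match else None
    return result

# The layout the template was written for.
known_layout = "Invoice No: INV-1001\nTotal: $1,250.00"
# The same information, worded and ordered differently.
new_layout = "Amount due 1,250.00 USD\nRef# INV-1001"

print(extract_with_template(known_layout))
print(extract_with_template(new_layout))
```

The second document carries the same invoice number and amount, yet every field comes back empty because nothing in it conforms to the template's rules.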

Extracting data from semi-structured, unstructured, and handwritten documents is tough territory for pure OCR-based solutions, and this makes them unsuitable for enterprise-grade implementation and rapid scaling.

The most significant challenge for OCR-based document processing solutions is their inability to extract context from the content. For example, if a number extracted from a table does not contain a quantifying unit (such as currency), it fails to convey the true value of that data. Once again, business users might have to spend time looking for the missing pieces of information in the original document to add value to the extracted data.
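The missing-unit problem can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list of currency markers and the `needs_review` helper are hypothetical, and a production system would need far broader coverage.

```python
import re

# Assumed, deliberately short list of currency markers.
CURRENCY = re.compile(r"\$|\b(?:USD|EUR|GBP)\b")
NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")

def needs_review(cell):
    """True when a cell holds a number but no recognizable currency marker."""
    return bool(NUMBER.search(cell)) and not CURRENCY.search(cell)

# Hypothetical cells extracted from an invoice table.
cells = ["$1,250.00", "1,250.00", "980 EUR", "N/A"]
flagged = [c for c in cells if needs_review(c)]
print(flagged)
```

A check like this can only flag the value for a person to look up in the original document; it cannot supply the missing context by itself.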

The impact of OCR errors – Accounts Payable (AP)

Average number of characters in an invoice: 2,500
Average time an employee takes to find and fix a data error: 3 seconds
With a 95 percent accurate OCR, characters that need manual re-checks per invoice: 125
Time taken by an employee to manually fix an invoice: 6 minutes and 15 seconds
The cost to manually correct a single invoice at $25 an hour: about $2.60
Annual cost of manually correcting 10,000 scanned invoices: about $26,000
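These figures can be recomputed from the stated inputs with a short Python script. The function name and parameters are illustrative assumptions; only the input values come from the list above.

```python
def correction_cost(chars_per_invoice, accuracy, secs_per_fix, hourly_rate, invoices):
    """Estimate the manual correction effort and cost for OCR output."""
    # Characters the OCR is expected to get wrong on each invoice.
    errors = round(chars_per_invoice * (1 - accuracy))
    # Seconds an employee spends fixing one invoice.
    seconds = errors * secs_per_fix
    # Labor cost per invoice, then across the whole volume.
    cost_per_invoice = seconds / 3600 * hourly_rate
    return errors, seconds, cost_per_invoice, cost_per_invoice * invoices

errors, seconds, per_invoice, total = correction_cost(2500, 0.95, 3, 25.0, 10_000)
minutes, remainder = divmod(seconds, 60)
print(f"{errors} errors, {minutes} min {remainder} s per invoice")
print(f"${per_invoice:.2f} per invoice, ${total:,.0f} for all invoices")
```

Changing the `accuracy` argument shows how strongly the correction cost depends on OCR accuracy.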

Intelligent Document Processing can manage scale and complexity

Intelligent Document Processing (IDP) solutions fill in all the gaps left by OCR technology, and help businesses conquer challenges of scale and complexity in data extraction.

IDP solutions combine the power of advanced cognitive technologies including Artificial Intelligence, OCR, Machine Learning and Deep Learning to process a wide variety of documents. They not only recognize, learn, and capture the content, but also deliver valuable business context. These solutions convert data to a structured form that can easily be processed by integrated downstream business systems.

JIFFY.ai’s IDP solution runs on a hybrid processing engine with self-learning machine models. This makes the system capable of handling dynamic and large volumes of documents, vendors, and formats. It extracts data accurately and quickly from multiple OCRs, fields and values, checkboxes and images, different formats, complex tables, handwritten text, address fields, camera images, various ID cards, driving licenses, receipts, and much more. Enterprise teams can therefore derive actionable business insights from the data faster and more efficiently.

Unlock the potential of AI-powered transformation. Talk to one of our experts today.


Also read:

What Is Intelligent Document Processing?

Written by Sudhir Sen, VP of Products, JIFFY.ai | Updated on September 18, 2023

How can you guide your organization through digital transformation when approximately 80% of business data still exists in unstructured forms such as emails, images, and PDFs?

You need a tool that can quickly digitize all these documents with minimal manual effort. Intelligent document processing enables this and helps you automate document-related business processes at scale. Here’s how.

Intelligent document processing (IDP) is a set of tools powered by Artificial Intelligence (AI), Machine Learning (ML), Optical Character Recognition (OCR), and other technologies that convert unstructured, semi-structured, and structured documents into machine-readable data, which is the foundation of business process automation.

Industries and enterprise functions that rely heavily on documents, such as banks, schools, healthcare institutions, HR, and Finance & Accounting, can save tremendous amounts of time, effort, and investment using IDP.

IDP’s key benefits include:

Thousands of work hours saved per employee per year
Reduced error rates
Reduced operational and human resources costs
Faster document processing at scale
Standardization of processes over time
Happier employees, as they focus more on value-generating tasks

For instance, one of our clients, a leading automobile manufacturer, achieved 85% straight-through processing over a 12-week period across a volume of 150,000 invoices per month from 5,000 suppliers using our invoice processing HyperApp. The HyperApp, which has built-in intelligent document processing capabilities, helped their AP team cut the time needed to process one invoice from 24 hours to just 3 minutes. The solution helped automate 90% of their invoice processing.

IDP can drive these outsized benefits due to its key advantages over traditional document processing automation solutions.

Intelligent Document Processing vs. Automated Document Processing

IDP improves upon pure ML-based document processing solutions in four ways.

	Automated Document Processing	Intelligent Document Processing
Touchless rate	The ML component predicts the data for most fields, but some extractions still have to be done manually (e.g., data from tables nested inside tables).	IDP learns the data extraction rules from human inputs, and then makes automatic corrections over time.
OCR accuracy	OCR accuracy is low, as the system can convert domain-specific labels like “street”, but falters on dynamic values.	IDP uses both standard OCR and visual attention-based OCR to recognize all values in a document and extract data accurately.
Tech involvement	Data science team might have to pitch in to train the ML model for new document formats.	IDP typically has a GUI that allows business users to set up new document formats, templates, and workflows. No IT involvement.
Adaptive nature	A new ML model resets all earlier formats.	IDP framework ensures that each new model only improves the document extraction accuracy.

What is Document Processing Software? IDP Software Explained, with an Example

IDP software is an application that packages all the capabilities mentioned above (low-touch, GUI-based, AI-powered, and adaptive) into a single, business-user-friendly platform. For example, JIFFY.ai offers hybrid IDP software that can handle heterogeneous documents and data formats using both ML and rules-based processing, along with sophisticated OCR. Using our intelligent document processing software, you can:

Process a variety of documents, involving complex tables, tables with/without lines, multi-page documents, etc.
Extract data from various ID card formats, receipts, driver’s licenses, and other similar documents
Automatically extract and feed data to the destination applications, such as CRM, ERP, etc.
Easily define and train new ML models for unfamiliar document formats
Handle exceptions and automate document processing-related activities at scale, even when there are thousands of types of documents involved

The JIFFY.ai Approach to Document Processing: Efficient and Future-Ready

As companies continue to embrace and accelerate digital transformation, the efficiency gains offered by IDP will make it an enterprise staple and improve the employee experience by eliminating tedious, repetitive work.

JIFFY.ai adopts a hybrid approach to IDP so you can gain from AI’s predictive capabilities while learning from human inputs when exceptions arise. This places our document processing software ahead of most industry peers in terms of accuracy, scope, and user support. For example, JIFFY.ai extracts text inside complex tables with 15-20% higher accuracy than competitors.

With true touchless processing and usage-based SaaS pricing (you pay only for the volume of documents processed), JIFFY.ai’s Intelligent Document Processing solution helps to defragment data extraction from the myriad documents spread across the enterprise. This changes the paradigm of enterprise automation and accelerates innovation.

Learn more about the IDP core that powers our Invoice Processing HyperApp.
