The keys to successful document capture automation

Organizations today are dealing with a growing influx of data. And much of that data is coming in from documents, with a surprising amount still entering organizations through the mail.

Our expectations – and our customers’ expectations – are also increasing in terms of the speed at which information flows through the organization.

An obvious response to this problem is to look for technology that can help capture this content from wherever it enters the organization, extract relevant data, and then push that information and content into the appropriate systems (and therefore into the hands of people who need it) as quickly as possible.

But before you invest in a capture technology solution, it is important to understand the key concepts of how the solution works and how success is best measured.

The stages of a document capture solution

1. Acquisition

First, it is important to understand where content is coming from. Decades ago, most information came in through physical mail and required scanning, but in 2019 we are seeing a plethora of avenues by which information enters an organization.

It is incredibly important to empower staff to capture data where and when information comes in, whether that is while they are in the field on a mobile device, through scanners at their desks, or directly from various digital applications they might be using, like email.

2. Classification

Before the software can extract the right information from incoming documents, or decide where a document needs to go next for processing, it must first identify the document type – which could require a variety of methods depending on the processes you have in place.
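One of the simplest classification methods is keyword matching on the text of the document. The sketch below is only an illustration of that idea, not how any particular capture product works; the document types, the keyword lists, and the classify function are all hypothetical.

```python
# Hypothetical keyword lists per document type. A real capture solution
# would likely combine several signals (layout, barcodes, trained models).
KEYWORDS = {
    "invoice": ["invoice", "amount due", "remit to"],
    "purchase_order": ["purchase order", "ship to"],
    "resume": ["experience", "education", "skills"],
}


def classify(text, min_hits=1):
    """Return the document type whose keywords appear most often in the text.

    Falls back to "unknown" when no type reaches min_hits, so the document
    can be routed to a person instead of being guessed at.
    """
    lowered = text.lower()
    scores = {
        doc_type: sum(1 for keyword in keywords if keyword in lowered)
        for doc_type, keywords in KEYWORDS.items()
    }
    best_type, best_score = max(scores.items(), key=lambda item: item[1])
    return best_type if best_score >= min_hits else "unknown"


if __name__ == "__main__":
    print(classify("INVOICE #123\nAmount due: $50"))
    print(classify("hello world"))
```

Returning "unknown" rather than the closest guess keeps a misfiled document from flowing silently into the wrong process.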

3. Data extraction

Once the document is identified, the software can then determine where on the page to look for specific information that needs to be pulled. Depending on the intelligence of the software, it either requires a predetermined template for each document type – a map of sorts for each document type to understand where information lives – or it can simply understand from context and content of the document, just like we do.
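To make the template approach concrete, here is a minimal sketch of zone-based extraction. It assumes the OCR step has already produced words with page positions; the Word class, the INVOICE_TEMPLATE zones, and the field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge as a fraction of page width (0.0 to 1.0)
    y: float  # top edge as a fraction of page height (0.0 to 1.0)


# Hypothetical template: field name -> (x_min, y_min, x_max, y_max) zone.
# This is the "map" that tells the software where each value lives.
INVOICE_TEMPLATE = {
    "invoice_number": (0.70, 0.05, 1.00, 0.10),
    "total": (0.70, 0.85, 1.00, 0.95),
}


def extract_fields(words, template):
    """Collect the words that fall inside each template zone."""
    fields = {}
    for name, (x_min, y_min, x_max, y_max) in template.items():
        in_zone = [
            w for w in words
            if x_min <= w.x <= x_max and y_min <= w.y <= y_max
        ]
        # Restore reading order: top to bottom, then left to right.
        in_zone.sort(key=lambda w: (w.y, w.x))
        fields[name] = " ".join(w.text for w in in_zone)
    return fields


if __name__ == "__main__":
    page = [
        Word("Acme", 0.10, 0.07),
        Word("INV-1001", 0.80, 0.07),
        Word("$1,250.00", 0.85, 0.90),
    ]
    print(extract_fields(page, INVOICE_TEMPLATE))
```

The weakness of this approach is visible in the template itself: if a vendor moves the total to a different corner of the page, the zone no longer matches, which is why solutions that work from context need fewer templates.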

Key term: optical character recognition (OCR).

If you have been researching document or data capture technology for some time, you are probably familiar with OCR – it is the engine under the covers of a capture solution that “reads” a piece of content and turns it into text. But it is important to understand that OCR can only pull individual characters and does not include the intelligence to understand the text it is reading – this is where each vendor and each capture solution come in and make the difference.

4. Data validation

Before this data is pushed into any system or used to make key business decisions, it is critically important that it be validated. Depending on the solution and the specific document, the data could be validated by rules built into the software.

For instance, a field such as a date or social security number may always follow a specific format that a rule can check, or the software can interact with other systems to compare extracted data against data that already exists, such as vendor data on an invoice.
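Both kinds of validation can be sketched in a few lines. The example below is an illustration under stated assumptions: the field names, the date format, and the KNOWN_VENDORS table (standing in for a lookup against another system) are all hypothetical.

```python
import re
from datetime import datetime

# Format rule: a social security number written as 123-45-6789.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")

# Hypothetical stand-in for vendor data held in another system.
KNOWN_VENDORS = {"V-1001": "Acme Supply", "V-1002": "Globex"}


def is_valid_date(value, fmt="%m/%d/%Y"):
    """Return True if the value parses as a real calendar date."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def validate(fields):
    """Return the names of fields that fail validation."""
    failed = []
    if not is_valid_date(fields.get("invoice_date", "")):
        failed.append("invoice_date")
    if not SSN_PATTERN.match(fields.get("ssn", "")):
        failed.append("ssn")
    if fields.get("vendor_id") not in KNOWN_VENDORS:
        failed.append("vendor_id")
    return failed


if __name__ == "__main__":
    extracted = {
        "invoice_date": "02/30/2019",  # not a real date
        "ssn": "123-45-6789",
        "vendor_id": "V-1001",
    }
    print(validate(extracted))
```

Fields that fail can be sent to a person for review while the rest of the document continues through the process.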

5. Data and document delivery

Now the data and the original document can be pushed to content or information management systems for further processing and broader access.

The key factors of success

Now that we’ve walked through the stages of a capture process, it is just as important to understand what truly makes a solution successful. You might be thinking success would only be possible if you implemented a software solution that could do all of this without any staff oversight 100 percent of the time.

Chances are, that is not realistic.

But before you give up your search for a document or data capture solution, let’s walk through some key factors of success. You can drastically reduce labor costs on this process and even improve customer and employee experience – as long as you implement the right solution.

1. Image quality

Think about how easy (or not!) it is to read faxes with ink bleeds and marks all over. Here is a rule of thumb: if a human cannot read a document, neither can technology – it is magical but not that magical.

Digitally created documents are generally the best quality, making it easier for the OCR engine to read the characters. For documents that need to be scanned, we recommend keeping your scanners clean to limit the number of marks that make it onto the document between receipt and capture processing. If you do receive documents through fax, look for a capture solution that can pull the image directly from the fax server instead of requiring you to scan it in – this can help immensely with image clarity.

2. OCR accuracy

The quality of the image impacts the ability of the OCR engine to read the characters on the page, but as discussed above, that only takes you so far in terms of data accuracy. Once characters have been extracted, it takes contextual understanding and additional intelligence to determine what data is actually present and if that data is correct. Again, this is where each vendor’s specific solution comes into play, so make sure to ask about the intelligence included in each solution beyond the OCR engine’s capabilities.

3. Impact of YOUR needs

The quality of the images, avenues of entry into your organization, and the types of documents and data you are processing all impact the kind of solution you need. You may think that if you just invest in the most expensive and extensive capture solution, your needs will be met every time. That is not always the case.

If you have smaller volumes of documents or only specific documents you are looking to process, you may be over-investing in a solution that will only bring so much return for such a simple problem.

For instance, if you are trying to process internal forms, before you look into the kind of intelligent data capture solution described above, think about turning paper or document-based forms into completely electronic forms. This could not only give an even better return on investment, but also greatly improve the experience for the customer, vendor, or employee completing the form.

Ready to learn more?

With all this in mind, the right solution may not be simple to find on your own. The search should instead include comprehensive conversations, both internally to understand the problem and with leading vendors to find the best fit.

Luckily, we have capture technology experts ready and willing to continue the capture conversation.

Over the last few years working at Hyland, creator of OnBase, Jaclyn has definitely started to drink the Kool Aid – day and night enthusiastically discussing the wonderful benefits of OnBase with fellow Hylanders, family, friends, and even complete strangers. Her graduation from the University of Rochester with a major in economics, minor in film studies and concentration in neurological science only goes to show how vast her interests are. With that in mind, it is no surprise she truly enjoys working to market OnBase across an equally vast number of industries – some even mirroring her academic interests (financial services, arts and entertainment and healthcare/sciences) – as a member of the Product Marketing team at Hyland.
Jaclyn Inglis Clark
