H0W 1N73LL1G3NT C4PTUR3 W0RK5: P4RT 1 - 1M4G3 0P71M1Z4710N

How does intelligent capture work? I think you already know.


70 1NCR3453 7H3 5P33D 4ND 4CCUR4CY 0F D4T4 C4PTUR3, TH3 8U51N355 VV0RLD 15 1NCR3451NGLY L00K1NG 70 1NT3LL1G3NT C4P7UR3.

BU7 VVH47 15 17? 4ND H0VV D035 1T VV0RK?

1 7H1NK Y0U 4LR34DY KN0VV.

Context is key with intelligent capture

When you started reading this, it took a second. But your mind quickly picked up the patterns and started turning the characters back into words automatically, without you even thinking about it.

That’s what intelligent capture (also called ‘cognitive capture’) is all about. Like the human brain, Brainware intelligent capture leverages patterns and context within a page to recognize and interpret information. This increases the speed and accuracy of the automated data capture process — without the need to create templates or zones.
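To make the idea concrete, here is a toy sketch of how the scrambled opening lines could be decoded with a little context. It is only an illustration of pattern plus context, not a description of how Brainware works internally. The look-alike table, the tiny vocabulary and the function names are all assumptions made up for this example.

```python
from itertools import product

# Hypothetical look-alike table: each character maps to the letters it may stand for.
# "1" is ambiguous (I or L), so the pattern alone cannot settle it.
LOOKALIKES = {"0": "O", "1": "IL", "3": "E", "4": "A", "5": "S", "7": "T", "8": "B"}

# Tiny stand-in vocabulary that plays the role of "context" in this sketch.
VOCABULARY = {"HOW", "DOES", "IT", "WORK", "INTELLIGENT", "CAPTURE"}


def decode_token(token, vocabulary=VOCABULARY):
    """Return the first reading of the token that the vocabulary recognizes."""
    token = token.upper().replace("VV", "W")
    # For every character, list the letters it could represent.
    options = [LOOKALIKES.get(ch, ch) for ch in token]
    for letters in product(*options):
        candidate = "".join(letters)
        if candidate in vocabulary:
            return candidate
    # No known word matched: return the normalized token unchanged.
    return token


def decode(text):
    """Decode a whitespace-separated string one token at a time."""
    return " ".join(decode_token(token) for token in text.split())


print(decode("H0VV D035 1T VV0RK"))
```

The substitution table proposes several readings for an ambiguous token such as "1T", and the vocabulary decides which one makes sense. That is the role context plays in the paragraph above.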

Pattern recognition is just one of the intelligent capabilities built into Brainware that make it a unique and powerful capture tool. From classification to extraction and validation, Brainware leverages a neural network of intelligent algorithms that:

  1. Improve image quality
  2. Learn over time
  3. Understand imperfection

In this series we’ll explore each layer of intelligence, taking you inside the mind of Brainware.

Image Optimization — cleaning up the noise

The first layer is all about improving the quality of document images before any OCR, classification or extraction takes place. Many document types contain watermarks or other objects printed in the background that can obscure the text you need to extract. The harder a document is to read from the get-go, the less likely it is that an OCR engine will recognize the correct data.

Think about it this way. Do you remember going to meetings in person? In a conference room?

Think back to those times when you tried to use the whiteboard, but it was stained with a swirl of colors from the hastily wiped notes of previous meetings (or notes that were left on too long). You tried to write clearly over the mess, pressing that too-dry dry-erase marker as hard as you could, but to the people in the back of the room the board was just noise.

Easier to read

Now think of a document as an overused whiteboard. Brainware cleans up those noisy backgrounds to make the important text clearer and easier to read.

Using an advanced binarization algorithm, Brainware analyzes the pixel values within a document image to determine whether each pixel should be black or white. This preserves the data you need to read on the document, like the student ID and course numbers on a transcript. It also suppresses what you don’t need, like a university’s watermark.
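For readers who want to see what binarization looks like in practice, here is a minimal sketch. Brainware's own algorithm is not described in this post, so the example swaps in Otsu's method, a well-known textbook technique that picks one global threshold from the image's gray-level histogram. The sample image, its gray values and the function names are assumptions made up for illustration.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that best separates dark pixels from light ones."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    dark_count, dark_sum = 0, 0
    for level in range(256):
        dark_count += histogram[level]
        dark_sum += level * histogram[level]
        light_count = total - dark_count
        if dark_count == 0 or light_count == 0:
            continue  # one side is empty, so this level separates nothing
        dark_mean = dark_sum / dark_count
        light_mean = (total_sum - dark_sum) / light_count
        # Between-class variance, up to a constant factor.
        variance = dark_count * light_count * (dark_mean - light_mean) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(image):
    """Map every pixel at or below the threshold to black (0), the rest to white (255)."""
    flat = [value for row in image for value in row]
    threshold = otsu_threshold(flat)
    return [[0 if value <= threshold else 255 for value in row] for row in image]


# Made-up 4x6 grayscale patch: 20 = printed text, 180 = watermark, 240 = paper.
SAMPLE = [
    [240, 240, 180, 180, 240, 240],
    [240, 20, 20, 180, 20, 240],
    [240, 20, 180, 180, 20, 240],
    [240, 240, 180, 20, 240, 240],
]

for row in binarize(SAMPLE):
    print(row)
```

In this sample the dark text pixels stay black while the lighter watermark is pushed to white along with the paper. A single global threshold is the simplest possible choice. Unevenly lit or heavily patterned scans typically call for locally adaptive thresholds instead.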

Reading the wrong data from the beginning only leads to errors and extra steps down the road that require human correction and intervention. Also known as delays.

By starting the capture process with a higher-quality image, Brainware increases the accuracy, and thus the speed, of subsequent steps in the process, including document classification and data extraction. It’s a critical first step, yet it just scratches the surface of the cognitive capabilities within Brainware intelligent capture.

Stay tuned for our next blog post, where we’ll talk about automated learning.

Danielle Simer

Danielle Simer is a marketing portfolio manager at Hyland. Her mission is to share best practices and evangelize the power of enterprise content management (ECM) as a tool to automate...
