The intelligent data capture journey with Brainware

How intelligent capture works

When you first saw the jumbled mess of characters above, your brain probably lit up as you tried to make sense of it. But pretty quickly, your mind picked up the patterns and started translating it into the coherent thought it is.

That’s what Brainware Intelligent Capture, a Hyland intelligent capture platform, is all about: Getting a look at something, making sense of it, and bringing the data, contextually, into coherence.

Like the human brain, Brainware Intelligent Capture leverages patterns and context within a page to recognize and interpret information. This increases the speed and accuracy of the automated data capture process without the need to create templates or zones.

Pattern recognition is just one of the intelligent capabilities built into Brainware that make it a unique and powerful capture tool. From classification and extraction to validation, Brainware leverages a neural network of intelligent algorithms that:

  1. Improve image quality
  2. Learn over time
  3. Understand imperfection

Let’s explore each layer of the intelligent capture platform as we delve into the inner workings of Brainware.

Brainware in action: See how global powerhouse Siemens gained invoice efficiencies and reduced the need for manual intervention by 90%.

Start with an intelligently processed image

Brainware’s first layer of intelligence improves the quality of document images before any optical character recognition (OCR), classification or extraction even takes place. Why? Because many documents contain watermarks or other objects printed in the background that can obscure the text that you need to extract. If a document is difficult to read from the get-go, your OCR engine is less likely to be able to extract and leverage the correct data.

Think about it this way: Have you ever watched someone use a whiteboard that had the remnants of notes that were left on the board too long? The presenter can write clearly over the mess, but to viewers throughout the room, the board is just a chaotic string of words. Deriving value from it is too difficult with the amount of “noise.”

Eliminating noise to perfect the captured image

Imagine that same cluttered whiteboard.

If Brainware were to process it, the intelligent capture platform would clean up the noisy background and conflicting data to make the important text clearer and easier to read.

The quality of the image is critical to the intake of data, and it can impact every step of an intelligent automation project. Reading the wrong data from the beginning only leads to errors and extra steps down the road that require human correction and intervention — aka, bottlenecks.

By starting the capture process with a higher-quality image, Brainware increases the accuracy, and thus the speed, of subsequent steps in the process, including document classification and data extraction. It’s a critical first step, yet it only scratches the surface of the cognitive capabilities within Brainware Intelligent Capture.
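To make the idea of removing background noise concrete, here is a minimal sketch of one common cleanup technique, thresholding, which turns a grayscale image into pure black and white. It is an illustration only and is not Brainware’s actual image processing; the function name `binarize` and the midpoint threshold are assumptions made for this example.

```python
def binarize(gray, threshold=None):
    """Turn a grayscale grid (0 = black, 255 = white) into pure black and white."""
    pixels = [p for row in gray for p in row]
    if threshold is None:
        # Assumed default for this sketch: the midpoint between the darkest
        # and the lightest pixel on the page.
        threshold = (min(pixels) + max(pixels)) / 2
    # Pixels darker than the threshold are kept as text; the rest become white.
    return [[0 if p < threshold else 255 for p in row] for row in gray]


# Dark text strokes (value 20) printed over a faint gray watermark (value 200).
page = [
    [255, 200, 200, 255],
    [20, 20, 200, 255],
    [255, 20, 200, 255],
]

clean = binarize(page)
for row in clean:
    print(row)
```

In this toy page the light gray watermark is pushed to white while the dark strokes stay black, which is the kind of result that gives an OCR engine a cleaner image to read.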

Continually learning like the human brain

By adapting to variation and learning over time, Brainware Intelligent Capture generates increasingly clean and accurate data to fuel intelligent automation.

First-generation intelligent capture software — mainly associated with scanners, multi-function printers, and optical character recognition — has been around for decades. Generally, these tools facilitate the transition of physical documents to fixed images from which OCR software then reads and extracts text.

The problem with these 1.0 intelligent capture platforms and OCR is that they need very specific conditions to be effective at reading text, and they can only read machine-printed text, sometimes only in specific fonts. To find the right text to extract and appropriately index a document, users have to create a specific template for each document type or identify zones on the page where the OCR should pick up the text.

In other words, more work for your busy team, with little room for variation.

Using machine learning to evolve intelligent capture

Brainware Intelligent Capture is powered by intelligent algorithms and machine learning (ML), a type of AI that uses patterns and inference to complete tasks, analyzes the results after each execution, and measures the algorithms’ accuracy so it can make improvements automatically. In practice, Brainware learns from the documents it reads and processes to improve the accuracy of classification and extraction over time, so the results of your intelligent automation project become more accurate as more data is fed into it.

Based on the data pulled from documents, intelligent capture tools can even trigger downstream tasks.

In the case of accounts payable, for example, Brainware can automatically flag a staff member to create a new vendor record or post final transaction data to an ERP.

Put simply, intelligent capture is the starting point for generating clean and accurate data to fuel your intelligent automation project.

Putting the power of recognition into your processes

As humans, we use patterns to make sense of our world nearly every day. It’s the basis of our learning: just watch a child learn to recognize and react to objects. It’s in our nature to identify the patterns around us, then associate those patterns with objects or concepts, and respond accordingly based on what we’ve learned from experience.

Brainware approaches intelligent data capture in a similar fashion — through pattern recognition.

When it looks at a document, it doesn’t look for data in specific positions or zones, as a template-based solution would. Rather, it considers all the data on a given document and looks for patterns of information on the page as well as relationships between words.

From the patterns that emerge, Brainware learns what uniquely classifies a document type and which data it should extract. By seeing where clusters of tabular data are, it can identify whether a document is an invoice vs. a transcript, for example, and then focus on those tables for extraction.

After reviewing just a small set of documents, Brainware understands what patterns classify different document types and then applies that knowledge to classify new documents going forward.
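As a rough illustration of learning document types from a small set of examples, the sketch below builds a word-frequency profile for each type and assigns a new document to the most similar profile. This is a generic bag-of-words approach, not Brainware’s algorithm; the names `profile`, `cosine` and `classify`, and the sample texts, are all hypothetical.

```python
import math
from collections import Counter


def profile(texts):
    """Count how often each word appears across the example documents."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def cosine(a, b):
    """Cosine similarity between two word-count profiles."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def classify(text, profiles):
    """Return the document type whose profile best matches the new text."""
    words = Counter(text.lower().split())
    return max(profiles, key=lambda label: cosine(words, profiles[label]))


# A small, made-up training set: two examples per document type.
examples = {
    "invoice": ["invoice number total amount due", "invoice date amount due vendor"],
    "transcript": ["student course grade credits", "course grade semester gpa"],
}
profiles = {label: profile(texts) for label, texts in examples.items()}

label = classify("amount due on invoice 1042", profiles)
print(label)
```

Nothing in the sketch depends on where a word sits on the page. Only the pattern of words matters, which is the contrast with template-based and zone-based capture.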

Supervised machine learning

Hyland also recently began testing its new Automated Learning Engine (ALE) to further enhance Brainware with supervised machine learning capabilities.

What is supervised machine learning?

In the simplest terms, supervised machine learning is when the machine or software learns what to do based on example inputs and outputs provided by humans. When the machine receives new information, it refers to the training data that has been provided and infers a response.

An early release test version of the ALE is available in the latest release of Brainware for Invoices. The ALE extracts header-level data from invoices and learns from any corrections end users make during the verification process. As users review, verify or correct extracted data, those corrections are applied to a training set from which the ALE learns to extract the correct data from future invoices.
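The feedback loop described above can be sketched in a few lines. The example below is a simplified stand-in, not the ALE itself: a hypothetical `CorrectionLearner` class stores each user correction as a training example and uses the closest stored example, measured by word overlap, to predict the field for new input. The field names are assumptions for illustration.

```python
class CorrectionLearner:
    """Toy supervised learner that improves as users correct its output."""

    def __init__(self):
        # Each entry pairs the words of a label seen on an invoice with the
        # field a user confirmed for it.
        self.training_set = []

    def learn(self, label_text, field):
        """Add a user-verified example to the training set."""
        self.training_set.append((set(label_text.lower().split()), field))

    def predict(self, label_text):
        """Infer a field from the most similar training example, if any."""
        words = set(label_text.lower().split())
        best_field, best_score = None, 0.0
        for known, field in self.training_set:
            union = words | known
            # Jaccard overlap: shared words divided by all words.
            score = len(words & known) / len(union) if union else 0.0
            if score > best_score:
                best_field, best_score = field, score
        return best_field


learner = CorrectionLearner()
before = learner.predict("Inv No")  # no training data yet, so no answer

# A user corrects two fields during verification.
learner.learn("Inv No", "invoice_number")
learner.learn("Total Due", "total_amount")

after = learner.predict("inv no")
print(before, after)
```

The learner cannot answer before any corrections exist, and after two corrections it maps the same label to the right field. A production system would use far richer features than word overlap.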

It’s this combination of pattern recognition and machine learning that makes Brainware Intelligent Capture so powerful. In addition to expediting and continuously improving the accuracy of the capture process, the advanced learning capabilities allow users to extract data from all kinds of information sources and handle real-world variation in a way that previous generations of capture tools simply can’t.

Intelligent capture that understands imperfection


Brainware leverages a neural network of 13 different engines and algorithms that work like a human brain. It learns conceptually and responds with flexibility when classifying and extracting data, without memorization or templates.

This flexibility allows Brainware to cope with variance across documents, whether by cleaning up an image from the start or by understanding imperfections within the text to be read and extracted.

Nobody’s perfect, and neither are documents

As documents come in and data is extracted, many capture tools (including Brainware) can validate that information against other systems of record to ensure that first, it’s extracted correctly, and second, the right information is sent downstream for processing.

But what happens when there’s a typo or inconsistency? What happens when the vendor address, for example, says “Mian St” instead of “Main Street”?

Using a capability called fault-tolerant search, Brainware understands when information is misspelled or slightly different, and it can still make matches to core systems and master data. It does this by searching for pieces of words known as trigrams, which are often used in natural language processing.

For example: The trigrams for the word “Hyland” are “Hyl,” “yla,” “lan” and “and.” Brainware applies an algorithm to these trigrams to create a search pool of the known data. As Brainware performs its searches and comparisons against the master data in the core system, it uses the trigrams — not entire words — to identify matches.

Because it’s only looking for pieces of each word and not the word itself, Brainware increases the chances of finding the right match. It’s this human-like deduction that allows Brainware to search using large sections of the document’s OCR text, validate extracted data or even provide additional content not found on the document (e.g., a patient ID), despite abbreviations, OCR errors, typos or misspellings in the document’s content.
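Trigram matching is easy to try for yourself. The sketch below splits strings into trigrams and scores candidates by the share of trigrams they have in common (Jaccard similarity). It illustrates the general technique only; how Brainware builds and scores its search pool is not described here, and the function names and sample addresses are assumptions.

```python
def trigrams(text):
    """Return the set of three-character pieces in a string.

    The text is lowercased first (a choice made for this sketch), so
    "Hyland" yields "hyl", "yla", "lan" and "and".
    """
    s = text.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}


def similarity(a, b):
    """Share of trigrams two strings have in common (Jaccard similarity)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_match(query, master_data):
    """Pick the master-data record that shares the most trigrams with the query."""
    return max(master_data, key=lambda record: similarity(query, record))


# Made-up master data and an address extracted with a typo and an abbreviation.
master_data = ["12 Main Street", "12 Maple Street", "40 Mill Road"]
match = best_match("12 Mian St", master_data)
print(match)
```

Even though “Mian St” matches no record exactly, it still shares more trigrams with “12 Main Street” than with the other entries, so the typo and the abbreviation do not prevent the match.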

Brainware’s ability to overcome document imperfections reduces the need for human intervention and validation, while speeding up the information capture process without sacrificing accuracy.

The brains behind Brainware: Neuroscientists

The engines and algorithms powering Brainware come from a dedicated team of scientists with backgrounds in neuroscience, physics and engineering. These experts leveraged their knowledge of the human brain and sensory processing, as well as close to 20 years of software development experience, to develop and continually enhance the intelligent layers within Brainware.

Combined, these layers of image optimization, learning through pattern recognition and the ability to understand imperfection deliver truly human-like intelligence to this intelligent capture software.


Danielle Simer is a marketing portfolio manager at Hyland. Her mission is to share best practices and evangelize the power of enterprise content management (ECM) as a tool to automate paper-based processes and improve operations across accounting and finance, human resources, and contract management. Danielle joined Hyland after more than six years with a research and advisory firm devoted to helping senior executives manage their departments and teams more effectively. She received her bachelor’s degree from The Ohio State University and her MBA from Georgetown University’s McDonough School of Business.