Blackburn Group, Inc.

Image Management Process Description
 

Document Imaging

Document imaging has been well accepted as a way of reducing business costs and automating routine clerical tasks. Today, new technologies are reducing the cost of imaging and making it affordable for most organizations.

The trend in document imaging is moving away from proprietary turnkey systems. It is now common to base sophisticated production-imaging systems on low-cost networks of Windows® PCs and standard server operating systems. Readily available applications and databases are dramatically lowering the price of storing indexed information for multi-million file installations.

These trends are allowing large-scale production imaging systems to be built out of low-cost software components from multiple vendors. Blackburn Group, Inc. will be utilizing this approach to integrate all production imaging systems.

Component Imaging

Modern PC-based imaging systems are increasingly being built out of reusable software components. In this component imaging model, production imaging systems are built out of five components.

Component: Function

Capture: Converts paper documents into images. Generally includes batch preparation, scanning, image processing, QA, indexing, rescan, and export of images and indexes to long term storage.

Storage: Permanently stores images on optical drives and jukeboxes. Includes platter management, volume management, and hierarchical storage management (HSM).

Retrieval/View: Allows document images to be retrieved, via keyword searches (using the indexes stored during capture) or full-text indexing, then displayed and annotated.

Document Management: Provides centralized management and administration of large volumes of documents. Typically provides a file cabinet and file folder metaphor for retrieving documents.

Workflow: Provides automation of routine work processes, usually by automatic routing of images as a replacement for manual routing of paper.

Component imaging offers three primary benefits over turnkey systems:

Flexibility

Using components allows us to customize the risk management information system more easily than a turnkey system would. The result is a system that integrates document capture with our RiskPro application.

Low cost

Competition and higher volumes are driving down component pricing. In addition, integration of component systems by Blackburn Group, Inc. is less costly than using an imaging vendor’s professional services group.

Ability to Upgrade

Component systems are easier to upgrade than large turnkey systems. For example, upgrading an image viewer is relatively straightforward. Upgrading an entire turnkey imaging system, by contrast, is expensive and disruptive, especially when only one component of the system actually needs to change.

Document Capture

Document capture is the process of converting paper documents into digital images and index data. Images are typically stored as TIFF files on an optical storage system and indexes are stored in a relational database. 

The cost of document capture falls into three areas:

Labor

Labor is the greatest cost. Estimates are that document capture accounts for up to 80% of the ongoing cost of a production imaging system.

Capital equipment

Capital equipment is the cost of the scanners and the scan stations they are connected to. Prices vary depending upon the scanning speed and features of the systems.  

Integration costs

Integration costs consist of integrating the capture software with the rest of the system. This cost varies with system requirements, and can be quite high if the capture software is restricted to proprietary database and design tools.

A well-implemented capture system can reduce the operating costs of an imaging system by 20-40% or more. Outlined below are some of the prime areas for cost reduction.

Scanning

The scanning process converts paper documents into electronic files. The primary cost is the capital cost of the scanners and the ongoing labor cost of operators to run the scanners. Page-by-page scanning is very slow and prevents scanners from running at their rated speed. By implementing batch scanning, the number of scanners and scanner operators required is minimized.

Quality Assurance (QA)

The Quality Assurance process examines documents to make sure they are scanned correctly and are easily readable. The primary cost of this process is the ongoing labor cost of operators to examine images. Cost saving methods include the use of built-in batch integrity checks to guard against improper scanning, and the use of image processing software that can correct errors such as skew and orientation.

Rescan

The rescan process sends badly scanned documents back to the scanning stage to be rescanned. Rescan is required for all documents that do not pass the QA inspection. The primary cost of this process is the labor associated with scanner operators who must rescan entire batches of documents. Batches must then be resent to QA and indexing. Cost saving methods include the use of capture software which keeps track of rejected pages and allows the rescan of single pages within a batch. Good design will allow pages to be automatically inserted back into batches in the proper order.

Indexing

Indexing assigns keywords to all documents so they can easily be retrieved. This operation is the most critical part of document capture. The primary cost of this process is the labor cost of operators to key in index information. A typical production capture operation employs 2-4 indexers for every scan operator. Cost saving methods include the use of bar codes to automate indexing, and optical character recognition to reduce hand keying. For manual indexing, input screens are designed so that data entry keeps pace with professional keyboard operators. To ensure accuracy, validation rules are applied to each index field.
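As an illustration of per-field validation rules, the following is a minimal Python sketch. The field names (claim_number, loss_date, claimant_name) and their formats are hypothetical examples, not the actual RiskPro index schema.

```python
import re
from typing import Callable, Dict, List

Rule = Callable[[str], bool]

# Hypothetical index fields and rules; a real document class defines its own.
RULES: Dict[str, List[Rule]] = {
    "claim_number": [lambda v: re.fullmatch(r"\d{8}", v) is not None],
    "loss_date": [lambda v: re.fullmatch(r"\d{4}-\d{2}-\d{2}", v) is not None],
    "claimant_name": [lambda v: 0 < len(v.strip()) <= 60],
}


def validate_index(fields: Dict[str, str]) -> List[str]:
    """Return the names of fields that are missing or fail any rule."""
    failed = []
    for name, rules in RULES.items():
        value = fields.get(name, "")  # a missing field is checked as empty
        if not all(rule(value) for rule in rules):
            failed.append(name)
    return failed
```

An indexing screen could call validate_index when the operator submits a form and return the cursor to the first failed field, so errors are caught before the batch moves on.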

Image Management

Image management is the process of exporting images to long term storage and indexes to a permanent database. Images are stored in an optical jukebox and indexes reside in RiskPro. Cost saving methods include the use of capture software that supports exporting documents to standard optical storage systems.

Scanning and Import Operations

The basic Scan and Import operations include:

Batch preparation

Before scanning, an operator manually prepares a batch of paper documents, adds document separator sheets if required, counts the number of pages in the batch, and loads the batch into the scanner input tray.

Batch creation

The operator at the scan workstation creates the batch by entering the appropriate information into the system, including the batch name, document class, and various page counts used for error checking. The page counts are used to detect when an incorrect number of pages has been scanned and, if document separators are not used, to define the beginning and end of each document in the batch.
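The two uses of page counts described above can be sketched in Python as follows. The BatchHeader fields are hypothetical, and the sketch assumes that, when no separator sheets are used, every document in the batch has the same fixed number of pages.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BatchHeader:
    """Information the operator enters when creating a batch (hypothetical fields)."""
    name: str
    document_class: str
    expected_pages: int
    pages_per_document: Optional[int] = None  # only needed without separator sheets


def check_page_count(header: BatchHeader, scanned_pages: int) -> bool:
    """Error check: did the scanner produce the number of pages the operator counted?"""
    return scanned_pages == header.expected_pages


def split_fixed(header: BatchHeader, pages: List[str]) -> List[List[str]]:
    """Define document boundaries from the page count alone (no separator sheets)."""
    size = header.pages_per_document
    if size is None or size < 1:
        raise ValueError("pages_per_document must be set to split without separators")
    return [pages[i:i + size] for i in range(0, len(pages), size)]
```

A mismatch reported by check_page_count usually points to a double feed or a skipped page, which is cheaper to catch at the scan station than at QA.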

Scan

To start the scan, the operator clicks the Start button in the capture software. To start an import operation, the operator selects the image filename(s) from a list box and then clicks the Start button. Both scanned and imported images can be contained in the same document or batch. The Scan module displays the scanned and/or imported images in the view window, maintains and displays the document and page counts, and stores the images in a temporary working directory. The Scan module also processes any bar codes and patch codes on the scanned images. Bar codes are used to automatically fill in index fields associated with the document. Patch codes are used on document separator sheets to indicate the beginning of a document.
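The roles of patch codes and bar codes can be sketched as follows. This is a simplified Python illustration, not the capture software's actual logic: it assumes patch code and bar code detection has already been done for each page, and the index field name claim_number is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Page:
    image_file: str
    patch_code: bool = False        # True for a document separator sheet
    bar_code: Optional[str] = None  # decoded bar code value, if one was found


@dataclass
class Document:
    pages: List[str] = field(default_factory=list)
    index: Dict[str, str] = field(default_factory=dict)


def assemble_documents(pages: List[Page],
                       bar_code_field: str = "claim_number") -> List[Document]:
    """Group scanned pages into documents and pre-fill an index field."""
    documents: List[Document] = []
    current: Optional[Document] = None
    for page in pages:
        # A separator sheet (or the very first page) starts a new document.
        if page.patch_code or current is None:
            current = Document()
            documents.append(current)
        # Separator sheets themselves are not stored as document pages.
        if not page.patch_code:
            current.pages.append(page.image_file)
        # The first bar code seen in a document fills the index field.
        if page.bar_code is not None:
            current.index.setdefault(bar_code_field, page.bar_code)
    # Drop empty documents, for example from two separator sheets in a row.
    return [doc for doc in documents if doc.pages]
```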

Batch close

After the entire batch is successfully scanned or imported, the scan operator closes the batch. This automatically sends the batch to the next queue specified in the setup.
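Queue routing of this kind can be pictured as an ordered list of queues defined in the setup. The Python sketch below is illustrative only; the queue names and their order are assumptions based on the stages described in this document.

```python
from typing import Dict, List, Optional

# Hypothetical queue order; an actual setup defines its own route.
ROUTE: List[str] = ["scan", "image_cleanup", "index", "index_verify", "transfer"]


def next_queue(current: str, route: List[str] = ROUTE) -> Optional[str]:
    """Return the queue that follows `current`, or None at the end of the route."""
    position = route.index(current)  # raises ValueError for an unknown queue
    return route[position + 1] if position + 1 < len(route) else None


def close_batch(batch: Dict[str, str]) -> Dict[str, str]:
    """Close a batch in its current queue and send it to the next one."""
    following = next_queue(batch["queue"])
    return {**batch, "queue": following if following is not None else "done"}
```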

Image Cleanup and Optical Character Recognition (OCR)

If a document has well defined fields (for example, a form), it is possible to speed indexing by using OCR to read zones on the document and automatically convert them into indexes. If OCR indexing is specified for a document type, capture software allows you to specify zones in the document and associate each zone with an index field. After the zones have been recognized, the documents can be sent to an indexing station for verification or can be sent straight to the next stage of processing.

The scanning process also supports full-text indexing. This process performs OCR on the entire document and produces an ASCII file of the output. The output can also be stored in a variety of word processing formats, including Microsoft Word and WordPerfect.

Image Cleanup

As a rule of thumb, OCR is useful only on clean, sharp images where the OCR accuracy is 95% or higher. If the OCR accuracy is less than 95%, the cost of checking and correcting errors is frequently higher than the cost of manually keying the index data.
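One way to apply this rule of thumb is to measure OCR accuracy on a small hand-verified sample and choose the indexing method from the result. The Python sketch below is an assumption about how such a check could be done; the function names are hypothetical, and the accuracy measure (matched characters divided by the longer text length) is one of several reasonable definitions.

```python
from difflib import SequenceMatcher


def character_accuracy(ocr_text: str, verified_text: str) -> float:
    """Share of characters that OCR got right, measured against hand-verified text."""
    if not verified_text:
        raise ValueError("verified_text must not be empty")
    matcher = SequenceMatcher(None, ocr_text, verified_text, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / max(len(ocr_text), len(verified_text))


def choose_indexing_method(accuracy: float, threshold: float = 0.95) -> str:
    """Apply the 95% rule of thumb: below the threshold, key the data manually."""
    return "ocr" if accuracy >= threshold else "manual"
```

For example, a sample that scores 0.97 would be routed to OCR indexing with operator verification, while one that scores 0.90 would go to manual keying.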

There are several techniques that can make images more readable and increase OCR accuracy. The most effective ones include:

Deskewing

This technique straightens pages that have been scanned slightly crooked due to mechanical tolerances in the scanner’s document feeder. Deskewing can increase the accuracy of OCR by 5-10% or more, which can make the difference between expensive manual indexing and automated OCR indexing.

Deshading

OCR engines are unable to read words against the gray shaded backgrounds that are common on forms. Removing shading allows you to OCR zones that are otherwise unreadable.

Despeckling and Streak Removal

These techniques remove small speckles and streaks caused by dirt in the scanner feeder or scanner noise.

Line removal

On typewritten forms, words are frequently typed so that they cross over the lines on the form, which makes them unreadable to OCR. Line removal erases the lines on the image and then reconstructs the characters so they can be recognized.

Edge enhancement

The edge enhancement function includes a set of filters that sharpen the edges of characters. The results are usually invisible to the eye, but they can increase the accuracy of OCR by as much as 5-10%.
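To give a sense of how one of these techniques works, the following Python sketch shows a simple form of despeckling on a black-and-white image. It is not the algorithm used by any particular product: it assumes the image is a grid of 0 (white) and 1 (black) values and removes any connected group of black pixels at or below a size limit, which is an assumed parameter.

```python
from typing import List

Image = List[List[int]]  # 1 = black pixel, 0 = white pixel


def despeckle(image: Image, max_speckle_size: int = 2) -> Image:
    """Return a copy of the image with small isolated black specks removed."""
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Collect every black pixel connected to this one (8 neighbors).
            component = []
            stack = [(y, x)]
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                component.append((cy, cx))
                for ny in range(max(0, cy - 1), min(height, cy + 2)):
                    for nx in range(max(0, cx - 1), min(width, cx + 2)):
                        if image[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            # Groups at or below the size limit are treated as noise.
            if len(component) <= max_speckle_size:
                for cy, cx in component:
                    result[cy][cx] = 0
    return result
```

The size limit has to be chosen with care: set too high, it would also erase periods and the dots on letters such as "i".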

Index/Index Verify

The Index module is used to enter index data and associate it with an imaged document. The index data is then stored in the RiskPro database and can be used at a later time to retrieve the document image from its permanent storage location. Indexing is the most critical and labor-intensive step in the document capture process, with a typical capture operation sometimes requiring as many as four index stations for each scanner. The index data is the key to retrieving the document. Noted below are several methods used by Blackburn Group, Inc. to reduce operator errors and speed the indexing process:

OCR can be used to fill index fields. This allows the index operator to simply check the accuracy of the OCR field rather than manually typing the required data on the indexing form.

Bar code recognition can be used for indexing documents. Bar codes are processed by the Scan module, and the data is used to fill user-specified fields on the indexing form. Document capture software supports most popular bar code types, including Code 39 and Interleaved 2 of 5.

Custom validation scripts can be configured to fill fields on the indexing form with default values.

QA/Rescan

No scanner is perfect, and rescanning is an integral part of the process. Index operators can easily tag documents or individual pages for rescan, attaching electronic notes that tell the scanner operator exactly what the problem is. The batch is then queued to a rescan workstation where the operator is prompted for the specific pages or documents to be rescanned. Document capture software automatically inserts rescanned pages in the appropriate position within the batch.

The following are typical reasons why documents are rejected and the batch is sent to the scan queue for rescanning:

Poorly scanned page (too light, too dark).

Missing page.

Missing document.

Skewed image.

Illegible bar code or patch code.

The operator who detected the problem at the Index or Index Verify queue may attach electronic notes explaining the problem to the rejected document. Before rescanning, the rescan operator can open the Note Viewer via the View menu and read the attached notes.
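The tagging and reinsertion steps can be sketched in Python as follows. The class and method names are hypothetical, and the sketch covers only the replacement of a rejected page in its original position; inserting a page that was missing altogether would need additional handling.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class RejectReason(Enum):
    POOR_SCAN = "poorly scanned page"
    MISSING_PAGE = "missing page"
    MISSING_DOCUMENT = "missing document"
    SKEWED = "skewed image"
    ILLEGIBLE_CODE = "illegible bar code or patch code"


@dataclass
class RescanRequest:
    position: int         # zero-based page position within the batch
    reason: RejectReason
    note: str = ""        # electronic note for the rescan operator


@dataclass
class Batch:
    pages: List[str]
    rescan_requests: List[RescanRequest] = field(default_factory=list)

    def tag_for_rescan(self, position: int, reason: RejectReason, note: str = "") -> None:
        """Called at the Index or Index Verify queue to reject a page."""
        self.rescan_requests.append(RescanRequest(position, reason, note))

    def apply_rescans(self, new_images: Dict[int, str]) -> None:
        """Put each rescanned image back in its original position in the batch."""
        remaining = []
        for request in self.rescan_requests:
            if request.position in new_images:
                self.pages[request.position] = new_images[request.position]
            else:
                remaining.append(request)  # still waiting for a rescan
        self.rescan_requests = remaining
```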

Image Management

The final stage in the capture process is to transfer each document in the batch either to long term storage or to a workflow system. In the transfer process, the image files are written to permanent storage and the indexes are written to RiskPro or a document manager.
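As a rough illustration of the transfer step, the Python sketch below copies image files to a storage directory and writes index data to a relational database. Everything specific here is an assumption: SQLite stands in for the RiskPro database, a plain directory stands in for the optical storage system, and the table layout and claim_number field are hypothetical.

```python
import shutil
import sqlite3
from pathlib import Path
from typing import Dict, List


def transfer_document(image_files: List[Path], index: Dict[str, str],
                      storage_dir: Path, db: sqlite3.Connection) -> int:
    """Write one document's images to storage and its index to the database."""
    db.execute("CREATE TABLE IF NOT EXISTS documents "
               "(id INTEGER PRIMARY KEY, claim_number TEXT)")
    db.execute("CREATE TABLE IF NOT EXISTS pages "
               "(document_id INTEGER, page_no INTEGER, path TEXT)")
    cursor = db.execute("INSERT INTO documents (claim_number) VALUES (?)",
                        (index["claim_number"],))
    document_id = cursor.lastrowid
    target_dir = storage_dir / str(document_id)
    target_dir.mkdir(parents=True, exist_ok=True)
    for page_no, source in enumerate(image_files, start=1):
        # Pages are renamed by position so page order survives the transfer.
        target = target_dir / f"{page_no:04d}{source.suffix}"
        shutil.copy2(source, target)
        db.execute("INSERT INTO pages VALUES (?, ?, ?)",
                   (document_id, page_no, str(target)))
    # Commit only after every image has been copied successfully.
    db.commit()
    return document_id
```

Storing the image path alongside the index record is what later allows a retrieval application to find the image from a keyword search.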

 Please call us for a free consultation from a distributor in your area regarding a specific risk management solution for your business.

Blackburn Group, Inc.
Penfield, NY 14526-0052
(585) 586-4530,  (585) 586-7479 fax,  Email: sales@blackburngroup.com
  

RiskPro is a registered trademark of Blackburn Group, Inc. 

Copyright 2006 Blackburn Group, Inc. All rights reserved.