Where PrimeOCR Is Being Used
Prime Recognition products are being used for a variety of high-volume paper-to-digital conversions.
High accuracy OCR
By using the PrimeOCR Job Server, customers convert thousands of images/day through PrimeOCR to generate high accuracy text results.
Clients feed the high accuracy text results into search and retrieval systems so users can search for keywords within the OCR text and gain access to the original scanned document.
A typical customer may depend on 99.9% accuracy directly from the OCR software, while other customers expect 99.998% accuracy from the OCR software coupled with post-OCR manual verification.
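To make the difference between those two accuracy figures concrete, the short sketch below converts a character accuracy percentage into an expected number of misrecognized characters. The page size of 2,000 characters is an assumption chosen only for illustration; real pages vary widely.

```python
def expected_errors(char_count: int, accuracy_percent: float) -> float:
    """Expected number of misrecognized characters at a given character accuracy."""
    return char_count * (1.0 - accuracy_percent / 100.0)


# Assumed page size, for illustration only.
CHARS_PER_PAGE = 2000

if __name__ == "__main__":
    for accuracy in (99.9, 99.998):
        per_page = expected_errors(CHARS_PER_PAGE, accuracy)
        print(f"{accuracy}% accuracy: about {per_page:.2f} errors per page")
```

Under that assumption, 99.9% accuracy corresponds to roughly two errors on every page, while 99.998% corresponds to roughly one error every 25 pages.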
Image conversion to PDF image plus hidden text
PrimeOCR’s ability to generate PDF image plus hidden text output provides a file format where the original scanned image and the OCR text results are in the same file for later search and retrieval. OCR PDF results can be saved with optimization enabled, which reduces the bandwidth needed to view large files over the network. Color images are also included in the conversion to PDF.
Customers use PrimeOCR to convert to PDF because of the high accuracy character results and the ability of PrimeOCR to OCR the entire page, extracting all of the characters on the page. Other products characterize portions of the page as image zones and provide marginal text results during the conversion process.
High volume operations struggle with software reliability, which affects their ability to finish projects on time and within budget. Software often cannot run for long periods without an operator monitoring its progress and restarting the software or rebooting the PC because the OCR software has corrupted memory. An operator may set up a job to run all night only to find the next morning that the software converted only a small number of images in the batch. PrimeOCR is designed around a fault-tolerant architecture so that if a failure does occur within the software, the software catches the error, fixes the discrepancy, and continues with the batch conversion. Because PrimeOCR contains multiple OCR engines, it can still generate OCR results even if one of the engines fails to process a particular image.
When looking to convert 3 million images in 3 months, a customer found that traditional desktop OCR software would crash up to 3 times a day when generating PDF output. Once they tested PrimeOCR they were able to process more images within the same amount of time as they did with the desktop software but without interruption and user intervention.
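The general idea of containing a failure and carrying on with the batch can be sketched as follows. This is a minimal illustration, not a description of PrimeOCR’s internals: the `Engine` type and the functions `recognize_with_fallback` and `run_batch` are hypothetical names introduced here, and each engine is assumed to be a callable that raises an exception when it cannot process an image.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical engine type: takes an image path, returns recognized text,
# and raises an exception on failure. Not PrimeOCR's real interface.
Engine = Callable[[str], str]


def recognize_with_fallback(image_path: str, engines: Iterable[Engine]) -> Optional[str]:
    """Return text from the first engine that succeeds, or None if all fail."""
    for engine in engines:
        try:
            return engine(image_path)
        except Exception:
            # Contain the failure and move on to the next engine.
            continue
    return None


def run_batch(image_paths: Iterable[str],
              engines: Iterable[Engine]) -> Tuple[Dict[str, str], List[str]]:
    """Process every image; a failure on one image never stops the batch."""
    engine_list = list(engines)
    results: Dict[str, str] = {}
    failed: List[str] = []
    for path in image_paths:
        text = recognize_with_fallback(path, engine_list)
        if text is None:
            failed.append(path)  # record for later review, keep going
        else:
            results[path] = text
    return results, failed
```

Recording failed images separately, rather than aborting, is what lets an unattended overnight run finish the rest of the batch and leave only the problem images for an operator to review.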
Accessible OCR PDF Output
While providing high accuracy output, PrimeOCR can include tags in the OCR PDF output that enhance the accessibility of the PDF content.
A large service bureau that services a number of government agencies uses PrimeOCR’s accessible PDF output to provide PDF files that meet the customer’s Section 508 accessibility standards. Coupled with PrimeView, PrimeOCR can provide the reading order, correct paragraph identification, and table identification, along with the ability to identify and insert alternate text for images. The searchable PDF output text can even be viewed in reflow mode, making the content of PDF image plus hidden text files available to all industry PDF readers, including those on mobile platforms.
Several of the industry leaders in resume processing software use PrimeOCR to generate high accuracy results. Some customers use the text results straight from PrimeOCR while others choose to manually verify OCR results with PrimeVerify for maximum accuracy.
One of the largest resume processing facilities leverages PrimeOCR’s increased accuracy by providing recruiting customers the same accuracy of results without having to manually verify each resume. They take the results straight from PrimeOCR and deliver them to the customer, passing on the savings of processing large batches of resumes. What used to take days to send offshore for OCR and manual verification can now be done overnight in a local facility, all with PrimeOCR software.
Library Archives/Digital Library
Digital library initiatives are adopting advanced OCR technology like PrimeOCR to convert large book collections for online viewing of content. Not only is PrimeOCR designed to generate accurate results but it can also provide a level of reliability that cannot be found in traditional desktop OCR software.
A large university’s project of converting large collections and providing the content online was improved by PrimeOCR’s ability to provide high accuracy results. The results were so impressive that all of the material that had been previously processed was run through PrimeOCR a second time to improve the ability to find textual information in the collection.
PDF image only conversion
Both PrimeZone and PrimeOCR provide the ability to convert scanned images to the PDF image only file format. PrimeZone converts images a batch at a time while the PrimeOCR Job Server can queue images through PrimeOCR for high-speed PDF image only conversion.
Mandated by federal guidelines to ensure all documentation is provided in PDF format, a client converts all scanned documents into PDF image only so any user of their site can view and print documents in their collection. Search is done on ASCII results generated by PrimeOCR, but image storage and display are provided with PDF image only files.
An added option of PrimeOCR allows the software to accurately identify different types of documents. Using high accuracy OCR output coupled with text parsing technology, PrimeOCR is able to identify and group together different styles of documents or forms. In one case, there were over 250 different documents that could be identified by the software. Customers use the identification information to search for and retrieve pertinent documents in the document database.
Medical facilities, including hospitals, generate a large number of different styles of documents when admitting and treating patients. The document collection for an individual is typically stored under the person’s name or a tracking number. Unique attributes of each style of document were fed into PrimeOCR to successfully identify the document type of each page within each individual’s document collection. Hospital staff can now electronically retrieve a patient’s records, including all accompanying documentation, and review pertinent history or lab results from the patient’s record.
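One simple way to picture identifying a document type from OCR text is phrase matching, sketched below. This is only an illustration of the general idea, not PrimeOCR’s actual method: the rule table, the document type names, and the function `classify_page` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical document types and identifying phrases; a real deployment
# would define its own attributes for each document style.
DOCUMENT_RULES: Dict[str, List[str]] = {
    "admission_form": ["patient admission", "date of admission"],
    "lab_report": ["laboratory report", "specimen"],
    "discharge_summary": ["discharge summary", "discharge diagnosis"],
}


def classify_page(ocr_text: str,
                  rules: Dict[str, List[str]] = DOCUMENT_RULES,
                  default: str = "unknown") -> str:
    """Return the document type whose phrases appear most often in the text."""
    text = ocr_text.lower()
    best_type, best_hits = default, 0
    for doc_type, phrases in rules.items():
        hits = sum(1 for phrase in phrases if phrase in text)
        if hits > best_hits:
            best_type, best_hits = doc_type, hits
    return best_type
```

An approach like this depends heavily on OCR accuracy: a misread character in an identifying phrase causes the phrase to be missed, which is why high accuracy text matters for reliable identification.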
On-line retailers use PrimeOCR’s RTF results to retain text format and layout to re-create books that can be marketed as e-books. PrimeOCR’s character accuracy and retention of format allow clients to efficiently reproduce machine printed material into electronic media.
Various clients use PrimeOCR’s high accuracy results to save time and money in generating online content from bound books. Not only does PrimeOCR generate high accuracy character results but it retains excellent formatting which cuts down on the time to format each page of the book for online viewing.
Invoice and shipping receipt processing
Numerous applications demand high accuracy OCR results for reliable operation. Customers use PrimeOCR to extract the invoice or bill of lading number from the document and rename the scanned document to match the invoice number or convert the image to a different format for viewing on the web. With the invoice or bill of lading number, they are able to quickly perform an electronic search and retrieval of the scanned document, improving customer service.
A large shipping company uses PrimeZone to scan barcodes on a signed billing receipt. Once the receipt is scanned into the system, users can view it on-line by searching for the shipping reference number. Customer service personnel are able to e-mail the scanned signed receipt within seconds instead of taking days to find a filed hard copy of the receipt.
Another large shipping company OCRs the invoice number from the scanned invoice and has customized PrimeOCR to rename the image file to the invoice number, facilitating document storage, search, and retrieval.
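The rename step described above can be sketched in a few lines once the OCR text is available. This is a hedged illustration rather than the customer’s actual customization: the invoice pattern (the word "Invoice", an optional "No.", "Number", or "#", then 6 to 10 digits) and the function names are assumptions made for the example.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed invoice format, for illustration only. Real invoices will differ.
INVOICE_PATTERN = re.compile(
    r"invoice\s*(?:no\.?|number|#)?\s*:?\s*(\d{6,10})",
    re.IGNORECASE,
)


def extract_invoice_number(ocr_text: str) -> Optional[str]:
    """Return the first invoice number found in the OCR text, or None."""
    match = INVOICE_PATTERN.search(ocr_text)
    return match.group(1) if match else None


def rename_to_invoice_number(image_path: Path, ocr_text: str) -> Path:
    """Rename the image to '<invoice number><original extension>'.

    The file is left untouched if no number is found or if a file with
    the target name already exists.
    """
    number = extract_invoice_number(ocr_text)
    if number is None:
        return image_path
    target = image_path.with_name(number + image_path.suffix)
    if target.exists():
        return image_path  # avoid overwriting an existing file
    image_path.rename(target)
    return target
```

Leaving the file unchanged when no number is found keeps unmatched scans under their original names, so they can be routed to manual review instead of being lost or overwritten.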
Customers use PrimeOCR to convert document collections and then sell subscriptions to the digital content. Many customers use PrimeVerify to manually verify OCR results to ensure the highest grade of text results coming out of the conversion process.
Parts catalogs are scanned and converted through PrimeOCR so part information can be viewed online. Instead of manually sorting through microfiche collections, users can rely on modern electronic means to drastically reduce the time needed to look up and order a part. Both traditional electronic search and clicking through assembly drawings can be used to find a particular part.
Phonebook processing presents several challenges to traditional OCR engines. PrimeZone offers some automated zoning capability that is specifically designed for phonebook pages, while PrimeOCR is able to process the large 600 DPI images required to capture very small point size text. The PrimeOCR phonebook module applies additional processing to the OCR results to further increase the accuracy and formatting of phonebook results.