Prime Recognition’s award-winning production OCR product, PrimeOCR, is a Windows OCR engine that reduces OCR error rates by up to 65-80% compared with conventional OCR by implementing “Voting” OCR technology.
PrimeOCR reduces overall OCR processing costs by reducing the total number of errors generated from OCR and providing a level of reliability not available with other OCR engines.
PrimeOCR produces fewer errors
Today’s best OCR engines achieve, on average, only 98% accuracy when recognizing images of typical quality. On a typical page of 2000 characters, that means that 40 errors remain in the OCR output.
By using PrimeOCR, error rates can be reduced by 65-80%. This means that the 40 errors generated by today’s OCR engines can be reduced to between 14 (a 65% reduction) and 8 (an 80% reduction).
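The arithmetic behind these figures can be checked with a short sketch. The function name and parameters below are illustrative only and are not part of PrimeOCR.

```python
def remaining_errors(chars_per_page: int, accuracy: float, reduction: float) -> int:
    """Errors left on a page after the baseline error count is cut by `reduction`.

    accuracy  -- character accuracy of the conventional engine, e.g. 0.98
    reduction -- fraction of errors removed, e.g. 0.65 for 65%
    """
    baseline = chars_per_page * (1.0 - accuracy)
    # Round to the nearest whole error to absorb floating-point noise.
    return round(baseline * (1.0 - reduction))


baseline = remaining_errors(2000, 0.98, 0.0)     # conventional OCR
at_65 = remaining_errors(2000, 0.98, 0.65)       # 65% fewer errors
at_80 = remaining_errors(2000, 0.98, 0.80)       # 80% fewer errors
print(baseline, at_65, at_80)
```

For the page described above this gives 40 errors for conventional OCR, 14 at a 65% reduction, and 8 at an 80% reduction.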
PrimeOCR saves time and money
Although PrimeOCR may cost more up front, total system and operating costs are much lower.
In some OCR-intensive applications, manual error correction, including labor and verification workstations, can account for 50-70% of total imaging system costs. By reducing the need for error correction, PrimeOCR lowers both the annual or per-project cost of manual correction labor and the associated capital investment.
PrimeOCR produces cleaner data
Not only does PrimeOCR reduce the total number of errors during OCR, but it also reduces the total number of errors that make it into your database or final application by 75%.
Only 40-60% of errors generated by standard OCR software are “flagged” for correction. Since manual error correction typically looks only at flagged errors, up to 60% of the errors produced by the OCR software are never reviewed and remain in the OCR data. PrimeOCR generates more accurate suspicious character “flags”, reducing the total number of errors that remain in the data after processing.
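As a rough illustration of what the flagging rate means for a page with 40 errors, the sketch below counts the errors that are never flagged. It assumes every flagged error is corrected during review, which is a simplification, and the function name is hypothetical.

```python
def unreviewed_errors(total_errors: int, flagged_fraction: float) -> int:
    """Errors that are never flagged, and so never reach manual review."""
    if not 0.0 <= flagged_fraction <= 1.0:
        raise ValueError("flagged_fraction must be between 0 and 1")
    return round(total_errors * (1.0 - flagged_fraction))


# 40 errors on the page; standard OCR flags only 40-60% of them.
worst_case = unreviewed_errors(40, 0.40)
best_case = unreviewed_errors(40, 0.60)
print(worst_case, best_case)
```

Under these assumptions, between 16 and 24 of the 40 errors stay in the data even after manual correction.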
The PrimeOCR Job Server – Production-level reliability for image processing
The PrimeOCR Job Server handles a wide range of OCR processing options and is reliable enough to process thousands of images without failure. Each job defines the images to process, any pre- or post-OCR processing options required, and the type of output. The Job Server queues jobs for batch processing and displays statistics on completed jobs for effective batch management.
All Prime Recognition products are designed to be easy to install, simple to operate, reliable, and scalable to match your OCR throughput requirements. Within minutes of installation, the PrimeOCR Job Server is generating high accuracy output.
Ever planned to process thousands of images overnight, only to find in the morning that the OCR engine crashed on a poor quality image? PrimeOCR’s automatic engine recovery feature senses when an engine fails and re-initializes it for the next image. This level of software reliability eliminates downtime and the need for manual intervention during OCR processing, so integrating PrimeOCR into your imaging system improves operating efficiency.
By leveraging PrimeOCR’s features, customers reduce OCR errors from image processing and gain efficiencies in production imaging systems.
Interested in improving OCR accuracy in your imaging system, or having problems with your current system crashing during batch processing? Let us show you how PrimeOCR can impact your conversion operations. Give PrimeOCR a try.
PrimeOCR – fast AND accurate
PrimeOCR can be set up in a mode called “selective voting”. In this mode, PrimeOCR offers the best of both worlds: the high speed of conventional OCR when you can afford it, and the high accuracy of Prime Recognition’s technology when you need it.
PrimeOCR automatically identifies the quality of documents. On clean documents, PrimeOCR runs only one engine; on lower quality documents, it runs multiple engines and votes the results. Selective voting is configurable by the user: you decide when to run more engines. This flexibility offers a number of advantages. For example, you may wish to vote less often because you need higher throughput to meet an upcoming deadline, or you may “turn up” voting because a project is for a customer who is more demanding of OCR accuracy.
PrimeOCR can meet your throughput requirements while still delivering high accuracy OCR.
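The selective voting decision can be pictured with a minimal sketch. It assumes a hypothetical page quality score between 0 and 1 and a user-chosen threshold; PrimeOCR’s actual quality measure and configuration settings are not described here, so every name below is illustrative.

```python
def engines_to_run(quality: float, threshold: float, max_engines: int = 3) -> int:
    """Decide how many engines to run on a page.

    quality     -- hypothetical page quality score, 0.0 (poor) to 1.0 (clean)
    threshold   -- pages scoring at or above this are treated as clean
    max_engines -- engines used when the page is voted
    """
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality must be between 0 and 1")
    # Clean page: one engine is enough. Otherwise run all engines and vote.
    return 1 if quality >= threshold else max_engines


# Lower threshold: fewer pages are voted, so throughput is higher.
fast = engines_to_run(quality=0.90, threshold=0.80)
# Higher threshold ("turned up" voting): the same page is now voted.
careful = engines_to_run(quality=0.90, threshold=0.95)
print(fast, careful)
```

In this sketch, lowering the threshold corresponds to voting less often for higher throughput, and raising it corresponds to turning voting up for a more accuracy-sensitive project.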
- Options for deskew/image pre-processing such as auto-rotation
- Options to auto-zone, manually zone, or full page OCR
- Options to save image zones
- Support for color and grayscale images
- Priority management
- Includes industry-leading OCR engines
- Reduces errors by up to 65-80%
- Reduces labor costs required for verification
- More accurate flagging of suspicious characters
- High fault tolerance when operating under Windows
- Automatic recovery ensures continuous processing and limits manual intervention
- Common output formats including:
- Formatted ASCII
- RTF – support for color/grayscale images
- PDF – support for color/grayscale images
- PRO (required for verification using PrimeVerify)
- Scalable for growth or increased document capture
- Network architecture provides flexibility, ensures reliability and maximizes operational efficiencies
- OCR Job server manages image files through PrimeOCR processing
- Available through an API/SDK as a Windows DLL
- Template/Job Wizard makes it easy to set up processing files.
- Windows-based workstation or server.
- Windows compatible computer. Up to 4 CPUs/cores supported.
- A hard disk with 50-150 MB of free space for installation
- At least 2 GB of Random Access Memory (RAM); 4 GB recommended. Additional memory may be required for processing color/grayscale or higher resolution images.
Recognition Data Types
PrimeOCR recognizes the following data types:
Characters – Machine Print and Dot Matrix text in any of the following 11 languages:
plus Russian, Chinese Simple, Chinese Traditional, Japanese and Korean characters.
Optical Marks (OMR) – When an area on the image is “zoned” as OMR, PrimeOCR will return the percentage of black space contained within the zone. This percentage can be used to determine whether a user has marked a selection on the page.
Graphics – PrimeOCR normally ignores any graphics (e.g., pictures) found on an image. It can instead be instructed to save the graphic to a file. A path to the graphic is added to the text output for later page reconstruction.
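To illustrate how the OMR percentage described above might be used, here is a minimal sketch. It assumes a bitonal image held as a list of rows of 0 (white) and 1 (black) pixels and a zone given as pixel coordinates. The function names and the 20% threshold are assumptions for illustration, not PrimeOCR API calls or defaults.

```python
from typing import List, Tuple

Zone = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def black_percentage(image: List[List[int]], zone: Zone) -> float:
    """Percentage of black pixels (value 1) inside the zone."""
    left, top, right, bottom = zone
    pixels = [px for row in image[top:bottom] for px in row[left:right]]
    if not pixels:
        raise ValueError("zone is empty or lies outside the image")
    return 100.0 * sum(pixels) / len(pixels)


def is_marked(percent_black: float, threshold: float = 20.0) -> bool:
    """Treat the zone as marked when enough of it is black (threshold is illustrative)."""
    return percent_black >= threshold


page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
checkbox = black_percentage(page, (1, 1, 3, 3))   # 3 of 4 pixels are black
empty_box = black_percentage(page, (0, 0, 2, 1))  # no black pixels
print(checkbox, is_marked(checkbox), empty_box, is_marked(empty_box))
```

A suitable threshold depends on the form design, since printed box outlines inside the zone also count as black space.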
PrimeOCR Access Methods
PrimeOCR can be accessed through:
The PrimeOCR Job Server – The Job Server controls PrimeOCR processing and can instruct PrimeOCR to process all images found in a directory/subdirectories, with no user intervention or coding. It also records major activities performed by PrimeOCR.
PrimeView/PrimeVerify – This graphical interface for end-users consists of two applications for sending images to the Job Server and editing PrimeOCR results. See the PrimeVerify Data Sheet for more information.
Software Developers Kit (SDK) – The SDK consists of 32 simple, orthogonal API calls, packaged as a Dynamic Link Library (DLL) that can be called from the following languages:
- C or C++
- Visual Basic
- VB.NET
- Any language capable of accessing a DLL such as PowerBuilder.
The SDK also includes:
- Complete documentation
- Working source code examples
PrimeOCR will read images from either file or memory in the following formats:
- TIFF (single or multi-page) all compression types
- Color or grayscale
Valid resolutions include 200, 240, 300, 400, and 600 DPI as well as Standard or Fine FAX.
PrimeOCR offers a variety of ways to enhance and define your image for optimal OCR results:
Improves image quality for better OCR using features such as:
- Manual Zoning
- Auto Zoning
- Zone Content Restrictions include None, Alphabetic, Alphabetic Upper/Lower Case, Numeric, Graphic, and OMR.
PrimeOCR has several features that improve OCR accuracy, fault tolerance, and speed:
The base PrimeOCR configuration achieves 65% fewer errors than conventional OCR using a “3 engine” voting configuration. Even greater accuracy can be achieved through the following:
- 4, 5, or 6 Engines – Add a 4th OCR engine to the base configuration for 75% fewer errors, a 5th engine for 80% fewer errors, or a 6th engine for 83% fewer errors.
- Character Training – PrimeOCR can be trained to recognize specific character sets or fonts.
- Engine Customization – Users may select which engines participate in the recognition process, or even weight engine results differently.
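The general idea of voting with per-engine weights can be sketched as follows. This is a simplified illustration and not Prime Recognition’s algorithm: it assumes each engine’s output is already aligned character by character (real outputs can differ in length), and the engine names and weights are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Optional


def vote(outputs: Dict[str, str], weights: Optional[Dict[str, float]] = None) -> str:
    """Pick, at each character position, the character with the highest total weight.

    outputs -- engine name -> recognized text (all texts must have equal length)
    weights -- engine name -> weight; engines not listed default to 1.0
    """
    if not outputs:
        raise ValueError("at least one engine output is required")
    weights = weights or {}
    texts = list(outputs.items())
    length = len(texts[0][1])
    if any(len(text) != length for _, text in texts):
        raise ValueError("this sketch requires pre-aligned outputs of equal length")

    voted = []
    for i in range(length):
        tally = defaultdict(float)
        for engine, text in texts:
            tally[text[i]] += weights.get(engine, 1.0)
        # On a tie, the character seen first (from the earliest engine) wins.
        voted.append(max(tally, key=tally.get))
    return "".join(voted)


majority = vote({"engine_a": "he1lo", "engine_b": "hello", "engine_c": "hel1o"})
unweighted = vote({"engine_a": "c0de", "engine_b": "code", "engine_c": "c0de"})
weighted = vote(
    {"engine_a": "c0de", "engine_b": "code", "engine_c": "c0de"},
    weights={"engine_b": 3.0},
)
print(majority, unweighted, weighted)
```

In the first call each engine makes a different single-character mistake and the majority recovers the correct text. The last two calls show how giving one engine more weight can overturn a simple majority.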
High Fault Tolerance
Automatic Engine Recovery – A poor quality image can cause a conventional OCR product to “crash”. To solve this problem, PrimeOCR can sense when an engine fails and automatically reinitialize it for the next image. This increases throughput by allowing PrimeOCR to run unattended, 24 hours a day!
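The recovery behavior described above follows a common pattern: detect the failure, record the image, and start a fresh engine before moving on. The sketch below shows that pattern only. It models an engine failure as a Python exception, whereas a crash in a native engine would normally require process isolation, and the `EngineCrash`, `DemoEngine`, and `process_batch` names are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class EngineCrash(Exception):
    """Stand-in for an OCR engine failure."""


class DemoEngine:
    """Toy engine that fails on any image whose name contains 'bad'."""

    instances = 0  # counts how many engines have been created

    def __init__(self) -> None:
        DemoEngine.instances += 1

    def recognize(self, image: str) -> str:
        if "bad" in image:
            raise EngineCrash(image)
        return f"text of {image}"


def process_batch(
    images: Iterable[str], make_engine: Callable[[], DemoEngine]
) -> Tuple[Dict[str, str], List[str]]:
    """Process every image, replacing the engine whenever it fails."""
    engine = make_engine()
    results: Dict[str, str] = {}
    failed: List[str] = []
    for image in images:
        try:
            results[image] = engine.recognize(image)
        except EngineCrash:
            failed.append(image)    # keep a record for later review
            engine = make_engine()  # reinitialize for the next image
    return results, failed


results, failed = process_batch(["a.tif", "bad.tif", "c.tif"], DemoEngine)
print(results, failed, DemoEngine.instances)
```

The batch continues past the failing image, so one poor quality page does not stop an overnight run.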
- Multi-Processor Support – This option allows PrimeOCR to utilize up to four processors in a multi-CPU system for faster throughput.
- Selective Voting – While “Voting” takes longer than conventional OCR, you can speed up the processing on high-quality images through Selective Voting. The result: faster OCR speeds on high-quality documents and more processing power on lower quality documents.
PrimeOCR can generate file output in the following formats:
- ASCII – Text-only output, left justified.
- Formatted ASCII – Spaces are added to text to mimic the original imaging layout.
- PDF – Converts scanned images into PDF “Normal”, “Image + Text”, or “Image Only” files, with support for color images, accessible PDF output, and PDF/A.
- RTF – Retains original character attributes and page layout using frames and paragraph conventions. Color/grayscale image zones are supported.
- Comma Delimited ASCII – Useful for exporting text fields to other applications.
- Confidence/Character Attribute Reporting – Provides text and information on each character to aid in OCR verification. Attributes include line coordinates as well as character confidence, font, location, point size, style, etc.
- HTML – Transfer OCR results directly to the Web for online viewing. Color/grayscale image zones are supported.
- Tab-delimited – Useful for forms-based applications. Each defined zone’s output is separated by a tab, so the file can be easily imported into any popular database application.
- RRI3 – RRI’s FormWorks compatible format.
- ZYINDEX – ZyLab’s ZyIndex compatible format.
- Custom output – each conversion project is unique in its requirements. Contact us if you need customized output including advanced parsing of text output or any other custom pre or post OCR processing.
Complementary products for PrimeOCR
Prime Recognition has two add-on applications that can be customized for specific document types to improve PrimeOCR accuracy rates:
- PrimeZone – This custom pre-OCR auto-zoning application creates a zone template for each image based upon specific document types such as Phonebooks, Greenbar, etc.
- PrimePost – This custom post-OCR utility performs automatic error correction based upon predefined document types.
Prime Recognition’s products are designed for the production market. They are therefore significantly more expensive than desktop OCR products, but competitively priced for the production market. Most configurations cost between $4,600 and $8,000 per PC. PrimeOCR’s increased accuracy, fault tolerance, and other “high end” features pay for themselves very quickly. Most of our clients report very short payback periods.
The “base configuration” of PrimeOCR starts at $4,600.00 for a license to process unlimited pages on one PC, or $1,400.00 for a page-limited license on one PC (the license expires after 150,000 pages have been processed).
The base configuration of PrimeOCR includes Level 3 accuracy, the PrimeOCR Job Server application, the SDK, a sample REST API implementation, 2000 zones/page, all file outputs except for PDF (e.g. ASCII, RTF, XML, HTML, PRO, etc.), auto-zoning and recognition of English and 10 Western European languages.
Available add-on modules, which add to the price of the base configuration, include:
- additional voting levels for increased recognition accuracy (Level 6 maximum)
- support for processing color or grayscale images
- image enhancement (deskew, auto-rotation, despeckle, etc.)
- PDF input/output
- support for multi-core CPUs
- support for Capture/InputAccel
- barcode recognition
- recognition of Asian languages (Chinese, Japanese and Korean)
- recognition of Russian language
- recognition of rotated text (90, 180, 270 degrees)
- support for large images up to E sized Engineering drawings
An annual software maintenance program is required for the first year. It costs 15% of the license cost and is already included in the pricing examples provided above.
Prime Recognition also offers a variety of pricing options to match your financial needs.
We encourage you to contact us at firstname.lastname@example.org or give us a call at (425)895-0550 to let us tailor a pricing program to your needs.
- See how PrimeOCR can be cost effective for your OCR processing operation.
- See how PrimeOCR provides cleaner data by reducing OCR errors.
- Prime Recognition products are designed for the production imaging market with features that provide powerful, scalable OCR solutions. See how they can easily integrate into your current OCR processing flow.
- See how high accuracy OCR software can save you operational costs.