OCR Software Accuracy Comparison
Looking for increased performance from your OCR software?
The best conventional OCR software products achieve about 98% average accuracy when recognizing text on typical-quality document images. This level of accuracy sounds pretty good, but it still leaves 40 errors on a page of 2000 characters.
Some users, including forms processing operations, choose to correct these errors manually, incurring high labor costs for OCR error correction. Others rely on fuzzy search to sift through OCR data for relevant results, only to find that reviewing tens or even hundreds of unrelated documents takes time and costs money.
On a typical text page of 2000 characters, conventional OCR would generate 40 errors on average. With PrimeOCR’s “Voting” technology, that total could be reduced to 8 errors.
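The arithmetic behind these figures can be checked with a short Python sketch. The function names are illustrative only and are not part of PrimeOCR; the 80% reduction is the figure implied by going from 40 errors to 8.

```python
def expected_errors(chars_per_page: int, accuracy: float) -> int:
    """Expected number of misrecognized characters on one page."""
    return round(chars_per_page * (1.0 - accuracy))


def errors_after_reduction(errors: int, reduction: float) -> int:
    """Errors remaining after the error count is cut by `reduction` (0.0 to 1.0)."""
    return round(errors * (1.0 - reduction))


# Figures quoted in the text: 2000 characters per page at 98% accuracy.
baseline = expected_errors(2000, 0.98)
# Going from 40 errors to 8 corresponds to an 80% reduction.
with_voting = errors_after_reduction(baseline, 0.80)

print(baseline, with_voting)
```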
Prime Recognition developed PrimeOCR for the production marketplace to reduce the error rate typically found with conventional OCR engines. PrimeOCR licenses and includes engine technology from the best retail OCR vendors. Your image is passed through each engine and, using Voting technology, PrimeOCR reduces the overall OCR error rate by 65-80%.
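To give a feel for the general idea of voting across engines, here is a minimal Python sketch of a simple per-character majority vote. It is not PrimeOCR's actual algorithm, which is not described here. The sketch assumes the engine outputs are already aligned character by character and have equal length; a production system would also need alignment and would likely weigh engine confidence. The function name `vote` and the sample strings are hypothetical.

```python
from collections import Counter
from typing import Sequence


def vote(outputs: Sequence[str]) -> str:
    """Pick the most common character at each position across engine outputs.

    Assumption: outputs are pre-aligned and of equal length.
    """
    if not outputs:
        raise ValueError("at least one engine output is required")
    if len({len(text) for text in outputs}) != 1:
        raise ValueError("this sketch assumes pre-aligned outputs of equal length")

    result = []
    for column in zip(*outputs):
        # most_common(1) returns the top character for this position;
        # on a tie, the character seen first (earliest engine) wins.
        result.append(Counter(column).most_common(1)[0][0])
    return "".join(result)


# Three hypothetical engine outputs, each with a different single error.
engine_outputs = [
    "The qu1ck brown fox",
    "The quick br0wn fox",
    "The quick brown f0x",
]
print(vote(engine_outputs))
```

Because each engine in this example makes its mistake in a different place, the other two outvote it at every position. That is the intuition behind combining several engines: the vote only fails where a majority of engines make the same error.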
PrimeOCR can be configured for Level 3 accuracy, which reduces error rates by 65%, or for Level 6 accuracy, which reduces error rates by up to 80%. Level 6 accuracy takes more time to process and costs more but, depending on your application, may be more cost-effective once the costs of error correction or of sifting through fuzzy search results are taken into account.
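The trade-off can be framed as a simple per-page cost comparison. The Python sketch below is only an illustration: the per-page OCR costs and the per-error correction cost are made-up placeholder values, not PrimeOCR pricing, and the error counts come from applying the 65% and 80% reductions to a 40-error page. Substitute your own figures.

```python
def total_cost_per_page(ocr_cost: int, errors_per_page: int,
                        cost_per_correction: int) -> int:
    """OCR cost plus manual correction cost for one page (all values in cents)."""
    return ocr_cost + errors_per_page * cost_per_correction


# Errors left on a 40-error page after a 65% and an 80% reduction.
LEVEL3_ERRORS = 14
LEVEL6_ERRORS = 8

# Placeholder costs in cents (assumptions for illustration only).
LEVEL3_OCR_COST = 2
LEVEL6_OCR_COST = 5
COST_PER_CORRECTION = 1

level3_total = total_cost_per_page(LEVEL3_OCR_COST, LEVEL3_ERRORS, COST_PER_CORRECTION)
level6_total = total_cost_per_page(LEVEL6_OCR_COST, LEVEL6_ERRORS, COST_PER_CORRECTION)

print(level3_total, level6_total)
```

With these placeholder values the more expensive Level 6 pass comes out cheaper overall, because the correction labor saved outweighs the extra OCR cost. With a lower correction cost, the result could just as easily go the other way.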
Real World Results
What does 65% – 85% fewer errors look like when viewing OCR accuracy results on scanned documents? These averages are based on a large number of scanned documents covering different document types, image qualities, and fonts. Some imaging projects may have much cleaner documents, so the reduction in error rates will naturally be smaller, while other projects may include older documents with poor-quality characters, for which PrimeOCR will reduce the error rate even further.
The following are the visual results from one page, offered as an illustrative sample. A single page is certainly not statistically conclusive, but it gives a good graphical picture of what errors look like at various levels of accuracy. We processed the same page through an industry OCR engine, then through PrimeOCR at Level 3 accuracy, and then through PrimeOCR at Level 6 accuracy.
Although newer documents may not include as many errors as this single page, the important point is that voting technology can reduce the number of errors by 65% – 85% compared with traditional OCR software.
Traditional OCR Software
142 errors marked
PrimeOCR L3 Accuracy
21 errors marked
PrimeOCR L6 Accuracy
12 errors marked
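The error counts marked on this sample page translate into percentage reductions as follows. This is a small Python sketch; the function name is illustrative.

```python
def reduction_percent(baseline_errors: int, remaining_errors: int) -> float:
    """Percentage of baseline errors eliminated."""
    return 100.0 * (baseline_errors - remaining_errors) / baseline_errors


TRADITIONAL = 142  # errors marked with traditional OCR software
LEVEL3 = 21        # errors marked at PrimeOCR L3 accuracy
LEVEL6 = 12        # errors marked at PrimeOCR L6 accuracy

level3_reduction = reduction_percent(TRADITIONAL, LEVEL3)
level6_reduction = reduction_percent(TRADITIONAL, LEVEL6)

print(f"L3: {level3_reduction:.1f}%  L6: {level6_reduction:.1f}%")
```

On this one page the reductions work out to roughly 85% at Level 3 and 92% at Level 6, above the quoted averages, which fits the caveat that a single page is not statistically conclusive.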
See Your Own Results
Want to see a comparison with your own images? Send us a sample and see the difference in the results.
PrimeOCR now supports PDF for high accuracy formatted output. The PDF output generated by PrimeOCR contains fewer errors than PDF produced by conventional OCR engines, and it takes full advantage of PDF’s compression options to produce the smallest PDF file size available from any OCR engine.