PrimeOCR Job Server

Overview
The PrimeOCR Job Server is designed to read "jobs" from user-specified directories, configure and start PrimeOCR engine processing, and then place the PrimeOCR results in user-specified output directories.
Job Creation
Jobs are short, simple ASCII files that describe the images to be processed and any configuration information for OCR processing. Job files can be created by selecting images and OCR processing options in PrimeView, by a user application, or manually. Each job can define a single image (e.g. "c:\ocr\first_batch\001.tif") or thousands of images (e.g. "c:\ocr\*.tif+"). With the addition of a plus sign "+" and standard wild card matching, a directory and its subdirectories of images can be processed.

Job queue
The Job Server queues multiple jobs, and more can be added at any time by placing job files into the queue directories. High priority job directories can be created to ensure timely processing of top priority jobs.

Designed for the network
The PrimeOCR Job Server can be located on the same PC as the application generating the job files, or anywhere on a LAN/WAN network. Because the Job Server's architecture is file based, it does not depend on any specific network protocol. This means that many PrimeView stations, or multiple user applications, can submit jobs to the Job Server over the network. Output can be edited as soon as it is created, so operators do not have to wait until the batch is complete before verifying OCR results. Simultaneous setup, processing, and editing of images can be completed with products that are designed to take advantage of a network of resources. Scalability is built into the PrimeOCR Job Server. If OCR output requirements increase, simply add more Job Servers to the network. The PrimeOCR Job Server can also take advantage of multiple-CPU workstations.

Rich Features with Reliability
Every job defines various configuration settings for the PrimeOCR engine.
One job might require auto-zoning of the images and saving to RTF, while the next job might require complex pre-processing of the image, such as deskew, despeckle, and auto-rotation, prior to full text zoning and saving the output to PDF Normal. The PrimeOCR Job Server's flexibility to change configuration options on the fly, without manual intervention or downtime, allows for continuous processing of images. The PrimeOCR engine has several fault tolerant features, and the Job Server adds several more, including automatic handling of "exceptions." Some low quality images cause OCR engines to "GPF" (general protection fault) or hang indefinitely. The Job Server is able to manage these exceptions, continue processing the image with the other OCR engines that are available, restart the engine that "GPFed," and continue processing the remaining images. The PrimeOCR Job Server provides world class levels of flexibility and reliability that minimize manual intervention and improve the efficiency of production OCR processing operations.

Job Statistics
Several reports are generated during processing. For each image, statistics on the average confidence of OCR results, total words, and characters marked are recorded in a text file. This report is useful for users who choose either to verify the results of low quality images or to key in the image data manually. For each job, the number of files processed, along with any associated errors, is reported to the screen and recorded in a text file. Completed job status is accessible to anyone on the network. Several levels of job progress and error logging are available through the Job Server.
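
To illustrate the image specification described under Job Creation, the following Python sketch shows one way a pattern such as "c:\ocr\*.tif+" could be expanded into a list of files. It assumes that a trailing "+" means "also search subdirectories" and that the rest of the specification is a directory plus a standard wild card. The function name expand_image_spec is hypothetical, and this is not the Job Server's own implementation.

```python
import fnmatch
import os


def expand_image_spec(spec):
    """Return a sorted list of files matched by a job-style image spec.

    Assumption (not taken from PrimeOCR documentation): a trailing "+"
    asks for subdirectories to be searched as well.
    """
    recurse = spec.endswith("+")
    if recurse:
        spec = spec[:-1]  # drop the "+" marker, keep the wild card

    directory, pattern = os.path.split(spec)
    directory = directory or "."

    matches = []
    for root, dirs, files in os.walk(directory):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(root, name))
        if not recurse:
            dirs.clear()  # prevent os.walk from descending further
    return sorted(matches)
```

Under these assumptions, expand_image_spec(r"c:\ocr\*.tif") would list only the TIFF files directly in c:\ocr, while expand_image_spec(r"c:\ocr\*.tif+") would also include those in its subdirectories.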