I have been asked to bid on a project for a client of mine. They want to set up a document management system. I would need to develop an application to store, index, and search the information in a database and then display it so it can be printed, emailed, etc.
Before I give my answer to the client, I need to know whether anyone has used or knows of an OCR ActiveX control that supports zone OCR.
For those unfamiliar with the term, zone OCR means running OCR on only certain areas of a page. In my case these areas contain the employee's name and Social Security number. I don't need to OCR the entire document, since most of the information is handwritten. Once I OCR the page, I will have the info necessary to enter the data into a table and to rename the scanned file to the employee's name and Social.
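To make the idea of a zone concrete, here is a minimal sketch in Python (not ActiveX, purely for illustration) that describes two zones in inches and converts them to pixel rectangles for a given scan resolution. The zone positions, sizes, and the 300 dpi default are made-up placeholders, not measurements from the actual form. The resulting (left, top, right, bottom) tuples are in the box format that Pillow's Image.crop accepts, so each rectangle could be cropped out and handed to whatever OCR engine ends up being used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A rectangular area of the page, measured in inches from the top-left corner."""
    name: str
    left_in: float
    top_in: float
    width_in: float
    height_in: float

    def to_pixels(self, dpi):
        """Return (left, top, right, bottom) in pixels at the given scan resolution."""
        left = round(self.left_in * dpi)
        top = round(self.top_in * dpi)
        right = round((self.left_in + self.width_in) * dpi)
        bottom = round((self.top_in + self.height_in) * dpi)
        return (left, top, right, bottom)


# Hypothetical zone layout: the real positions would be measured on the form.
ZONES = [
    Zone("name", left_in=1.0, top_in=0.5, width_in=4.0, height_in=0.5),
    Zone("ssn", left_in=5.5, top_in=0.5, width_in=2.0, height_in=0.5),
]


def zone_boxes(dpi=300):
    """Map each zone name to its pixel rectangle for a scan at `dpi`."""
    return {zone.name: zone.to_pixels(dpi) for zone in ZONES}


if __name__ == "__main__":
    for zone_name, box in zone_boxes().items():
        print(zone_name, box)
```

Defining the zones in inches rather than pixels keeps the layout independent of the scanner resolution, which may matter if the scan settings change partway through the job.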
The process I am leaning towards is to use zone OCR to gather the info. Once the OCR is complete, the application would create a new record with the OCR data and rename the PDF from 200808121222.pdf to LastName First 123456789.pdf. This way, even if the database is trashed, the PDFs will still carry the main info in the file name, plus searchable OCR text on at least the first page. Of course, if there is a duplicate, it would add an 'a', 'b', etc.
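The renaming rule could look roughly like the following Python sketch. It only builds the target file name; the actual rename and the database insert are left out. The function name, the choice to put the duplicate letter directly after the number, and the case-insensitive duplicate check are my own assumptions for illustration.

```python
import string


def target_name(last, first, ssn, taken):
    """Build 'Last First 123456789.pdf', adding 'a', 'b', ... if the name is taken.

    `taken` is a set of lowercase file names that already exist (assumption:
    names are compared case-insensitively, as on a Windows file system).
    """
    digits = "".join(ch for ch in ssn if ch.isdigit())
    if len(digits) != 9:
        # A bad OCR read should go to manual review, not silently into a file name.
        raise ValueError(f"expected 9 digits, got {len(digits)}: {ssn!r}")

    base = f"{last.strip()} {first.strip()} {digits}"
    for suffix in [""] + list(string.ascii_lowercase):
        candidate = f"{base}{suffix}.pdf"
        if candidate.lower() not in taken:
            return candidate
    raise RuntimeError(f"more than 26 duplicates for {base!r}")


if __name__ == "__main__":
    existing = {"smith john 123456789.pdf"}
    print(target_name("Smith", "John", "123-45-6789", existing))
```

Rejecting anything that does not contain exactly nine digits gives a cheap sanity check on the OCR result before a file is renamed.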
If I can't find a zone OCR ActiveX control, the only alternative is to manually enter the employee's data into the table, and that is not a realistic option. The client has over 200,000 documents of 20 pages each that need to be scanned. That comes out to over 4 million pages! BTW, outside vendors are quoting in the range of 10 cents per image. Since the paper form is an 11x17 sheet folded in half, it is really only 2 million pages to scan, so the quotes are in the $200,000 - $250,000 range.
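For anyone who wants to check the numbers, here is the arithmetic as a small Python sketch. It assumes each scanned 11x17 image covers two pages, which is how I get from 4 million pages to 2 million images, and it uses the low end of the vendor pricing (10 cents per image). The function name is just for illustration.

```python
def estimate_scanning_cost(documents, pages_per_document, cents_per_image):
    """Return (total pages, images to scan, cost in whole dollars)."""
    total_pages = documents * pages_per_document
    # Assumption: one scanned 11x17 image covers two pages of the folded form.
    images_to_scan = total_pages // 2
    # Work in cents to avoid floating-point rounding, then convert to dollars.
    cost_dollars = images_to_scan * cents_per_image // 100
    return total_pages, images_to_scan, cost_dollars


if __name__ == "__main__":
    pages, images, cost = estimate_scanning_cost(200_000, 20, 10)
    print(f"{pages:,} pages, {images:,} images, ${cost:,}")
    # prints: 4,000,000 pages, 2,000,000 images, $200,000
```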
If anyone has any ideas for this project, I am all ears. Also, if anyone knows of a reliable, high-speed duplex scanner, I would be thankful.