I need a web-based app that accurately OCRs scanned PDFs. The PDFs will be uploaded through a web interface. The app will poll a directory, and any new PDF dropped into that directory must be OCR'ed into a text file as soon as it is detected. The OCR engine must be able to handle scans and images.
Once a file is OCR'ed, the app must search the text for matches to a dynamic list of user-supplied regular expressions and write the results to a CSV file, one per PDF.
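To illustrate the matching step, here is a minimal Ruby sketch. The helper names (`extract_matches`, `write_csv`), the two-column CSV layout, and the idea of giving each pattern a label are assumptions made for the example, not requirements; the final format is open to discussion.

```ruby
require "csv"

# Hypothetical helper: run every user-supplied pattern over the OCR text
# and collect one [label, matched_text] row per hit.
# `patterns` is assumed to be a Hash of label => regex source string.
def extract_matches(text, patterns)
  rows = []
  patterns.each do |label, source|
    regex = Regexp.new(source)
    # Inside the scan block, Regexp.last_match holds the current match,
    # so the full matched text is kept even if the pattern has groups.
    text.scan(regex) { rows << [label, Regexp.last_match[0]] }
  end
  rows
end

# Hypothetical helper: write the rows to a CSV file with a header line.
def write_csv(path, rows)
  CSV.open(path, "w") do |csv|
    csv << %w[pattern match]
    rows.each { |row| csv << row }
  end
end
```

Something like `write_csv("scan1.csv", extract_matches(File.read("scan1.txt"), patterns))` would then produce the per-PDF output, where `patterns` comes from whatever the user entered in the web interface.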
The polling can be a cron job or a daemon; I don't mind which, but you need to show me how to set it up.
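As a rough sketch of the polling behaviour, assuming that "new" means a PDF with no `.txt` file of the same name next to it, something along these lines would do. The names `pending_pdfs`, `poll_once` and `run_daemon` are hypothetical, and `ocr` stands in for whichever OCR engine is chosen: any callable that takes a PDF path and returns the recognised text.

```ruby
# Hypothetical helper: list PDFs in `dir` that have not been OCR'ed yet,
# i.e. that have no sibling .txt file (an assumed convention).
def pending_pdfs(dir)
  Dir.glob(File.join(dir, "*.pdf")).sort.reject do |pdf|
    File.exist?(pdf.sub(/\.pdf\z/, ".txt"))
  end
end

# Process every pending PDF once and return the text files written.
# `ocr` is a placeholder callable; the real OCR engine is not decided here.
def poll_once(dir, ocr)
  pending_pdfs(dir).map do |pdf|
    txt = pdf.sub(/\.pdf\z/, ".txt")
    File.write(txt, ocr.call(pdf))
    txt
  end
end

# Daemon-style loop. A cron job would instead run poll_once a single time
# per invocation and leave the scheduling to cron.
def run_daemon(dir, ocr, interval: 5)
  loop do
    poll_once(dir, ocr)
    sleep interval
  end
end
```

Either way, the same `poll_once` step does the work; the cron-versus-daemon choice only changes how often and by what it is triggered.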
The app can be written in PHP or Rails (preferably Rails).
The web interface must use Bootstrap or Foundation.
Before I award you the project, I want to see a sample app that can OCR a scanned PDF. Once I am satisfied that your solution outputs decent text that matches the scans, I will award you the project, with 50% paid up front and 50% upon completion and code transfer.