We need software that takes a scanned document, reads only the text, extracts the desired information, and writes it to a .csv file. The software must also be able to detect duplicates and treat them as a single item.
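To illustrate the duplicate handling and .csv output described above, here is a minimal Python sketch. It assumes that two entries count as duplicates when their part numbers match after trimming whitespace and ignoring case; that rule, the function names (`normalize`, `dedupe`, `to_csv`), and the column headers are illustrative assumptions, not part of the requirement.

```python
import csv
import io


def normalize(text):
    """Collapse whitespace and uppercase so near-identical values compare equal."""
    return " ".join(text.split()).upper()


def dedupe(records):
    """Keep the first occurrence of each part number, preserving input order.

    Assumption: the part number alone identifies a duplicate.
    """
    seen = {}
    for part_number, description in records:
        key = normalize(part_number)
        if key not in seen:
            seen[key] = (part_number.strip(), " ".join(description.split()))
    return list(seen.values())


def to_csv(records):
    """Return CSV text with part numbers in column one, descriptions in column two."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["part_number", "description"])  # hypothetical header names
    writer.writerows(records)
    return buffer.getvalue()
```

If duplicates should instead be matched on part number and description together, only the key built inside `dedupe` would need to change.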
I have attached an example document ([url removed, login to view]). The book that we will be scanning is approximately 10,000 pages long. Most pages have this layout, and most of the other layouts are similar.
We need the part numbers in one column and the corresponding descriptions in the second column. We have no need for the pictures or other information on the page, except for the heading, which in this case is 'original kirby generation 3, g4, g5, g6, Attachments'. The attachment labeled '[url removed, login to view]' shows the basic layout.
It doesn't matter to us how this is done, just that the final feed is what we need.
This is the first part of a very large project, so if you can complete this part to our satisfaction you will be offered the entire project (if you want it).
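As a rough sketch of the extraction step, the Python below pulls the heading and the part number/description pairs out of the text of one page once it has been read by OCR. The real page layout is only visible in the attachment, so everything about the format here is an assumption: that the first non-empty line is the heading, that each part row begins with a part number of at least four digits followed by its description, and that all other lines can be ignored. The name `parse_page` and the pattern `PART_ROW` are hypothetical.

```python
import re

# Assumed row format: a part number (4+ digits, optional letter/dash suffix),
# whitespace, then the description. Adjust to the real layout.
PART_ROW = re.compile(r"^(\d{4,}[A-Z0-9-]*)\s+(\S.*?)\s*$")


def parse_page(ocr_text):
    """Return (heading, [(part_number, description), ...]) for one page of text."""
    lines = [line.strip() for line in ocr_text.splitlines() if line.strip()]
    if not lines:
        return "", []
    heading, body = lines[0], lines[1:]  # assumption: heading is the first line
    rows = []
    for line in body:
        match = PART_ROW.match(line)
        if match:  # lines that do not look like a part row are skipped
            rows.append((match.group(1), match.group(2)))
    return heading, rows
```

For example, with made-up part numbers, a page whose text is a heading line followed by `100001 Dusting brush` and `100002A  Hose assembly` would yield the heading plus two (part number, description) pairs, while a caption such as `Figure 2` would be dropped.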